﻿1
00:00:00,910 --> 00:00:03,980
Until now, we have discussed the individual cell.

2
00:00:04,950 --> 00:00:09,510
Now we are going to stack these cells to create a network of cells.

3
00:00:11,450 --> 00:00:13,980
Just to avoid confusion with biological neurons,

4
00:00:14,720 --> 00:00:18,020
I'll be calling an artificial neuron a Perceptron.

5
00:00:18,590 --> 00:00:19,580
So Perceptron,

6
00:00:19,670 --> 00:00:22,490
from now on, means any artificial neuron.

7
00:00:24,370 --> 00:00:30,220
Now, there are two ways we can stack cells: in parallel or sequentially.

8
00:00:32,170 --> 00:00:35,040
Let's see what happens when we stack cells in parallel.

9
00:00:37,610 --> 00:00:41,920
Here is a single Perceptron with three inputs and one output.

10
00:00:43,430 --> 00:00:46,480
Now, we add another Perceptron parallel to it.

11
00:00:47,670 --> 00:00:52,940
This cell also gets the same three inputs, but it has a different output,

12
00:00:53,210 --> 00:00:53,720
‫Y2

13
00:00:55,370 --> 00:01:02,860
We can keep adding more cells parallel to these, maybe a third or fourth or even more, and we'll

14
00:01:02,900 --> 00:01:05,300
‫just start getting new outputs.

15
00:01:05,870 --> 00:01:11,640
In other words, we can predict multiple outputs using the same input features.

16
00:01:13,610 --> 00:01:20,090
For example, when we are doing image recognition and trying to detect a person's face,

17
00:01:21,020 --> 00:01:26,060
we may also want to find the x and y coordinates of that face in the photo.

18
00:01:27,390 --> 00:01:30,760
So these two variables will become y2 and y3.

19
00:01:32,700 --> 00:01:38,190
Although image recognition needs a much more complex network, by giving this example

20
00:01:38,430 --> 00:01:43,390
‫I want to make the point that neural networks are not bound to only one output.

21
00:01:44,810 --> 00:01:51,470
With the same input, you can get multiple outputs because we can do parallel stacking of the artificial

22
00:01:51,470 --> 00:01:51,890
‫neurons.
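
As a rough sketch of the parallel stacking described above, the snippet below feeds the same three inputs to three cells and collects one output per cell. The step activation, the helper names (`step`, `perceptron`, `parallel_layer`), and all weights and biases are assumptions made up for illustration; they do not come from the lecture slides.

```python
def step(z):
    """Threshold activation (assumed): fire (1) when the weighted sum is positive."""
    return 1 if z > 0 else 0


def perceptron(inputs, weights, bias):
    """One cell: weighted sum of the inputs plus a bias, then the step function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)


def parallel_layer(inputs, cells):
    """Feed the same inputs to every cell; each cell yields its own output."""
    return [perceptron(inputs, weights, bias) for weights, bias in cells]


# Hypothetical weights and biases, chosen only for illustration (binary inputs).
cells = [
    ([1.0, 1.0, 1.0], -2.5),   # y1 fires only when all three inputs are 1
    ([1.0, 1.0, 1.0], -0.5),   # y2 fires when at least one input is 1
    ([1.0, -1.0, 0.0], -0.5),  # y3 fires when x1 is 1 and x2 is 0
]

outputs = parallel_layer([1, 0, 1], cells)  # one output per parallel cell
print(outputs)
```

Adding a fourth cell to `cells` would simply add a fourth output; the inputs stay the same.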

23
00:01:54,760 --> 00:01:55,640
Now, let's look at

24
00:01:55,960 --> 00:01:57,100
sequential stacking.

25
00:01:59,420 --> 00:02:04,880
In the image above, we have five inputs, which we feed to three parallel Perceptrons.

26
00:02:06,490 --> 00:02:14,490
Now, the output from this set of Perceptrons is taken and fed as input to another set of parallel Perceptrons.

27
00:02:15,710 --> 00:02:15,970
Here,

28
00:02:17,130 --> 00:02:21,570
I'm feeding the outputs of these three to these four Perceptrons.

29
00:02:23,630 --> 00:02:29,630
Again, I take the four outputs of these Perceptrons and feed them into this single Perceptron.

30
00:02:31,020 --> 00:02:37,890
Lastly, this single cell gives out one single output, which is the variable we want to predict.

31
00:02:41,440 --> 00:02:50,170
So this is sequential stacking, in which the output of one set of parallel-stacked neurons is sequentially

32
00:02:50,200 --> 00:02:54,240
given as input to the next set of parallel-stacked neurons.
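
The sequential stacking just described can be sketched as a loop in which each layer's outputs become the next layer's inputs. This is a minimal illustration under stated assumptions: the sigmoid activation, the helper names, the random placeholder weights, and the sample input values are all hypothetical. Only the layer sizes (five inputs, then three, four, and one cells) come from the lecture.

```python
import math
import random


def sigmoid(z):
    """Smooth activation (assumed here); squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_inputs, n_cells, rng):
    """Build n_cells cells, each with n_inputs placeholder weights and one bias."""
    return [
        ([rng.uniform(-1, 1) for _ in range(n_inputs)], rng.uniform(-1, 1))
        for _ in range(n_cells)
    ]


def layer_forward(inputs, layer):
    """Parallel step: every cell in the layer sees the same inputs."""
    return [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in layer
    ]


def forward(inputs, layers):
    """Sequential step: the outputs of one layer are the inputs of the next."""
    activations = inputs
    for layer in layers:
        activations = layer_forward(activations, layer)
    return activations


rng = random.Random(0)   # fixed seed so the placeholder weights are repeatable
sizes = [5, 3, 4, 1]     # five inputs, then layers of three, four, and one cells
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

prediction = forward([0.2, 0.5, 0.1, 0.9, 0.4], layers)  # list with one value
print(prediction)
```

With untrained random weights the printed number is meaningless; the point is only the flow of data from layer to layer.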

33
00:02:56,640 --> 00:02:59,370
‫Let's first understand the benefit of doing this.

34
00:03:00,530 --> 00:03:09,330
That is, why did we not just feed all five inputs into a single cell and use its output to predict

35
00:03:09,330 --> 00:03:10,110
the variable y?

36
00:03:11,270 --> 00:03:14,540
How is stacking these additional sets of neurons helpful?

37
00:03:16,840 --> 00:03:18,670
‫Let's say we have this type of data.

38
00:03:20,310 --> 00:03:22,650
‫There are these two input variables.

39
00:03:23,800 --> 00:03:25,240
‫Maybe height and weight.

40
00:03:25,780 --> 00:03:27,990
Based on these, we are trying to classify

41
00:03:28,660 --> 00:03:31,780
whether the animal in the room is a cow or a dog.

42
00:03:33,100 --> 00:03:35,000
‫So cows generally lie here.

43
00:03:35,430 --> 00:03:37,950
‫They have more weight and more height than a dog.

44
00:03:38,640 --> 00:03:40,760
‫And dogs generally lie here.

45
00:03:41,160 --> 00:03:43,560
That is, they are represented by the red dots.

46
00:03:45,120 --> 00:03:52,700
Now, when we're classifying this set of data, we can have a linear separator, that is, a straight line,

47
00:03:53,080 --> 00:03:54,880
‫to separate these two classes.

48
00:03:56,720 --> 00:04:00,400
‫Anything on the right side will be predicted as a cow.

49
00:04:00,720 --> 00:04:03,980
‫And anything on the left side will be predicted as a dog.

50
00:04:05,010 --> 00:04:08,370
‫This is the capability of a single Perceptron.

51
00:04:09,550 --> 00:04:15,710
A single Perceptron can find the best straight line to classify the given data.

52
00:04:17,500 --> 00:04:19,090
So if we have this problem,

53
00:04:20,170 --> 00:04:22,910
using a single Perceptron would suffice.
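
To make the straight-line idea concrete, here is a small sketch of a single Perceptron acting as a linear separator for the cow and dog example. The function name `predict_animal`, the units, the weights, and the bias are all hypothetical values picked by hand; a trained Perceptron would learn its own.

```python
def predict_animal(height_cm, weight_kg):
    """Classify with one straight line in the height/weight plane."""
    # Hypothetical weights and bias: the boundary is the line
    # 1.0 * height_cm + 1.0 * weight_kg - 300 = 0.
    w_height, w_weight, bias = 1.0, 1.0, -300.0
    z = w_height * height_cm + w_weight * weight_kg + bias
    # Points on the positive side of the line are predicted as cows.
    return "cow" if z > 0 else "dog"


print(predict_animal(140, 600))  # a tall, heavy animal
print(predict_animal(50, 20))    # a short, light animal
```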

54
00:04:24,340 --> 00:04:26,560
But what if the situation is more complex?

55
00:04:27,400 --> 00:04:34,400
In fact, in real-life situations, we never use neural networks when we need to classify a situation

56
00:04:34,480 --> 00:04:35,500
as simple as this.

57
00:04:36,640 --> 00:04:40,890
The real-life situations for neural networks are often more complex.

58
00:04:43,000 --> 00:04:45,390
‫Let me complicate the example a little bit.

59
00:04:47,060 --> 00:04:51,180
‫What if we wanted to classify objects which have this distribution?

60
00:04:52,180 --> 00:04:59,110
So anything to the left of the first line and anything to the right of the second line is class A,

61
00:04:59,850 --> 00:05:01,180
that is, a red dot.

62
00:05:02,320 --> 00:05:07,160
And anything in between these two lines is class B, that is, a green dot.

63
00:05:08,930 --> 00:05:13,890
This type of classification situation cannot be handled by a single Perceptron.

64
00:05:14,900 --> 00:05:19,820
‫A network such as the one shown on the right can easily handle it.

65
00:05:21,230 --> 00:05:21,890
‫For example.

66
00:05:23,100 --> 00:05:31,260
This first neuron will fire, that is, give an output of one, when the point lies to the left of line one.

67
00:05:32,660 --> 00:05:38,540
And this second neuron will give an output of one when the point lies to the right of line two.

68
00:05:40,110 --> 00:05:46,110
And this final neuron gives an output of one when either of the two inputs is one.

69
00:05:48,450 --> 00:05:49,660
‫You can pause the video here.

70
00:05:50,370 --> 00:05:56,180
Think about it for a couple of minutes and see how this small network handles this special classification.
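
If it helps while thinking it over, the three-neuron network described above can be sketched in a few lines. The geometry is an assumption: line one is taken to be the vertical line x = 1 and line two the vertical line x = 3, and the weights and biases are picked by hand to match. An output of 1 stands for class A (red) and 0 for class B (green).

```python
def step(z):
    """Threshold activation: fire (1) when the weighted sum is positive."""
    return 1 if z > 0 else 0


def neuron(inputs, weights, bias):
    """Weighted sum plus bias, passed through the step function."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def classify(point):
    """Return 1 (class A) outside the two lines, 0 (class B) between them."""
    # Assumed geometry: line one is x = 1, line two is x = 3.
    left_of_line_one = neuron(point, [-1.0, 0.0], 1.0)    # fires when x < 1
    right_of_line_two = neuron(point, [1.0, 0.0], -3.0)   # fires when x > 3
    # The final neuron acts as a logical OR of the two hidden outputs.
    return neuron([left_of_line_one, right_of_line_two], [1.0, 1.0], -0.5)


points = [(0.0, 2.0), (2.0, 2.0), (4.0, 2.0)]
labels = [classify(p) for p in points]
print(labels)
```

Each hidden neuron handles one straight line, and the final neuron only combines their answers, which is why no single line is asked to do the whole job.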

71
00:05:59,230 --> 00:06:01,330
‫This is the power of a neural network.

72
00:06:02,900 --> 00:06:10,540
In the network we created, each neuron can focus on a particular feature of the object and not on

73
00:06:10,540 --> 00:06:11,860
‫the final output.

74
00:06:12,990 --> 00:06:15,210
The final output will be predicted

75
00:06:15,770 --> 00:06:17,820
based on the results of these features.

76
00:06:19,520 --> 00:06:25,990
In this way, neural networks can do really sophisticated decision making that basic machine learning

77
00:06:25,990 --> 00:06:29,890
techniques such as linear regression cannot do with good accuracy.

78
00:06:32,660 --> 00:06:37,760
‫Before we move on, let's take a minute to discuss this network's nomenclature.

79
00:06:39,850 --> 00:06:41,200
‫This is a neural network.

80
00:06:42,570 --> 00:06:46,800
Now, each set of parallel neurons is called a layer.

81
00:06:48,690 --> 00:06:50,460
‫The first is the input layer.

82
00:06:51,830 --> 00:06:53,540
‫The last is the output layer.

83
00:06:54,260 --> 00:06:56,780
And those in between are the hidden layers.

84
00:06:57,740 --> 00:06:59,750
This network has five inputs,

85
00:07:00,660 --> 00:07:07,380
three cells in hidden layer 1, four in hidden layer 2, and one in the output layer.

86
00:07:08,590 --> 00:07:14,800
So for brevity, this network can also be called a 5-3-4-1 network.

87
00:07:18,630 --> 00:07:25,680
Also notice that the information in this network flows only in the forward direction,

88
00:07:27,370 --> 00:07:28,980
which is why such a network

89
00:07:29,170 --> 00:07:30,890
is also called a feed-forward

90
00:07:31,030 --> 00:07:31,510
network.

91
00:07:33,090 --> 00:07:40,140
In comparison, if the output of one of the cells of a layer goes back as input to another cell of

92
00:07:40,140 --> 00:07:43,400
that same layer, then it is called a cyclic network.

93
00:07:45,400 --> 00:07:46,770
Recurrent neural networks,

94
00:07:46,930 --> 00:07:50,000
also known as RNNs, are an example of cyclic networks.

95
00:07:51,510 --> 00:07:55,220
RNNs are used in natural language processing and language modeling.

96
00:07:56,000 --> 00:07:58,770
For now, let's come back to a standard feed-forward

97
00:07:58,880 --> 00:07:59,270
network.

98
00:08:00,620 --> 00:08:03,740
Now, you can notice here that the output from this cell

99
00:08:05,840 --> 00:08:07,690
is going to four cells.

100
00:08:09,360 --> 00:08:11,240
‫These are not four different outputs.

101
00:08:11,700 --> 00:08:13,020
‫It is only one output.

102
00:08:13,320 --> 00:08:16,980
The same output is going as input to all these four cells.

103
00:08:18,830 --> 00:08:27,170
Also note that every neuron in each layer is connected to every neuron in the adjacent forward

104
00:08:27,190 --> 00:08:27,490
layer.

105
00:08:28,930 --> 00:08:31,480
‫Therefore, this network is fully connected.
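
One way to see what fully connected means is to count the connections in the 5-3-4-1 network. The sketch below assumes one weight per connection and one bias per cell; the function name `count_parameters` is hypothetical.

```python
def count_parameters(sizes):
    """Count weights and biases in a fully connected feed-forward network.

    sizes lists the number of inputs followed by the cells in each layer.
    """
    # Every unit in one layer connects to every cell in the next layer.
    weights = sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))
    # Assumption: each cell (not each input) carries exactly one bias.
    biases = sum(sizes[1:])
    return weights, biases


weights, biases = count_parameters([5, 3, 4, 1])
print(weights, biases)  # 5*3 + 3*4 + 4*1 weights, 3 + 4 + 1 biases
```

A partially connected network with the same layer sizes would have fewer weights than this count.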

106
00:08:33,120 --> 00:08:37,650
If some links were missing, we would call it partially connected.

107
00:08:38,910 --> 00:08:42,540
‫But for most practical purposes, we use a fully connected network.

108
00:08:45,680 --> 00:08:51,530
‫Before I close this lecture, I would like to tell you that within this short span of time in which

109
00:08:51,530 --> 00:08:55,720
‫we covered this lecture, we have entered the world of deep learning.

110
00:08:56,930 --> 00:09:01,030
Such artificial neural networks are what deep learning is made up of.

111
00:09:02,130 --> 00:09:09,320
Basically, think of this as a system which learns the relationship between input and output.

112
00:09:10,850 --> 00:09:19,190
The more layers we have in the system, the deeper our system is and the more capable it is of establishing

113
00:09:19,310 --> 00:09:22,280
complex relationships between input and output.

114
00:09:24,950 --> 00:09:28,850
‫So I hope that you understand the basics of neural networks.

115
00:09:30,010 --> 00:09:37,600
In the next lecture, we'll go deeper and see how these networks process the output and find the optimum

116
00:09:37,810 --> 00:09:43,420
‫values of weights and biases to get good accuracy of prediction.

117
00:09:44,890 --> 00:09:45,890
‫See you in the next lecture.

