﻿1
00:00:01,340 --> 00:00:11,660
In the last couple of videos, we learned how to create neural networks with the Keras Sequential API. The Sequential

2
00:00:11,680 --> 00:00:20,920
API was quite easy to use, but it has some limitations. With the Sequential API,

3
00:00:21,200 --> 00:00:27,890
you can only create neural networks with a simple sequential architecture.

4
00:00:28,130 --> 00:00:37,650
You cannot create complex topologies like this, where you have multiple inputs or multiple outputs.

5
00:00:38,180 --> 00:00:47,180
For that, you have to use the functional API. In the functional API, we create each of these layers in the

6
00:00:47,180 --> 00:00:55,430
form of functions, or you can say building blocks of your neural network, and you can use these functions

7
00:00:55,820 --> 00:01:05,580
to create a complex structure by joining them according to your structural needs. Before going into details,

8
00:01:05,580 --> 00:01:11,580
let's first delete the model that we created in our last lecture.

9
00:01:11,970 --> 00:01:14,120
Just write del and your model name.

10
00:01:14,730 --> 00:01:22,170
And then you will also need to clear the Keras session. This will free up the resources for

11
00:01:22,170 --> 00:01:25,680
our next model training.

12
00:01:25,680 --> 00:01:29,400
So just write keras.backend.clear_session().
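
The cleanup step described above can be sketched as follows. This is a minimal sketch, assuming TensorFlow's bundled Keras (`from tensorflow import keras`); the variable name `model` and the tiny stand-in model are hypothetical placeholders for whatever was built in the last lecture.

```python
from tensorflow import keras

# Hypothetical stand-in for the model built in the previous lecture.
model = keras.models.Sequential([keras.layers.Dense(1)])

# Drop the Python reference to the old model.
del model

# Reset Keras's global state so the next model starts from a clean session.
keras.backend.clear_session()
```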

13
00:01:32,460 --> 00:01:42,180
Run this too. Now we are ready to create and train our next model. For the functional API, we will be using

14
00:01:42,180 --> 00:01:52,370
this example. This kind of neural network is known as a wide and deep neural network: deep because our

15
00:01:52,460 --> 00:02:00,890
input is going through two dense hidden layers, and wide because our input is directly going

16
00:02:01,220 --> 00:02:03,790
to the output as well.

17
00:02:04,040 --> 00:02:13,610
So along with the output of these hidden layers, it also connects all parts of our input directly to the

18
00:02:13,610 --> 00:02:14,360
output layer.

19
00:02:17,070 --> 00:02:24,720
This linkage, which I have marked here, is not possible when we are using the Sequential API.

20
00:02:27,640 --> 00:02:35,050
The advantage of this architecture is that it makes it possible for a neural network to learn both the

21
00:02:35,050 --> 00:02:47,220
deep patterns through this deep linkage and the simple rules through this wide linkage. In our regular MLP model,

22
00:02:47,220 --> 00:02:55,530
all the data flows through the full stack of dense layers, and thus some simple patterns in the

23
00:02:55,530 --> 00:03:03,730
data may end up being distorted by this sequence of transformations.

24
00:03:03,800 --> 00:03:17,770
So now let's create these layers one by one. We will be using the same dataset as we used in our last lecture.

25
00:03:18,000 --> 00:03:27,030
First we need to create the input layer. We are calling it input, and we are creating it with

26
00:03:27,030 --> 00:03:36,720
keras.layers.Input, and then we have to provide the shape. We can either just write 8, because

27
00:03:36,720 --> 00:03:44,100
we have eight independent variables, or we can write it this way: X_train.shape, and then from index 1

28
00:03:44,190 --> 00:03:45,870
onward.

29
00:03:46,230 --> 00:03:54,870
So this is our input layer. We will create layers like this, and then we will connect these layers

30
00:03:55,050 --> 00:03:56,570
accordingly.

31
00:03:56,760 --> 00:04:07,460
So next we create a dense layer with 30 neurons using the ReLU activation, and you can also notice

32
00:04:07,460 --> 00:04:12,990
that we are calling it like a function, with the input layer as its argument.

33
00:04:12,990 --> 00:04:19,620
So this input layer is the input for hidden layer 1, and we are calling the layer like a function, and that's

34
00:04:19,620 --> 00:04:22,490
why we call it the functional API.
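
As a rough illustration of the input layer and the first hidden layer described above, here is a minimal sketch. It assumes TensorFlow's bundled Keras. The names `input_` and `hidden1` are illustrative choices (the trailing underscore only avoids shadowing Python's built-in `input`), and the shape is hard-coded to the eight features mentioned in the lecture rather than read from a real `X_train`.

```python
from tensorflow import keras

# Assumption: 8 input features, as stated in the lecture.
# With real data loaded, shape=X_train.shape[1:] would give the same result.
input_ = keras.layers.Input(shape=(8,))

# Create a Dense layer, then call it like a function on the input tensor.
hidden1 = keras.layers.Dense(30, activation="relu")(input_)

print(hidden1.shape)  # the last dimension is 30
```

Calling the layer object on `input_` is what wires the two together; nothing is trained or computed at this point.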

35
00:04:22,680 --> 00:04:34,700
We create layers like this, and we use these layers as functions for our next layers. So we have connected

36
00:04:34,700 --> 00:04:42,930
our input layer to our hidden layer 1. Now we create our second hidden layer.

37
00:04:43,720 --> 00:04:50,710
Again we are using keras.layers.Dense, and we are creating it with 30 neurons, and the activation

38
00:04:50,710 --> 00:04:52,230
is ReLU.

39
00:04:52,300 --> 00:04:57,700
And now for this hidden layer 2, our input should be hidden layer 1.

40
00:04:57,730 --> 00:05:02,100
So we are passing hidden layer 1 as the function argument.

41
00:05:03,010 --> 00:05:05,870
‫Now the next step is this.

42
00:05:05,920 --> 00:05:09,980
Here we want the output of hidden layer 2.

43
00:05:10,150 --> 00:05:14,250
And we also want all our input variables here.

44
00:05:14,290 --> 00:05:15,710
You can see this linkage.

45
00:05:15,760 --> 00:05:24,760
We want these two to go into this concat layer. The concat layer just merges the output of the hidden

46
00:05:24,760 --> 00:05:26,560
layer and all the inputs.

47
00:05:29,760 --> 00:05:33,150
So we can name this layer concat.

48
00:05:33,300 --> 00:05:43,420
And then we'll use keras.layers.concatenate, and we are passing a list with the output of hidden layer

49
00:05:43,420 --> 00:05:51,920
2 and the input layer. If we also wanted the output of hidden layer 1, we can write

50
00:05:52,190 --> 00:05:58,180
hidden layer 1 here, and it will add a linkage like this also.
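
The merge step described above might look like the following sketch. It assumes TensorFlow's bundled Keras and the functional helper `keras.layers.concatenate`. The variable names, including `concat_with_h1` for the optional extra linkage, are illustrative and not taken from the lecture's notebook.

```python
from tensorflow import keras

input_ = keras.layers.Input(shape=(8,))
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)

# Wide + deep: merge the output of hidden layer 2 with the raw inputs.
concat = keras.layers.concatenate([hidden2, input_])

# Optional variant: also link the output of hidden layer 1 into the merge.
concat_with_h1 = keras.layers.concatenate([hidden2, hidden1, input_])

print(concat.shape[-1])          # 30 + 8 = 38
print(concat_with_h1.shape[-1])  # 30 + 30 + 8 = 68
```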

51
00:05:58,180 --> 00:06:06,400
So you can see, you can customize all these linkages very easily using the functional API. Now the

52
00:06:06,400 --> 00:06:14,020
next step is to create our output layer. Our output should get its input from the concat layer, and this

53
00:06:14,020 --> 00:06:15,730
should be a single neuron.

54
00:06:15,850 --> 00:06:24,670
So we are creating output equal to keras.layers.Dense, with a single neuron, without any

55
00:06:24,670 --> 00:06:29,660
activation function, and its input is the output of the concat layer.

56
00:06:29,800 --> 00:06:34,450
We are passing the concat layer as the function argument here.

57
00:06:34,470 --> 00:06:37,260
‫Now we have created all the layers.

58
00:06:37,410 --> 00:06:46,830
The next step is to combine all these layers and create a model. So we are creating a model object, and

59
00:06:46,830 --> 00:06:49,980
then we are calling keras.models.Model.

60
00:06:50,460 --> 00:06:56,490
‫And here we are mentioning what we want as our input and what we want as our output.

61
00:06:56,730 --> 00:07:01,320
So our input is this first input layer, and our output

62
00:07:01,320 --> 00:07:05,000
is this last output layer.

63
00:07:05,620 --> 00:07:06,460
In the Sequential

64
00:07:06,500 --> 00:07:13,940
API, we first create the model and then we add each layer, layer by layer, but in the functional API

65
00:07:15,170 --> 00:07:26,200
we create these layers and then we join these layers to create this whole network. Just run this. Again,

66
00:07:26,480 --> 00:07:34,030
just to look at the structure of the model that you have created, you can call .summary(). So our object

67
00:07:34,030 --> 00:07:41,750
name is model, and we are calling the summary method on this. You will get all the details. First

68
00:07:41,770 --> 00:07:49,850
we have the input layer with eight input variables, then we have a dense layer with 30 neurons, then we have

69
00:07:49,850 --> 00:07:59,050
a second dense layer, again with 30 neurons. Then we have a concat layer where we are concatenating

70
00:07:59,200 --> 00:08:07,490
the output of the second dense layer and our input layer, so you can see we have 30 plus 8, so 38

71
00:08:07,600 --> 00:08:18,480
neurons here. And then we have the output layer, where we have only one neuron. Now the next step is to compile.
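
Putting the pieces described so far together, a minimal sketch of the whole wide and deep model and its summary could look like this. It assumes TensorFlow's bundled Keras; the variable names are illustrative, and the eight-feature input shape is hard-coded from the lecture's description instead of being read from the dataset.

```python
from tensorflow import keras

input_ = keras.layers.Input(shape=(8,))
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([hidden2, input_])

# Single output neuron with no activation function (regression).
output = keras.layers.Dense(1)(concat)

# Tell Keras which tensor is the model input and which is the model output.
model = keras.models.Model(inputs=input_, outputs=output)

# Prints one row per layer, including the 38-wide concatenation.
model.summary()
```

The parameter count follows directly from the layer sizes: 8*30 + 30 for the first hidden layer, 30*30 + 30 for the second, and 38 + 1 for the output, which is 1,239 in total.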

72
00:08:19,480 --> 00:08:28,960
And just as in our previous lecture, we will be using mean squared error as the loss function, SGD

73
00:08:28,990 --> 00:08:38,740
with a learning rate of 0.001 as our optimizer, and MAE, or mean absolute error, as our

74
00:08:38,940 --> 00:08:49,510
additional metric. Let's just run this. Now, fitting the model is the same. We will just write model.fit,

75
00:08:49,530 --> 00:08:58,860
then our training dataset, then the number of epochs, and then the validation data. Since

76
00:08:59,100 --> 00:09:11,710
last time we ran that regression model for 40 epochs, I am using 40 epochs here as well. So let's fit the model.
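
The compile and fit steps described above can be sketched as follows. This is only an illustration under stated assumptions: TensorFlow's bundled Keras, random synthetic arrays in place of the lecture's dataset (so the printed numbers will not match the lecture), hypothetical names such as `X_valid` and `test_results`, and 2 epochs instead of 40 to keep the run short.

```python
import numpy as np
from tensorflow import keras

# Rebuild the wide and deep model so this block runs on its own.
input_ = keras.layers.Input(shape=(8,))
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([hidden2, input_])
output = keras.layers.Dense(1)(concat)
model = keras.models.Model(inputs=input_, outputs=output)

# MSE loss, SGD with learning rate 0.001, MAE as an additional metric.
model.compile(loss="mean_squared_error",
              optimizer=keras.optimizers.SGD(learning_rate=0.001),
              metrics=["mae"])

# Synthetic stand-in data (assumption): 8 features, 1 regression target.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(64, 8)).astype("float32")
y_train = rng.normal(size=(64, 1)).astype("float32")
X_valid = rng.normal(size=(16, 8)).astype("float32")
y_valid = rng.normal(size=(16, 1)).astype("float32")
X_test = rng.normal(size=(16, 8)).astype("float32")
y_test = rng.normal(size=(16, 1)).astype("float32")

# Training data, number of epochs, then the validation data.
model_history = model.fit(X_train, y_train, epochs=2,
                          validation_data=(X_valid, y_valid), verbose=0)

# Returns the loss followed by the compiled metrics, here [MSE, MAE].
test_results = model.evaluate(X_test, y_test, verbose=0)
print(test_results)
```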

77
00:09:16,450 --> 00:09:22,900
Again you will see the loss, the validation loss, and the value of the metric as well.

78
00:09:23,140 --> 00:09:23,350
You

79
00:09:26,520 --> 00:09:31,740
can see the loss value we are getting on our training set is 0.36. This is the MSE value,

80
00:09:32,250 --> 00:09:40,880
and the validation loss is 0.3638. Let's calculate the value on our test data.

81
00:09:44,000 --> 00:09:50,140
You can see the loss value here is 0.3064.

82
00:09:50,300 --> 00:09:52,010
In our last case,

83
00:09:56,560 --> 00:09:58,990
I guess we were getting a lower loss than this,

84
00:09:59,000 --> 00:10:04,230
so this model is not performing that well.

85
00:10:05,080 --> 00:10:14,140
But in some situations, this kind of deep and wide network performs better than normal MLP models.

86
00:10:15,930 --> 00:10:23,940
In this case, our normal MLP model is performing better than this wide and deep network. Again, you have

87
00:10:24,060 --> 00:10:26,400
all the parameters of whatever run you do.

88
00:10:26,610 --> 00:10:35,240
So if you just write model_history.history, you will get all the loss values and the

89
00:10:35,360 --> 00:10:39,590
MAE values that we got during the training.
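
To illustrate what the history object described above holds, here is a minimal sketch. It assumes TensorFlow's bundled Keras and uses random synthetic data with `validation_split` purely so the block runs on its own; the numbers it prints are not the lecture's results.

```python
import numpy as np
from tensorflow import keras

# Rebuild and compile the wide and deep model so this block is self-contained.
input_ = keras.layers.Input(shape=(8,))
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([hidden2, input_])
output = keras.layers.Dense(1)(concat)
model = keras.models.Model(inputs=input_, outputs=output)
model.compile(loss="mean_squared_error",
              optimizer=keras.optimizers.SGD(learning_rate=0.001),
              metrics=["mae"])

# Synthetic stand-in data (assumption), with a quarter held out for validation.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8)).astype("float32")
y = rng.normal(size=(64, 1)).astype("float32")

model_history = model.fit(X, y, epochs=3, validation_split=0.25, verbose=0)

# history is a dict: one list per tracked quantity, one entry per epoch.
for name, values in model_history.history.items():
    print(name, [round(float(v), 4) for v in values])
```

Each list can be passed to any plotting library to draw the learning curves; a curve that flattens out suggests the model has converged.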

90
00:10:39,950 --> 00:10:50,140
And you can also plot this on a graph, just like we did earlier. So the important thing here is that

91
00:10:50,270 --> 00:10:56,760
our validation loss, or validation MAE value, is still decreasing.

92
00:10:56,890 --> 00:11:02,430
So there is scope for further improvement in the accuracy of our model.

93
00:11:03,400 --> 00:11:08,760
So let's just run this model for 40 more epochs as well.

94
00:11:20,740 --> 00:11:27,560
Let's calculate the test values as well.

95
00:11:27,840 --> 00:11:33,520
You can see now the loss has decreased to 0.26.

96
00:11:33,600 --> 00:11:37,240
Earlier we were getting around 0.3.

97
00:11:37,270 --> 00:11:47,420
Now we have 0.26 as the loss value. We can also plot this graph again to see whether the

98
00:11:47,500 --> 00:11:50,190
model has converged.

99
00:11:54,230 --> 00:11:54,880
You can see

100
00:11:55,190 --> 00:12:00,620
this is almost a straight line, so we can say that the model has converged.

101
00:12:00,950 --> 00:12:13,370
And the test value for our model, the MSE value for our model, is 0.261 on the test set. This

102
00:12:13,370 --> 00:12:21,610
is somewhat similar to the accuracy we got earlier with our normal MLP model, but still I would say

103
00:12:21,610 --> 00:12:30,810
that our normal MLP model was performing better than this deep and wide network. That's all for this lecture.

104
00:12:30,870 --> 00:12:31,800
In the next lecture,

105
00:12:31,800 --> 00:12:39,820
we will see how to save our model and how to save checkpoints at the end of each epoch.

106
00:12:39,840 --> 00:12:40,260
‫Thank you.

