1
00:00:00,090 --> 00:00:08,370
If you recall from the previous sections, the dense layer is built such that if we

2
00:00:08,370 --> 00:00:15,330
have this as our dense layer and that we have this input right here, let's call this I, let's say

3
00:00:15,330 --> 00:00:22,560
it's x. We have this input x, then we have a certain m times x.

4
00:00:23,380 --> 00:00:25,090
Let's see.

5
00:00:25,510 --> 00:00:28,060
So this m is actually the weights.

6
00:00:28,060 --> 00:00:30,640
So the weights times X.

7
00:00:31,370 --> 00:00:33,460
Plus the bias.

8
00:00:33,740 --> 00:00:41,000
b. Let's call this b, and then this is now equal to our output.

9
00:00:41,150 --> 00:00:44,750
So here we have an output y, just like y equals mx plus c.

10
00:00:44,900 --> 00:00:53,930
So this simply means if we want to recreate the dense layer, then we have to take this into consideration

11
00:00:53,930 --> 00:00:58,970
when we define the layer from scratch.

12
00:00:59,960 --> 00:01:03,830
That said, we can define NeuralearnDense.

13
00:01:03,830 --> 00:01:09,050
So it's like our custom dense layer. NeuralearnDense is going to

14
00:01:10,380 --> 00:01:14,970
inherit from Layer, and then right here, this class.

15
00:01:14,970 --> 00:01:18,180
So we have this class, which inherits from Layer.

16
00:01:18,300 --> 00:01:32,820
Then from here we have our __init__ method, and then in the super call we pass in NeuralearnDense.

17
00:01:33,090 --> 00:01:33,930
There we go.

18
00:01:34,230 --> 00:01:36,690
We have that, dot __init__.

19
00:01:38,040 --> 00:01:38,640
So that's it.

20
00:01:39,480 --> 00:01:46,410
Now, from here, if you notice, whenever we create a dense layer, let's scroll up here, whenever

21
00:01:46,410 --> 00:01:53,880
we create a dense layer, we generally have to specify at least, because this is by default, this

22
00:01:53,880 --> 00:02:03,300
is not by default, so here we need to pass in this value to specify the number of output units.

23
00:02:03,690 --> 00:02:09,220
Now, that said, we have to take that into consideration when building our NeuralearnDense layer.

24
00:02:09,240 --> 00:02:12,900
So here we have output units.

25
00:02:13,110 --> 00:02:15,360
There we go, we have our output units.

26
00:02:15,570 --> 00:02:24,210
And then in here we'll define self.output_units to be equal to output_units.

27
00:02:24,450 --> 00:02:25,200
So that's it.

28
00:02:25,560 --> 00:02:29,430
Now, from this point, we are going to build this layer.

29
00:02:29,430 --> 00:02:34,830
In order to build this layer, we have to take into consideration this definition right here.

30
00:02:36,630 --> 00:02:40,440
But this definition, put out the way it is, isn't very clear.

31
00:02:40,800 --> 00:02:43,350
Now, let's break this up.

32
00:02:43,350 --> 00:02:51,620
So suppose now we have this input of shape batch size by, let's say, number of features.

33
00:02:51,630 --> 00:02:54,090
So suppose we have this input.

34
00:02:54,480 --> 00:03:03,210
Now what happens here is this input is going to be multiplied by this weight, so it's going to multiply

35
00:03:03,210 --> 00:03:06,060
by this weight, which happens to be a matrix.

36
00:03:06,330 --> 00:03:13,830
Now we take this and multiply by that matrix, and for the multiplication to be valid, we need to ensure

37
00:03:13,830 --> 00:03:19,350
that the number of columns we have here matches with the number of rows of this matrix.

38
00:03:19,470 --> 00:03:23,310
But then what are the dimensions of this matrix?

39
00:03:24,030 --> 00:03:33,900
We have to note that this matrix has to be defined such that it has a shape of F,

40
00:03:34,350 --> 00:03:36,540
where this F must match the number of columns of the input,

41
00:03:37,430 --> 00:03:40,160
by the number of output units.

42
00:03:40,160 --> 00:03:45,320
So if you want the number of output units, for example, to be one, then you should have one.

43
00:03:45,320 --> 00:03:52,460
And that's why, when defining the dense layer, we don't need to specify this F, as this

44
00:03:53,760 --> 00:03:59,460
value is obtained automatically from the number of columns in the inputs.

45
00:03:59,460 --> 00:04:04,450
If we don't use this value, we are going to have an error.

46
00:04:04,470 --> 00:04:13,050
So TensorFlow takes this, automatically fits it in here, and then collects the argument you pass in to Dense.

47
00:04:13,050 --> 00:04:19,950
So when you specify, when you say you have this dense like this, and then you say, for example one

48
00:04:19,950 --> 00:04:22,770
and maybe some activation, then you have that.

49
00:04:22,770 --> 00:04:25,080
So let's say we have some activation here.

50
00:04:25,110 --> 00:04:31,070
Now, once this one gets here, this weights matrix is now defined.

51
00:04:31,080 --> 00:04:37,790
So the input you pass in to this Dense is going to determine the number of rows we have.

52
00:04:37,800 --> 00:04:43,560
But then what you pass in as the argument here is going to give us the number of columns we're

53
00:04:43,560 --> 00:04:46,620
going to have for this weight matrix right here.

54
00:04:46,830 --> 00:04:50,190
So that said, we have F by one.

55
00:04:50,190 --> 00:04:55,500
And then when we multiply this, we're going to have B by one.

56
00:04:55,500 --> 00:05:02,550
So we now understand how we get this output, and then we have plus B by one,

57
00:05:03,180 --> 00:05:04,010
So that's it.

58
00:05:05,850 --> 00:05:08,220
where this one comes from the bias.

59
00:05:08,640 --> 00:05:15,450
Now once we add this up, we have an output of B by one, and that's our Y, that's the shape of our

60
00:05:15,450 --> 00:05:15,990
Y.
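
As a quick check of the shapes discussed here, this is a minimal NumPy sketch. The sizes 4, 3 and 1 are made up for illustration and are not taken from the video.

```python
import numpy as np

# Hypothetical sizes: B = batch size, F = number of features, one output unit.
batch_size, num_features, output_units = 4, 3, 1

x = np.ones((batch_size, num_features))    # input, shape (B, F)
w = np.ones((num_features, output_units))  # weights, shape (F, output_units)
b = np.zeros((output_units,))              # bias, shape (output_units,)

# (B, F) @ (F, output_units) gives (B, output_units).
# The bias is broadcast across the batch dimension when it is added.
y = x @ w + b

print(y.shape)
```

The number of rows of the weight matrix comes from the input, and the number of columns comes from the output units, which is why only the output units need to be specified by hand.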

61
00:05:16,950 --> 00:05:20,010
If that's understood, we will move on to building.

62
00:05:20,010 --> 00:05:26,330
So here we have our build method, we have self and for now let's keep it that way.

63
00:05:26,340 --> 00:05:27,630
So there we go.

64
00:05:27,630 --> 00:05:31,470
We have this build method and then we're going to define our weights.

65
00:05:31,470 --> 00:05:36,750
So we have self.weights, let's specify weights equal to that.

66
00:05:36,750 --> 00:05:42,660
And then here we're going to have the self.add_weight method.

67
00:05:42,660 --> 00:05:47,730
So this actually comes with this layer class right here.

68
00:05:47,730 --> 00:05:51,720
So we're able to call this because we are inheriting from the layer class.

69
00:05:51,720 --> 00:05:55,740
So here we have self.add_weight, and now we specify the shape.

70
00:05:55,740 --> 00:05:56,820
So guess what?

71
00:05:56,820 --> 00:06:05,880
We are going to have our number of rows, so n rows, which is going to come from the input, and then

72
00:06:05,880 --> 00:06:11,460
we are going to have the number of columns, which is going to come from the output units you specified,

73
00:06:11,460 --> 00:06:16,260
which you pass in when calling the NeuralearnDense layer.

74
00:06:16,260 --> 00:06:23,320
So here we're going to have self.output_units, so we get the number of output units.

75
00:06:23,340 --> 00:06:26,460
Now how do we obtain this number of input units?

76
00:06:26,460 --> 00:06:27,830
We're going to look at that shortly.

77
00:06:27,840 --> 00:06:31,500
For now we have the weights and then we have that.

78
00:06:31,500 --> 00:06:33,720
And then let's go to bias.

79
00:06:33,720 --> 00:06:38,160
So we have self.biases and self.

80
00:06:38,400 --> 00:06:41,340
This is also a weight, so add_weight.

81
00:06:42,660 --> 00:06:49,140
And then we specify just the number of output units, since it's one-dimensional.

82
00:06:49,140 --> 00:06:54,510
So we have output units, there we go.

83
00:06:54,510 --> 00:07:01,020
And so at this point we've defined our weights and our bias matrix.

84
00:07:01,020 --> 00:07:03,840
The bias is actually a weight too, so it's just similar.

85
00:07:03,840 --> 00:07:04,670
So that's it.

86
00:07:04,680 --> 00:07:06,410
Now let's go into call.

87
00:07:06,420 --> 00:07:08,370
So we have our call method.

88
00:07:09,030 --> 00:07:13,230
which should actually get the job done, and then we have our input.

89
00:07:13,230 --> 00:07:23,460
So let's call this input, let's add an s, or let's say input_data or input_features.

90
00:07:23,460 --> 00:07:24,480
So there we go.

91
00:07:24,480 --> 00:07:25,590
So we have that.

92
00:07:25,590 --> 00:07:33,540
And then what we're going to do here is we're going to return simply the matrix multiplication as we've

93
00:07:33,540 --> 00:07:35,760
seen already of the weights.

94
00:07:35,760 --> 00:07:45,660
So we have self.weights and this input_features. Recall, it's actually the input features times the

95
00:07:45,660 --> 00:07:46,290
weights.

96
00:07:46,620 --> 00:07:48,420
Let's get back to this here.

97
00:07:48,420 --> 00:07:54,330
We take the input times the weights, because if we have the weights times the

98
00:07:54,330 --> 00:08:03,650
input, we will have F by one times B by F. Recall, this is our weight shape and this is our

99
00:08:03,660 --> 00:08:05,100
input shape.

100
00:08:05,100 --> 00:08:07,830
You see that the inner dimensions are not always the same.

101
00:08:07,830 --> 00:08:11,190
So it's not going to work, we're going to get an error.
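
To make the ordering issue concrete, here is a minimal NumPy sketch with made-up sizes (B = 4, F = 3, one output unit). It only illustrates the shape rule. It is not the code typed in the video.

```python
import numpy as np

x = np.ones((4, 3))  # input, shape (B, F)
w = np.ones((3, 1))  # weights, shape (F, output_units)

# Input times weights: the inner dimensions (3 and 3) match.
print((x @ w).shape)

# Weights times input: (3, 1) @ (4, 3), the inner dimensions (1 and 4) differ.
try:
    w @ x
    order_ok = True
except ValueError as err:
    order_ok = False
    print("weights @ input failed:", err)
```

With these shapes, only the input times the weights is a valid matrix product.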

102
00:08:11,190 --> 00:08:17,880
And so that said, what we're going to do is we're going to just simply have input features, right

103
00:08:17,880 --> 00:08:18,360
here.

104
00:08:19,340 --> 00:08:20,150
So that's it.

105
00:08:20,150 --> 00:08:22,950
And then we add up the bias.

106
00:08:22,970 --> 00:08:30,170
So here we have self.biases, which is from this one right here.

107
00:08:30,290 --> 00:08:34,860
Now, where we're going to get this number of rows is going to be easy.

108
00:08:34,880 --> 00:08:42,490
All we need to do is specify here that we have the input_features_shape.

109
00:08:42,500 --> 00:08:49,340
We have our input_features_shape, which is going to come automatically from this, and then to get the

110
00:08:49,340 --> 00:08:54,740
number of rows, what we need to do here is have input features

111
00:08:55,620 --> 00:09:00,630
shape, and then we get that last dimension.

112
00:09:00,810 --> 00:09:02,370
So let's get back to this.

113
00:09:02,370 --> 00:09:13,320
Recall, here we have B by F, that's B, F. And so the weights need to be F

114
00:09:14,130 --> 00:09:15,480
by the output.

115
00:09:15,960 --> 00:09:23,040
And so to get this F, we just need to take the input shape and then get this last element right here, which

116
00:09:23,040 --> 00:09:24,450
is the number

117
00:09:25,720 --> 00:09:27,310
of columns we have here.

118
00:09:27,640 --> 00:09:28,480
So that's that.

119
00:09:28,570 --> 00:09:36,010
We will just have input_features_shape, take this index, and that would be good.

120
00:09:36,010 --> 00:09:42,880
So that's how we obtain this value automatically, the number of rows we've been looking for.

121
00:09:42,880 --> 00:09:44,500
So now we've gotten this.

122
00:09:44,500 --> 00:09:45,940
Everything seems fine.

123
00:09:46,120 --> 00:09:49,660
Next thing to do is specify that it's trainable.

124
00:09:49,660 --> 00:09:54,970
So we have to specify that these weights are trainable, because in some cases we want weights which shouldn't

125
00:09:54,970 --> 00:09:55,540
be trainable.

126
00:09:55,540 --> 00:09:57,790
So self.trainable equals

127
00:09:58,300 --> 00:09:59,110
True.

128
00:09:59,500 --> 00:10:05,980
Now, mind you, this is an argument, it's one of the arguments which are passed to

129
00:10:05,980 --> 00:10:08,500
this add weight method right here.

130
00:10:08,500 --> 00:10:09,580
So we just have that.

131
00:10:09,580 --> 00:10:15,130
And then here again, we have this trainable equals True.

132
00:10:15,490 --> 00:10:22,030
Now, apart from that, we could randomly initialize our weights and biases.

133
00:10:22,030 --> 00:10:28,810
So here we have random normal, random normal initialization.

134
00:10:29,700 --> 00:10:30,450
That's fine.

135
00:10:31,410 --> 00:10:32,820
And then we do the same here.

136
00:10:32,820 --> 00:10:35,400
So we have our initializer

137
00:10:37,770 --> 00:10:40,200
equals random normal.

138
00:10:41,400 --> 00:10:42,120
That's good.

139
00:10:42,770 --> 00:10:43,610
We now run.

140
00:10:43,610 --> 00:10:48,200
This is our NeuralearnDense layer.

141
00:10:48,200 --> 00:10:52,820
And then after running this, let's get to integrating it.

142
00:10:52,850 --> 00:10:54,320
Let's make it quite simple.

143
00:10:54,320 --> 00:10:57,360
We'll just use our Sequential API.

144
00:10:57,380 --> 00:11:01,040
So let's get back to the sequential API we had built initially.

145
00:11:01,820 --> 00:11:02,480
Copy that.

146
00:11:02,480 --> 00:11:09,380
And then we are going to integrate this new dense layer, this Neuralearn custom dense layer.

147
00:11:10,040 --> 00:11:11,210
So there you go.

148
00:11:11,210 --> 00:11:11,960
We have that.

149
00:11:11,960 --> 00:11:17,000
And then instead of the Dense layer here, we have the Neuralearn custom dense layer.

150
00:11:17,000 --> 00:11:22,940
So you see that you are able to create your own layers with TensorFlow.

151
00:11:23,540 --> 00:11:24,320
So that's it.

152
00:11:25,190 --> 00:11:25,870
NeuralearnDense.

153
00:11:25,880 --> 00:11:28,250
So NeuralearnDense, NeuralearnDense.

154
00:11:28,280 --> 00:11:32,300
Now, you see there is an error, because we don't take the activation into consideration.

155
00:11:33,290 --> 00:11:38,180
And so what we could do is we could get back right here and see.

156
00:11:40,230 --> 00:11:45,150
if the activation is equal to relu.

157
00:11:45,420 --> 00:11:49,560
So if activation is equal to relu, we'll return this.

158
00:11:49,560 --> 00:11:58,560
And then else, let's say elif activation is equal to sigmoid,

159
00:11:59,460 --> 00:12:00,680
we return that.

160
00:12:00,690 --> 00:12:03,180
Oops, let's get back.

161
00:12:04,350 --> 00:12:05,490
We have this.

162
00:12:05,490 --> 00:12:11,370
So we're going to return this. Copy this, and then paste it right here.

163
00:12:11,370 --> 00:12:14,760
And then we have else the same.

164
00:12:14,760 --> 00:12:18,180
So now let's get into this and see the modifications we're going to make.

165
00:12:19,140 --> 00:12:20,100
There we go.

166
00:12:20,130 --> 00:12:23,250
So here we have our relu. In the case of relu,

167
00:12:23,280 --> 00:12:30,190
what we want to have here is tf.nn.relu of that.

168
00:12:30,270 --> 00:12:32,850
So we're going to pass this in to our relu.

169
00:12:32,850 --> 00:12:39,990
And then if the activation is sigmoid, we should have tf.math.sigmoid.

170
00:12:40,140 --> 00:12:40,980
So that's it.

171
00:12:41,970 --> 00:12:42,570
There we go.

172
00:12:42,570 --> 00:12:43,410
We have that.

173
00:12:43,410 --> 00:12:45,950
And then this one we just maintain.

174
00:12:45,960 --> 00:12:51,330
So we have modified our code so that we now integrate the activations.

175
00:12:52,050 --> 00:12:52,950
There we go.

176
00:12:54,180 --> 00:12:55,110
That sounds fine.

177
00:12:55,110 --> 00:12:57,360
So let's run this and then.

178
00:12:59,590 --> 00:13:01,090
We get this error.

179
00:13:01,930 --> 00:13:06,940
So here we need the double equals, that's Python syntax.

180
00:13:07,180 --> 00:13:11,830
And then we come down here, everything seems fine.

181
00:13:13,030 --> 00:13:14,320
Let's check this again.

182
00:13:14,680 --> 00:13:15,610
Yeah, We're supposed to have.

183
00:13:15,610 --> 00:13:17,770
We're supposed to specify activation.

184
00:13:17,920 --> 00:13:24,440
So we're supposed to have activation, and then self.activation

185
00:13:24,580 --> 00:13:26,320
equals activation.

186
00:13:26,680 --> 00:13:29,170
And then here we have self

187
00:13:29,170 --> 00:13:30,580
dot activation.

188
00:13:31,000 --> 00:13:31,720
Self

189
00:13:31,720 --> 00:13:35,530
dot activation. We run that again, and it looks fine.

190
00:13:35,740 --> 00:13:38,230
And then now what we do is we pass.

191
00:13:38,230 --> 00:13:39,910
We just simply run this.

192
00:13:39,910 --> 00:13:42,100
So we run that and.

193
00:13:43,720 --> 00:13:47,880
We get an error right here. To solve this problem,

194
00:13:47,890 --> 00:13:51,520
what we're going to do is include the shape keyword here.

195
00:13:51,520 --> 00:13:57,790
So we're going to have shape equals that, and then shape equals this. We run that,

196
00:13:57,790 --> 00:14:00,790
and we should have no error. Again, an error.

197
00:14:01,090 --> 00:14:06,790
Oh yeah, it says: can't set the attribute weights, likely because it conflicts with an existing read-only

198
00:14:06,790 --> 00:14:09,370
property of the object.
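
The idea behind this error can be reproduced in plain Python. The sketch below uses two hypothetical classes, Base and Child, as stand-ins. It is not the Keras implementation. It only shows what happens when a subclass assigns to a name that the parent exposes as a read-only property.

```python
class Base:
    """Hypothetical stand-in for a parent class with a read-only `weights` property."""

    def __init__(self):
        self._weights = []

    @property
    def weights(self):
        # There is no setter, so `self.weights = ...` is not allowed.
        return list(self._weights)


class Child(Base):
    def build(self):
        try:
            self.weights = [1.0]  # conflicts with the read-only property
            return "assigned"
        except AttributeError:
            self.w = [1.0]        # a different attribute name works fine
            return "used self.w instead"


result = Child().build()
print(result)  # used self.w instead
```

Picking another attribute name, such as w, avoids the conflict.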

199
00:14:09,370 --> 00:14:09,850
So.

200
00:14:09,850 --> 00:14:10,180
Right.

201
00:14:10,180 --> 00:14:10,720
Yeah.

202
00:14:10,720 --> 00:14:16,750
Instead of using weights, we're just going to say w. So we have the w, and then

203
00:14:17,770 --> 00:14:18,160
Right.

204
00:14:18,160 --> 00:14:18,370
Yeah.

205
00:14:18,370 --> 00:14:26,590
We should have w. Now, since we don't want to repeat this over, what we'll do is have pre_output.

206
00:14:27,010 --> 00:14:37,870
So we have our pre_output, which is this here, and paste that out. And then here we have w, right.

207
00:14:37,870 --> 00:14:38,050
Yeah.

208
00:14:38,050 --> 00:14:49,030
We have b, that's it. And now, if this, then we pass the pre_output, and we do the same right here.

209
00:14:49,390 --> 00:14:54,370
So we have just the pre_output before the activation.

210
00:14:55,420 --> 00:14:57,550
Now here we just have pre output actually.

211
00:14:57,550 --> 00:14:59,260
So we just have pre output.

212
00:14:59,350 --> 00:15:01,360
Take that off. Now, running

213
00:15:01,360 --> 00:15:02,350
this should be fine.

214
00:15:02,350 --> 00:15:04,660
So let's run that and see what we get.

215
00:15:04,960 --> 00:15:11,230
We run this, we run that, and there we go. We have our model, the same exact model we have been building

216
00:15:11,230 --> 00:15:14,020
right from the start, but this time around we are

217
00:15:14,020 --> 00:15:19,150
using a custom dense layer, which is our NeuralearnDense layer.
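
Putting the steps described so far together, the custom layer might look roughly like the sketch below. The names NeuralearnDense, output_units, activation, w, b, input_features and pre_output follow the narration. The default activation=None, the "random_normal" string form of the initializer and the test input are assumptions made for illustration, so the exact code in the video may differ.

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer


class NeuralearnDense(Layer):
    def __init__(self, output_units, activation=None):
        super(NeuralearnDense, self).__init__()
        self.output_units = output_units
        self.activation = activation

    def build(self, input_features_shape):
        # Rows come from the last input dimension, columns from the output units.
        self.w = self.add_weight(
            shape=(input_features_shape[-1], self.output_units),
            initializer="random_normal",
            trainable=True,
        )
        # The bias is one-dimensional: one value per output unit.
        self.b = self.add_weight(
            shape=(self.output_units,),
            initializer="random_normal",
            trainable=True,
        )

    def call(self, input_features):
        # Input times weights, plus bias, computed once before the activation.
        pre_output = tf.matmul(input_features, self.w) + self.b
        if self.activation == "relu":
            return tf.nn.relu(pre_output)
        elif self.activation == "sigmoid":
            return tf.math.sigmoid(pre_output)
        else:
            return pre_output


# Assumed toy input: a batch of 4 samples with 3 features each.
layer = NeuralearnDense(1, activation="sigmoid")
out = layer(tf.ones((4, 3)))
print(out.shape)
```

The weights are named w and b rather than weights because the Layer class already exposes a read-only weights property.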

218
00:15:19,180 --> 00:15:24,340
Now let's go ahead and compile this model and then train it.

219
00:15:26,140 --> 00:15:32,140
Here, we've actually maintained this name, so we have to change it.

220
00:15:32,770 --> 00:15:34,600
Here it's lenet_model.

221
00:15:34,600 --> 00:15:36,730
Let's say lenet_custom.

222
00:15:37,900 --> 00:15:39,460
Yeah, Custom.

223
00:15:40,030 --> 00:15:41,710
Uh, custom model.

224
00:15:41,710 --> 00:15:42,100
Yeah.

225
00:15:42,760 --> 00:15:43,180
Okay.

226
00:15:43,180 --> 00:15:44,170
We copied that.

227
00:15:45,310 --> 00:15:47,560
We have our lenet_custom_model.

228
00:15:48,880 --> 00:15:50,200
Let's have it here.

229
00:15:52,000 --> 00:15:52,690
There we go.

230
00:15:52,690 --> 00:15:53,380
We run it.

231
00:15:54,880 --> 00:15:56,920
That's fine, right?

232
00:15:56,920 --> 00:15:57,130
Yeah.

233
00:15:57,130 --> 00:15:58,900
We have lenet_custom.

234
00:15:59,050 --> 00:16:00,070
We run that.

235
00:16:00,850 --> 00:16:02,800
Here, we have lenet_custom.

236
00:16:02,800 --> 00:16:05,290
We run that, and then let's check out.

237
00:16:07,090 --> 00:16:13,510
As you can see with the training, there's a slight difference in the loss values and accuracy we

238
00:16:13,510 --> 00:16:15,250
are getting for this first epoch.

239
00:16:15,250 --> 00:16:22,750
And most probably this is coming from this random normal initialization we've chosen here.

240
00:16:22,750 --> 00:16:32,890
Whereas with the standard Dense layer, as you can see here, the kernel initialization, or the weight initialization,

241
00:16:32,890 --> 00:16:38,500
uses Glorot uniform and the bias initialization is zeros.

242
00:16:38,500 --> 00:16:44,400
So there we have all zeros for the biases, and the weights use the Glorot uniform method.
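
To get a feel for how the two initializations differ, here is a small standard-library sketch. It assumes the usual Glorot uniform rule, sampling from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), and a normal distribution with mean 0 and standard deviation 0.05, which is the Keras RandomNormal default. The layer sizes and the helper name glorot_uniform_limit are hypothetical.

```python
import math
import random


def glorot_uniform_limit(fan_in, fan_out):
    """Bound of the uniform range used by Glorot (Xavier) uniform initialization."""
    return math.sqrt(6.0 / (fan_in + fan_out))


# Hypothetical layer: 3 input features, 1 output unit.
fan_in, fan_out = 3, 1
limit = glorot_uniform_limit(fan_in, fan_out)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
glorot_samples = [rng.uniform(-limit, limit) for _ in range(5)]
normal_samples = [rng.gauss(0.0, 0.05) for _ in range(5)]

print("Glorot uniform limit:", limit)
print("Glorot uniform samples:", glorot_samples)
print("Random normal samples:", normal_samples)
```

For a small layer like this one, the Glorot range is much wider than the spread of the random normal samples, which is one plausible reason the first-epoch numbers differ.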

243
00:16:44,410 --> 00:16:50,050
So here, as you could see, this isn't performing as well as what we had previously, but scrolling

244
00:16:50,050 --> 00:16:51,520
up, there is this error.

245
00:16:51,520 --> 00:16:53,380
So yeah, it was meant to be sigmoid.

246
00:16:53,500 --> 00:17:02,200
So let's stop this, let's interrupt this training and then get back and run this.

247
00:17:02,320 --> 00:17:09,130
So we will run this and compile and then start with the training again.

248
00:17:10,060 --> 00:17:11,710
This time around it looks better.

249
00:17:11,710 --> 00:17:14,200
It looks more like what we should expect.

250
00:17:14,620 --> 00:17:21,340
So even though we're not using the same initialization method, this training process looks quite similar

251
00:17:21,340 --> 00:17:28,240
to what we would have had in the case of the initialization method, which is used with a standard dense

252
00:17:28,240 --> 00:17:28,750
layer.

253
00:17:28,750 --> 00:17:33,130
And here is what we get after training for over five epochs.

254
00:17:33,130 --> 00:17:36,670
Thank you for getting up to this point and see you next time.
