1
00:00:00,360 --> 00:00:00,990
Hi there,

2
00:00:00,990 --> 00:00:06,390
And welcome to this new and exciting session in which we shall be looking at different strategies to

3
00:00:06,390 --> 00:00:07,620
reduce overfitting.

4
00:00:07,620 --> 00:00:14,940
In the YOLOv1 paper, some strategies were outlined to avoid overfitting: they use dropout

5
00:00:14,940 --> 00:00:17,100
and extensive data augmentation.

6
00:00:17,100 --> 00:00:25,680
Now, a dropout layer with rate 0.5 after the first fully connected layer prevents co-adaptation between layers.

7
00:00:25,680 --> 00:00:31,980
And so you see that after this fully connected layer, we are going to have the dropout and we're going

8
00:00:31,980 --> 00:00:34,860
to give it parameter 0.5.

9
00:00:34,890 --> 00:00:42,390
Then for the data augmentation, the authors introduce random scaling and translations of up to 20%

10
00:00:42,390 --> 00:00:44,670
of the original image size.

11
00:00:44,670 --> 00:00:53,670
Then they also randomly adjust the exposure and saturation of the image by up to a factor of 1.5 in the HSV

12
00:00:53,670 --> 00:00:54,750
color space.

13
00:00:54,750 --> 00:01:00,570
So that said, we are going to break up our data augmentation strategies into two main categories.

14
00:01:00,570 --> 00:01:08,490
The very first category will entail modifying the pixel values without modifying the positions of the

15
00:01:08,490 --> 00:01:09,660
different objects.

16
00:01:09,660 --> 00:01:11,670
So we could have something like this.

17
00:01:11,670 --> 00:01:19,140
Let's click on edit right here, and then let's try to, say, brighten up the image.

18
00:01:19,290 --> 00:01:21,870
You see, we could modify this like this.

19
00:01:21,870 --> 00:01:30,270
So we go from this initial image to this by playing around with the brightness, playing around with colorization

20
00:01:30,270 --> 00:01:31,500
and so on and so forth.

21
00:01:31,500 --> 00:01:38,700
So this first category, as we've said already, entails just modifying the different pixel values without

22
00:01:38,700 --> 00:01:42,690
any change in position of any object we have here.

23
00:01:42,690 --> 00:01:51,210
And then for the second category, we could go from this image to this one where you see that this flipping

24
00:01:51,210 --> 00:02:00,870
has made this object's position go from here to this position right here.

25
00:02:00,870 --> 00:02:08,550
Now, in the first case, where we just modify, for example, the image brightness, there are little or no

26
00:02:08,580 --> 00:02:12,030
updates made to our existing code base.

27
00:02:12,030 --> 00:02:19,710
But when we have to modify the image such that the bounding boxes have to be changed or the positions

28
00:02:19,710 --> 00:02:25,770
of the bounding boxes have to be changed, like this one here, or this one which will go from here to

29
00:02:25,800 --> 00:02:31,230
here, or this one here which goes from here to this other one,

30
00:02:31,230 --> 00:02:39,000
it means that we are now updating these bounding boxes, and so we would have to write some extra code

31
00:02:39,000 --> 00:02:41,940
for all these different modifications.

32
00:02:42,090 --> 00:02:50,100
Now, nonetheless, it turns out that when we work with a library like Albumentations... let's take this off.

33
00:02:50,100 --> 00:02:57,990
When we work with Albumentations, all changes made in the positions of the bounding boxes are carried

34
00:02:57,990 --> 00:02:59,340
out automatically.

35
00:02:59,340 --> 00:03:05,370
So you could see here we have this input image with this dog, this tennis ball and this cat.

36
00:03:05,370 --> 00:03:12,630
And then after going through some transformation, like here we see we have some transformation on this

37
00:03:12,630 --> 00:03:14,970
image where first of all, the image is flipped.

38
00:03:14,970 --> 00:03:21,870
So you see the dog moves to this other position, and then the image also appears zoomed in.

39
00:03:21,870 --> 00:03:31,230
So you see that this cat, for example, is now not as complete, or we don't have the complete cat as

40
00:03:31,230 --> 00:03:33,240
we had in this original image.

41
00:03:33,720 --> 00:03:37,500
And so now we have completely different bounding boxes.

42
00:03:37,500 --> 00:03:45,210
This one, for example, becomes this, this tennis ball becomes this, this dog's bounding box becomes

43
00:03:45,210 --> 00:03:45,510
this.

44
00:03:45,510 --> 00:03:49,320
So you see that it becomes larger compared to the input.

45
00:03:49,320 --> 00:03:56,940
And so with Albumentations, as we were saying, you have these input bounding boxes, like you could see

46
00:03:56,940 --> 00:04:11,070
for the dog at this position, 23, 74 and 295, 388, which is automatically converted to one 4969, like you

47
00:04:11,070 --> 00:04:22,290
see here, 295 381. So you just need to define your transformations, and then Albumentations

48
00:04:22,410 --> 00:04:27,960
makes sure you have the right bounding boxes as output.

49
00:04:28,470 --> 00:04:33,210
So now diving into the code as you might have seen in some previous sessions, we are going to import

50
00:04:33,510 --> 00:04:34,830
Albumentations.

51
00:04:34,830 --> 00:04:37,710
So that's the import of Albumentations.

52
00:04:37,710 --> 00:04:41,730
We now move on to integrate our transforms.

53
00:04:41,730 --> 00:04:45,870
You see right here we have these different transforms.

54
00:04:45,870 --> 00:04:48,360
The first thing we'll do is to resize our image.

55
00:04:48,360 --> 00:04:53,010
So it's 224 by 224, and then we'll apply a random crop.

56
00:04:53,010 --> 00:04:59,830
Now this random crop is applied such that the output image will have a

57
00:05:00,090 --> 00:05:06,120
width lying between 200 and 224 and a height lying between 200 and 224.

58
00:05:06,120 --> 00:05:18,920
So we could just have this height, or let's say height minus 20, or we could just say 0.9 times the height.

59
00:05:18,930 --> 00:05:21,300
So we have that.

60
00:05:21,300 --> 00:05:28,350
We go from that to and then here we also have 90% of the width.

61
00:05:28,350 --> 00:05:32,310
So we have 0.9 of the width.

62
00:05:33,060 --> 00:05:37,530
Take that off and we go here we have the width.

63
00:05:37,560 --> 00:05:38,190
Okay.

64
00:05:38,190 --> 00:05:43,980
So what we're saying here is, as we've said already, we want to randomly crop the image.

65
00:05:43,980 --> 00:05:51,090
So we have the image. Now our new image will have a width which will fall in this range and a height

66
00:05:51,090 --> 00:05:58,890
which will fall in this range, and then we have a probability of 0.5 of applying this transformation.

67
00:05:58,890 --> 00:06:07,800
So if you want this transformation to always be applied, then you could set always_apply to

68
00:06:07,800 --> 00:06:08,430
True.

69
00:06:08,460 --> 00:06:16,570
If not, you will just have p, the probability of applying it, set to 0.5 or maybe even 0.2 or, say,

70
00:06:16,620 --> 00:06:18,930
0.8, it just depends on you.

71
00:06:18,930 --> 00:06:20,190
So that's it.

72
00:06:20,190 --> 00:06:23,910
Now for the next we have this random scaling here.

73
00:06:23,910 --> 00:06:25,800
We specify the scale limit.

74
00:06:25,800 --> 00:06:30,330
We have the interpolation type and then again we have this probability.

75
00:06:30,330 --> 00:06:33,270
Then, here, we have the horizontal flip.

76
00:06:33,270 --> 00:06:38,400
So we're going to apply a horizontal flip with the probability set to 0.5.

77
00:06:38,430 --> 00:06:44,370
Now, finally, because after doing this random crop we will have an image which is not 224

78
00:06:44,370 --> 00:06:48,840
by 224, we actually resize it back to 224 by 224.

79
00:06:48,840 --> 00:06:49,800
So that's it.

80
00:06:49,800 --> 00:06:50,940
That's all you need.

81
00:06:50,940 --> 00:06:56,280
Those are all our transformations here, which we pass into this Compose method.

82
00:06:56,280 --> 00:07:02,640
So it's basically a list made of this transform, this transform, this transform, this and this.

83
00:07:02,640 --> 00:07:10,530
Now, one additional argument we're going to pass into this Compose method is these bounding box parameters.

84
00:07:10,530 --> 00:07:17,620
And the reason why we need to pass this is simply because, unlike with the image classification, with

85
00:07:17,620 --> 00:07:23,550
the object detection, we have bounding boxes which are going to be modified.

86
00:07:23,550 --> 00:07:26,150
So here we specify these bounding box params.

87
00:07:26,200 --> 00:07:30,090
So we take into consideration the kinds of boxes we're dealing with.

88
00:07:30,090 --> 00:07:33,060
And here you notice that we specify the format "yolo".

89
00:07:33,720 --> 00:07:37,020
Now getting back to the documentation, we actually have four formats.

90
00:07:37,020 --> 00:07:41,550
We have the Pascal VOC format, the Albumentations format, the COCO format.

91
00:07:41,880 --> 00:07:46,080
Here, the YOLO format, Pascal VOC, Albumentations.

92
00:07:46,080 --> 00:07:52,320
Now it turns out that in our specific case we're actually dealing with the YOLO format, not just because we're

93
00:07:52,320 --> 00:08:00,690
building a YOLO model, but because the way we've normalized our inputs or processed our inputs is such that

94
00:08:00,690 --> 00:08:09,390
we have our x center, y center width and height representing our bounding boxes.

95
00:08:09,390 --> 00:08:14,100
So if we had instead x min, y min, width, height, we would have picked this.

96
00:08:14,100 --> 00:08:19,350
So it's not because of the name, although it actually coincides with the fact

97
00:08:19,350 --> 00:08:21,420
that we're building a YOLO model.

98
00:08:21,420 --> 00:08:26,160
But then, as we said, we have an x center, y center, width, height.

99
00:08:26,160 --> 00:08:29,730
And again, this is normalized; notice here it is normalized.

100
00:08:29,730 --> 00:08:37,020
So this here is divided by the width, this is divided by the height, this is divided by the width and

101
00:08:37,020 --> 00:08:38,670
then this is divided by the height.

102
00:08:38,670 --> 00:08:44,580
Remember, this is the width and the height of the specific bounding box, which happens to be exactly

103
00:08:44,580 --> 00:08:46,910
what we saw when we were doing this here.

104
00:08:46,920 --> 00:08:52,440
Remember, we took the x min and x max, we obtained the x center and then we divided it by the width; from

105
00:08:52,440 --> 00:08:59,490
the y min and y max, we obtained the y center, divided by the height; then the width, divided by the total width,

106
00:08:59,490 --> 00:09:00,210
and the height,

107
00:09:00,210 --> 00:09:01,290
divided by the total height.

108
00:09:01,290 --> 00:09:07,740
So it's the YOLO format we're actually using right here.
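As a rough sketch of the conversion we've just described, assuming the box comes in as pixel corner coordinates: the function name `to_yolo_format` and the example numbers are hypothetical, they are not taken from the course notebook.

```python
def to_yolo_format(x_min, y_min, x_max, y_max, img_width, img_height):
    """Convert a corner-format box in pixels to normalized
    (x_center, y_center, width, height)."""
    x_center = (x_min + x_max) / 2 / img_width    # center divided by total width
    y_center = (y_min + y_max) / 2 / img_height   # center divided by total height
    width = (x_max - x_min) / img_width           # box width divided by total width
    height = (y_max - y_min) / img_height         # box height divided by total height
    return x_center, y_center, width, height


# Hypothetical box inside a 400 x 200 image.
box = to_yolo_format(100, 50, 300, 150, img_width=400, img_height=200)
print(box)  # (0.5, 0.5, 0.5, 0.5)
```

All four values end up between 0 and 1, which is what the "yolo" format expects.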

109
00:09:08,580 --> 00:09:10,200
Now, again, getting back to the code.

110
00:09:10,200 --> 00:09:12,960
You see we have the format "yolo" specified.

111
00:09:12,960 --> 00:09:18,300
Now here we have this min_area set to 25 and this min_visibility set to 0.1.
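Putting the pieces together, here is a minimal sketch of such a pipeline. The `build_transforms` name, the fixed-size `RandomCrop` (standing in for the ranged crop) and the `scale_limit` value are assumptions for illustration; the 224 size, the 0.5 probabilities and the "yolo", 25 and 0.1 settings are the ones mentioned here.

```python
IMG_SIZE = 224
CROP_SIZE = int(0.9 * IMG_SIZE)  # 0.9 times the side length


def build_transforms():
    # Imported lazily, so this file can be loaded without Albumentations installed.
    import albumentations as A

    return A.Compose(
        [
            A.Resize(height=IMG_SIZE, width=IMG_SIZE),
            # Assumption: a fixed-size crop stands in for the ranged crop.
            A.RandomCrop(height=CROP_SIZE, width=CROP_SIZE, p=0.5),
            # Assumption: scale_limit=0.1 is only an illustrative value.
            A.RandomScale(scale_limit=0.1, p=0.5),
            A.HorizontalFlip(p=0.5),
            # Bring the image back to 224 by 224 after cropping and scaling.
            A.Resize(height=IMG_SIZE, width=IMG_SIZE),
        ],
        bbox_params=A.BboxParams(format="yolo", min_area=25, min_visibility=0.1),
    )
```

Calling the result as `transforms(image=image, bboxes=bboxes)` should give back a dictionary with the transformed "image" and "bboxes" entries.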

112
00:09:18,600 --> 00:09:23,760
Now, to understand the concept of min_area, consider you have this input right here.

113
00:09:23,760 --> 00:09:32,130
And then after carrying out the transforms, what you have is, say, this output here.

114
00:09:32,130 --> 00:09:39,780
So we have this output where the area of this box is 4344 pixels.

115
00:09:39,780 --> 00:09:41,280
This is actually the COCO format.

116
00:09:41,280 --> 00:09:45,840
So you find that the bounding boxes will be different from the kind of bounding boxes we have.

117
00:09:45,870 --> 00:09:54,840
Nonetheless, the area after the transformation is 4344 pixels, so clearly it's quite small compared

118
00:09:54,840 --> 00:09:59,370
to the 23,892 pixels we had already.

119
00:09:59,610 --> 00:10:04,890
from this 132 times 181 computation we had here.

120
00:10:04,890 --> 00:10:09,330
So after transforming this, we obtain this right here.

121
00:10:09,330 --> 00:10:18,540
But if we set min_area to 4500, it means that any box with an area of less than 4500 pixels is going to be omitted.

122
00:10:18,540 --> 00:10:26,020
And so that's why you see that when we specify min_area to be 4500, this box here disappears.

123
00:10:26,070 --> 00:10:28,830
So if you don't set anything, the box remains.

124
00:10:28,830 --> 00:10:31,380
But if you set this, then the box is going to disappear.

125
00:10:31,380 --> 00:10:34,590
And then we also have min_visibility.

126
00:10:34,590 --> 00:10:37,020
So let's take this here.

127
00:10:37,590 --> 00:10:47,100
If we set min_visibility to, say, 0.3, then if the area of the output box, which is this box, divided by the

128
00:10:47,100 --> 00:10:55,770
area of the initial box, which is this, gives us a ratio less than 0.3, it means that box is going

129
00:10:55,770 --> 00:10:56,670
to be omitted.

130
00:10:56,670 --> 00:11:07,590
And so right here, if you take this area, which is 6888, divided by 24,108, from this original

131
00:11:07,590 --> 00:11:14,730
box here, 24,108, you would have 0.286, which is less than 0.3.

132
00:11:15,000 --> 00:11:19,710
And so you see, when it is set to 0.3, this box up here disappears.

133
00:11:19,710 --> 00:11:21,270
So that's it.

134
00:11:21,270 --> 00:11:26,790
That's the idea behind min_area and min_visibility.
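To check the arithmetic with the numbers we've just seen, here is a simplified sketch of the two filters. The `keep_box` helper is hypothetical; it only mirrors the idea described here, and the library's own handling of exact boundary values may differ.

```python
def keep_box(area_before, area_after, min_area=0.0, min_visibility=0.0):
    """Return True if a transformed box survives both thresholds."""
    if area_after < min_area:
        return False                        # too small in absolute pixels
    visibility = area_after / area_before   # fraction of the original box left
    return visibility >= min_visibility


print(132 * 181)                                  # 23892
print(keep_box(23892, 4344, min_area=4500))       # False
print(round(6888 / 24108, 3))                     # 0.286
print(keep_box(24108, 6888, min_visibility=0.3))  # False
```

With the values 25 and 0.1 instead, both of these boxes would be kept.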

135
00:11:27,240 --> 00:11:29,160
Now we can actually leave those out.

136
00:11:29,160 --> 00:11:33,060
So let's just take this off and that should be fine.

137
00:11:33,060 --> 00:11:34,290
So this is it.

138
00:11:34,290 --> 00:11:36,990
We have our transform so we could run this.

139
00:11:38,310 --> 00:11:39,240
There we go.

140
00:11:39,240 --> 00:11:45,540
We have our augmentation method, which takes in the image, takes in the bounding boxes, and then

141
00:11:45,540 --> 00:11:52,050
all it does is it passes the image and the bounding boxes into our transforms here.

142
00:11:52,050 --> 00:11:59,610
And then we obtain the transformed image and the transformed bounding boxes.

143
00:11:59,910 --> 00:12:08,580
So as we are seeing here, we go from this image bounding box pair to this transformed image transformed

144
00:12:08,580 --> 00:12:09,930
bounding box pair.

145
00:12:10,320 --> 00:12:12,210
So let's get back to the code.

146
00:12:12,210 --> 00:12:21,360
We could run this, and then we have our process data method, which makes use of this TensorFlow

147
00:12:21,360 --> 00:12:29,280
numpy_function, because here these are non-TensorFlow operations which we are calling. Especially

148
00:12:29,280 --> 00:12:35,400
here, when you make use of Albumentations, it's made up of computations in NumPy.

149
00:12:35,400 --> 00:12:41,250
So we make use of this method right here in order to integrate that in our data pipeline.

150
00:12:41,250 --> 00:12:48,340
So here we specify the function, the inputs (image and bounding boxes), and the output tensor types;

151
00:12:49,140 --> 00:12:54,000
here this one and this one are floats, actually float32.
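A minimal sketch of this wrapping step could look as follows. The names `make_process_data` and `augment_fn` are hypothetical; `augment_fn` is assumed to take NumPy arrays and return the augmented image and boxes as float32 arrays.

```python
def make_process_data(augment_fn):
    """Wrap a NumPy-based augmentation so it can be used in dataset.map()."""
    import tensorflow as tf  # imported lazily; requires TensorFlow

    def process_data(image, bboxes):
        # tf.numpy_function runs plain Python/NumPy code inside the tf.data pipeline.
        aug_image, aug_bboxes = tf.numpy_function(
            func=augment_fn,
            inp=[image, bboxes],
            Tout=(tf.float32, tf.float32),  # one output dtype per returned array
        )
        return aug_image, aug_bboxes

    return process_data
```

One thing to keep in mind is that tensors coming out of `tf.numpy_function` lose their static shape, so you may need to set the shapes again before batching.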

152
00:12:54,000 --> 00:12:54,750
So that's it.

153
00:12:54,750 --> 00:13:01,050
So we run this and we create our train dataset so we could visualize that.

154
00:13:01,050 --> 00:13:06,510
You see, for example, here we have this image, let's write that image out so we could see it.

155
00:13:06,510 --> 00:13:11,180
We have output one and output two.

156
00:13:11,190 --> 00:13:12,270
Let's check that out.

157
00:13:12,300 --> 00:13:15,780
See, here we have this, and no modification was made.

158
00:13:15,780 --> 00:13:25,140
So in order to be sure that we make some change, what we could do is we could make sure, for example,

159
00:13:25,140 --> 00:13:32,730
let's comment out this random scaling, and then let's make sure the flipping is always

160
00:13:32,730 --> 00:13:33,000
done.

161
00:13:33,000 --> 00:13:34,620
So we have always_apply

162
00:13:37,020 --> 00:13:45,300
set to True.

163
00:13:47,100 --> 00:13:47,820
There we go.

164
00:13:47,820 --> 00:13:55,560
So we run that, run that again and we have output one and then output two, which has been flipped.

165
00:13:55,590 --> 00:14:00,360
Now one thing you would notice is also the fact that its bounding boxes are changed.

166
00:14:00,360 --> 00:14:07,710
So let's copy this from here and then let's paste it just here so we could see that.

167
00:14:08,190 --> 00:14:14,280
Okay, so this is what we have before and this is what we have after.

168
00:14:14,820 --> 00:14:21,780
Now you see that this actually makes sense, because when you have this original image, when you have

169
00:14:21,780 --> 00:14:27,810
this original image where you have something like this for the bounding box, oops.

170
00:14:27,810 --> 00:14:30,540
Well, take all of that off. We have something like this.

171
00:14:30,900 --> 00:14:38,160
And then when you flip it, you see when you flip it, what actually changes here will only be this

172
00:14:38,160 --> 00:14:39,330
X center.

173
00:14:39,330 --> 00:14:49,170
So if your center was around this, let's say it centers around this, then after flipping, the x center changes

174
00:14:49,170 --> 00:14:50,490
position slightly.

175
00:14:50,490 --> 00:14:52,410
And that's what we would notice here.

176
00:14:52,410 --> 00:14:55,530
You notice how this x center changes its position.

177
00:14:55,530 --> 00:14:56,610
Just a little bit.

178
00:14:56,960 --> 00:14:59,340
But for the Y center, it doesn't really change.

179
00:14:59,650 --> 00:15:02,020
The width and the height remain the same.

180
00:15:02,050 --> 00:15:05,950
Now, let's play around with this here.

181
00:15:05,950 --> 00:15:08,530
So let's have this random crop now.

182
00:15:08,890 --> 00:15:10,150
Let's do this.

183
00:15:11,800 --> 00:15:13,870
Let's set this to always apply.

184
00:15:14,170 --> 00:15:15,680
Set always_apply to True.

185
00:15:15,700 --> 00:15:18,030
So we'll have that.

186
00:15:18,040 --> 00:15:23,110
Oops, we will have always_apply.

187
00:15:23,120 --> 00:15:25,030
We set it to True.

188
00:15:25,150 --> 00:15:29,800
Run this and check out our output.

189
00:15:29,800 --> 00:15:31,840
So yeah, we'd have.

190
00:15:31,840 --> 00:15:33,010
Okay, we should have.

191
00:15:33,010 --> 00:15:35,740
Let's run this here to have out two.

192
00:15:35,740 --> 00:15:37,960
So we have out one.

193
00:15:37,960 --> 00:15:40,060
And then we have out two.

194
00:15:42,150 --> 00:15:42,780
Okay.

195
00:15:43,620 --> 00:15:46,020
Now what we have is this.

196
00:15:46,020 --> 00:15:50,250
And then this, you see, appears somehow zoomed in.

197
00:15:50,250 --> 00:15:52,290
Anyways, this example.

198
00:15:53,100 --> 00:15:56,740
Now it shows us that we have... let's take this off.

199
00:15:56,760 --> 00:16:03,870
We have this image which has this bounding box here, something like this and this.

200
00:16:03,870 --> 00:16:06,450
And maybe the center around here.

201
00:16:06,450 --> 00:16:14,220
And then now it's modified and you see the height gets modified.

202
00:16:14,220 --> 00:16:18,510
And although not too much, actually, the width doesn't change.

203
00:16:18,510 --> 00:16:20,160
Well, it changes just a little bit.

204
00:16:20,190 --> 00:16:23,790
Now, let's change this example so that we could see this clearly.

205
00:16:24,420 --> 00:16:29,730
This example isn't very demonstrative of this transformation process.

206
00:16:29,730 --> 00:16:35,640
So we'll take maybe the second one, or instead we'll do a skip.

207
00:16:35,970 --> 00:16:39,900
So here we have Skip here.

208
00:16:39,900 --> 00:16:40,270
Yeah.

209
00:16:40,290 --> 00:16:41,970
Then we break.

210
00:16:41,970 --> 00:16:47,490
Hopefully the second has maybe many more objects or is a better example.

211
00:16:47,490 --> 00:16:49,860
Actually, let's check this out.

212
00:16:51,030 --> 00:16:51,420
Okay.

213
00:16:51,420 --> 00:16:55,410
So after flipping, we expect to have something which is looking different from this.

214
00:16:56,410 --> 00:16:57,300
Okay, so that's it.

215
00:16:57,300 --> 00:17:00,990
Let's take this off and work with this example.

216
00:17:00,990 --> 00:17:05,070
Now let's run this again.

217
00:17:06,060 --> 00:17:08,940
We have this here, we run this.

218
00:17:09,690 --> 00:17:11,220
There we go.

219
00:17:12,630 --> 00:17:14,100
We have that.

220
00:17:14,100 --> 00:17:22,080
We have our output, which we obtain from skipping, and then we get back here and then we also skip.

221
00:17:22,200 --> 00:17:34,500
So we skip, and there we go, and then from here we break, we run that and then see what we get.

222
00:17:36,180 --> 00:17:38,760
Now, you could see from here we have our output one.

223
00:17:38,760 --> 00:17:40,590
Let's also check out our output two.

224
00:17:42,880 --> 00:17:43,570
There we go.

225
00:17:43,570 --> 00:17:47,410
We have out one and then out two, exactly as we expect.

226
00:17:47,410 --> 00:17:54,940
So now this example is quite different from what we had before and should be a better way to

227
00:17:54,940 --> 00:17:58,420
demonstrate what goes on in Albumentations.

228
00:17:58,420 --> 00:18:03,280
So right here we have these input boxes.

229
00:18:03,280 --> 00:18:04,420
Let's copy that.

230
00:18:04,990 --> 00:18:11,860
We have the input boxes, we're going to paste it just here, our input boxes.

231
00:18:12,400 --> 00:18:13,750
Let's take that off.

232
00:18:16,980 --> 00:18:17,700
There we go.

233
00:18:17,700 --> 00:18:19,650
We have these output boxes.

234
00:18:21,940 --> 00:18:22,900
Copy that.

235
00:18:23,080 --> 00:18:24,730
And then paste right here.

236
00:18:24,760 --> 00:18:28,780
Now, before we move on, notice the fact that the classes are the same.

237
00:18:28,780 --> 00:18:31,480
So class 18 and, here, class 18.

238
00:18:31,480 --> 00:18:33,880
And if you get right to the top, you.

239
00:18:33,880 --> 00:18:37,750
Well, we use classes, so let's say classes.

240
00:18:38,530 --> 00:18:40,570
Let's get 18 and see what we get.

241
00:18:40,570 --> 00:18:43,600
It should be "train" from here.

242
00:18:43,600 --> 00:18:47,920
We have well, yeah, we have classes.

243
00:18:48,730 --> 00:18:50,890
Classes 18.

244
00:18:51,760 --> 00:18:52,540
There we go.

245
00:18:52,540 --> 00:18:53,620
We see we have "train".

246
00:18:53,620 --> 00:18:55,780
Okay, so that is it.

247
00:18:55,780 --> 00:18:57,580
Let's take this off now.

248
00:18:57,580 --> 00:18:59,830
Let's check this out.

249
00:18:59,830 --> 00:19:00,770
We have our out one.

250
00:19:00,790 --> 00:19:03,430
You could see from here.

251
00:19:03,430 --> 00:19:08,740
It makes sense that the center is about 27% of the full width.

252
00:19:08,740 --> 00:19:12,850
So this distance is about 27% of all this distance.

253
00:19:12,850 --> 00:19:14,260
So that's it, then?

254
00:19:14,260 --> 00:19:18,580
This distance, too, it's about 36% of all this distance.

255
00:19:19,180 --> 00:19:22,570
And then we have the width.

256
00:19:22,570 --> 00:19:28,810
The width, which is about 0.54, so 54% of the total image width.

257
00:19:28,810 --> 00:19:29,770
Let's change the color.

258
00:19:29,800 --> 00:19:34,420
That's about 54% of the total width.

259
00:19:34,420 --> 00:19:39,640
And then this is about 71% of the total height.

260
00:19:40,510 --> 00:19:44,980
Now, after flipping, you see that this has to change.

261
00:19:44,980 --> 00:19:49,390
Let's drag this now, drag this, this way.

262
00:19:49,600 --> 00:19:52,000
And we have something like this.

263
00:19:52,930 --> 00:19:54,370
Let's take this off.

264
00:19:55,630 --> 00:20:03,040
You'll see that now the center, or this distance here, this distance from here to here, is about

265
00:20:03,040 --> 00:20:07,630
73% of the full image width.

266
00:20:07,640 --> 00:20:13,450
See that the x center changes, and then for the height, it doesn't really change much, or changes

267
00:20:13,450 --> 00:20:14,190
very little.

268
00:20:14,200 --> 00:20:20,410
See, that's almost the same; the width remains practically the same and then the height also remains almost

269
00:20:20,410 --> 00:20:21,070
the same.

270
00:20:21,460 --> 00:20:29,650
Well, the idea here is to show that after going through these Albumentations transforms, Albumentations

271
00:20:29,650 --> 00:20:37,030
permits us to obtain these output bounding boxes which match up with the transformed image.
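To see why only the x center moves under a horizontal flip, here is a small sketch on a box in normalized YOLO format. The `hflip_yolo_box` helper is hypothetical, and the box values are the rounded percentages we read off the image above.

```python
def hflip_yolo_box(box):
    """Mirror a normalized (x_center, y_center, width, height) box left to right."""
    x_center, y_center, width, height = box
    # Only the horizontal position is mirrored; the box size and y center are untouched.
    return (1.0 - x_center, y_center, width, height)


before = (0.27, 0.36, 0.54, 0.71)
after = hflip_yolo_box(before)
print(after)
```

The x center goes from about 0.27 to about 0.73, while the other three values stay the same. When a crop is also applied, the remaining values shift slightly as well, which is the small change we noticed.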

272
00:20:37,390 --> 00:20:45,420
Now, don't forget to make sure you change this back from always_apply to probabilities of 0.5.

273
00:20:45,430 --> 00:20:46,420
So that's it.

274
00:20:46,420 --> 00:20:53,890
The next set of transformations we'll make will be with TensorFlow, and we make use of tf.image.

275
00:20:53,890 --> 00:20:58,300
So here, we have this random brightness, random contrast.

276
00:20:59,290 --> 00:21:02,350
We're going to leave out the random crop for obvious reasons.

277
00:21:02,350 --> 00:21:07,990
Remember, if you have to do this random crop, it means you would have to write the code which permits

278
00:21:07,990 --> 00:21:14,170
you to also modify the bounding boxes, because when you carry out a random crop, the bounding boxes actually

279
00:21:14,170 --> 00:21:14,620
change.

280
00:21:14,620 --> 00:21:18,400
So that's why we're making use of Albumentations,

281
00:21:18,400 --> 00:21:21,760
since it makes life much easier.

282
00:21:21,760 --> 00:21:28,780
And then we use this, we use this, we carry out no, we're not carrying this out.

283
00:21:28,780 --> 00:21:30,010
We carry out.

284
00:21:30,010 --> 00:21:31,000
No, not this.

285
00:21:31,000 --> 00:21:34,960
We carry out random hue, random saturation.

286
00:21:34,960 --> 00:21:39,400
Okay, so these we're going to make use of. You could see that here in the code.

287
00:21:39,400 --> 00:21:46,420
We have brightness, saturation, contrast, hue, and then finally we carry out this clipping by value

288
00:21:46,420 --> 00:21:49,390
to make sure all the values lie between zero and 255.
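As a rough sketch, this pixel-level step could look like the function below. The function name and all the ranges passed to the `tf.image` calls are assumptions for illustration, and the image is assumed to be a float tensor with values between 0 and 255.

```python
PIXEL_MIN, PIXEL_MAX = 0.0, 255.0


def photometric_augment(image, bboxes):
    """Change pixel values only; the bounding boxes pass through unchanged."""
    import tensorflow as tf  # imported lazily; requires TensorFlow

    # Assumption: these ranges are illustrative, not the course's values.
    image = tf.image.random_brightness(image, max_delta=50.0)
    image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
    image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
    image = tf.image.random_hue(image, max_delta=0.1)
    # Keep every pixel value inside the valid range after the adjustments.
    image = tf.clip_by_value(image, PIXEL_MIN, PIXEL_MAX)
    return image, bboxes
```

Since none of these operations move any object, the boxes are returned exactly as they came in.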

289
00:21:49,540 --> 00:21:52,630
Now, you could always feel free to comment or uncomment

290
00:21:53,080 --> 00:21:55,540
any one of these right here.

291
00:21:55,540 --> 00:22:00,100
So that said, we're again going to carry out this preprocessing.

292
00:22:00,220 --> 00:22:02,290
See, we have that.

293
00:22:02,290 --> 00:22:07,990
And then remember, this is for the training and then this is for the validation.

294
00:22:07,990 --> 00:22:09,310
So that's it.

295
00:22:09,310 --> 00:22:17,290
We carry out this mapping, we batch, prefetch, and then you could check out your outputs right here.

296
00:22:17,290 --> 00:22:21,850
So let's say out one, out two, let's check out out three.

297
00:22:22,300 --> 00:22:23,080
Here we go.

298
00:22:23,080 --> 00:22:24,280
We have out three.

299
00:22:24,610 --> 00:22:27,220
Well, this should be let's go to Skip.

300
00:22:27,220 --> 00:22:34,600
Let's keep this skip two and break.

301
00:22:34,600 --> 00:22:35,200
Okay.

302
00:22:35,230 --> 00:22:37,450
So let's run that again and see what we get out.

303
00:22:37,450 --> 00:22:43,300
One out, two and out, three out, one out, two and out three.

304
00:22:44,440 --> 00:22:49,150
Well, since this was already batched, we will maintain the take.

305
00:22:49,150 --> 00:22:50,830
So we'll take the first.

306
00:22:50,830 --> 00:22:55,990
We'll maintain this, and then, yeah, we'll pick out one.

307
00:22:56,560 --> 00:22:59,800
Run that and there we go.

308
00:23:01,000 --> 00:23:03,700
We still have this so this should be two.

309
00:23:03,790 --> 00:23:04,810
Let's run that.

310
00:23:05,080 --> 00:23:12,730
We have out one, out two and then out three.

311
00:23:12,730 --> 00:23:18,280
Okay, so you see that this now appears much darker as compared to this one.

312
00:23:18,280 --> 00:23:19,030
So that's it.

313
00:23:19,030 --> 00:23:23,410
We have seen how to carry out this different transformations or augmentations.

314
00:23:23,410 --> 00:23:29,380
But before we go on with the training again, one slight modification we'll make is we'll replace this

315
00:23:29,380 --> 00:23:38,080
ResNet50 with the EfficientNetB1, then we'll go ahead and compile the model and restart the

316
00:23:38,080 --> 00:23:38,710
training.

317
00:23:39,070 --> 00:23:39,960
There we go.

318
00:23:39,970 --> 00:23:44,800
Training has begun, and after training for several epochs, here's what we obtain.

319
00:23:44,800 --> 00:23:51,610
You can see here that the training loss and the validation loss both keep dropping.

320
00:23:51,940 --> 00:24:02,500
You see that they both keep dropping up to where we have this 123, around here.

321
00:24:02,500 --> 00:24:11,980
So up to around this, our loss keeps dropping and then somehow increases slightly and stabilizes around

322
00:24:11,980 --> 00:24:13,420
128.

323
00:24:13,420 --> 00:24:23,050
So that's why at this point we had to stop the training and get the weights which produced the lowest

324
00:24:23,050 --> 00:24:24,550
validation loss.

325
00:24:24,880 --> 00:24:26,440
And that's it for this section.

326
00:24:26,440 --> 00:24:30,490
In the next section, we are going to test out this model.