1
00:00:00,150 --> 00:00:06,760
Hello, everyone, and welcome to this new session in which we'll treat mixed sample data augmentation.

2
00:00:06,780 --> 00:00:14,640
Previously, we saw how to do data augmentation on a single image like this or this one.

3
00:00:14,850 --> 00:00:23,670
And now we'll learn how to create new samples based on a combination or mixture of different images

4
00:00:23,670 --> 00:00:26,370
or different samples from our dataset.

5
00:00:26,370 --> 00:00:33,330
And more specifically, we'll treat the mixup data augmentation strategy, where we'll pick two samples

6
00:00:33,330 --> 00:00:35,130
from the data set.

7
00:00:35,280 --> 00:00:42,080
We're going to mix up the samples, and then we're going to include this strategy in our tf.data pipeline.

8
00:00:42,090 --> 00:00:49,080
Up to this point, we've implemented data augmentation strategies, which involve modifying input samples

9
00:00:49,080 --> 00:00:50,070
like this one.

10
00:00:50,070 --> 00:00:56,370
So we could take this input sample, we could do a zoom, we could do a center crop, we could rotate,

11
00:00:56,370 --> 00:01:03,510
we could translate and do many other things with this in order to augment our already existing data.

12
00:01:03,900 --> 00:01:10,410
Now, in this section, we'll look at another data augmentation strategy, which is known as mixup.

13
00:01:10,560 --> 00:01:20,910
Now, mixup doesn't involve just one sample, as we had seen previously. With mixup, we are going

14
00:01:20,910 --> 00:01:25,170
to make use of these two samples instead of just one.

15
00:01:25,290 --> 00:01:30,360
We mix them up and then produce this output sample.

16
00:01:30,360 --> 00:01:36,330
And if you look very carefully, you'll notice that this image contains this one.

17
00:01:36,330 --> 00:01:42,780
You can see it here, carved out: this dog, carved out like this, and

18
00:01:42,780 --> 00:01:45,870
then this other dog right here.

19
00:01:45,870 --> 00:01:48,930
So we also have this one, which is also carved out like this.

20
00:01:48,930 --> 00:01:54,330
So we could see the mixture of these two to form one input.

21
00:01:54,510 --> 00:02:02,280
So we have this output image which you could define as x prime, which is a mixture or a combination

22
00:02:02,280 --> 00:02:04,880
of these images, x one and x two.

23
00:02:04,890 --> 00:02:08,850
So yeah, let's have x one and x two.

24
00:02:09,360 --> 00:02:13,230
But this actually happens to be a weighted addition.

25
00:02:13,230 --> 00:02:21,150
That is, we have a certain factor, Lambda, which is a value between zero and one and drawn from the

26
00:02:21,150 --> 00:02:31,470
beta distribution such that we have x prime equals lambda times x one plus one minus lambda times x

27
00:02:31,470 --> 00:02:41,070
two. This means that if lambda equals 0.5, we'll have 0.5 x one plus 0.5 x two. If lambda equals 0.3,

28
00:02:41,070 --> 00:02:47,340
for example, we have 0.3 x one plus 0.7 x two.

29
00:02:47,340 --> 00:02:49,800
So some sort of weighted addition.

30
00:02:49,800 --> 00:02:51,720
Now we've created this new input.

31
00:02:51,720 --> 00:02:58,560
It's logical that we need to modify the labels, because this doesn't belong to one class or the other.

32
00:02:58,560 --> 00:03:02,460
You cannot really say that this belongs to this class or this class.

33
00:03:02,460 --> 00:03:05,070
It actually is a mixture of both classes.

34
00:03:05,070 --> 00:03:11,970
Previously, when we did data augmentation on a single image, we maintained the label

35
00:03:11,970 --> 00:03:16,170
because it still remains this particular class.

36
00:03:16,170 --> 00:03:21,750
But when we mix images from two different classes together like this, we

37
00:03:21,750 --> 00:03:23,280
need to modify the label.

38
00:03:23,280 --> 00:03:33,570
And so, that said, we have a new label y prime equal to lambda times, you guessed right, y one plus one

39
00:03:33,570 --> 00:03:36,420
minus lambda y two.
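To make these two formulas concrete, here is a small sketch in plain Python. It is only an illustration and not the notebook code from this session: the function name mixup_pair and the tiny flattened "images" are made up for the example, and lambda is fixed at 0.3 instead of being drawn from the beta distribution.

```python
def mixup_pair(x1, x2, y1, y2, lam):
    """Blend two samples and their labels with a weight lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    if len(x1) != len(x2):
        raise ValueError("both samples must have the same number of values")
    # x' = lam * x1 + (1 - lam) * x2, applied value by value
    x_mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    # y' = lam * y1 + (1 - lam) * y2
    y_mixed = lam * y1 + (1.0 - lam) * y2
    return x_mixed, y_mixed


# Two tiny hypothetical "images", flattened to lists of pixel values
x1 = [0.0, 1.0, 1.0, 0.0]
x2 = [1.0, 1.0, 0.0, 0.0]

x_mixed, y_mixed = mixup_pair(x1, x2, y1=0, y2=1, lam=0.3)
print(x_mixed)
print(y_mixed)
```

With lambda at 0.3, the mixed label sits closer to the label of the second sample, because the second sample carries the larger weight of 0.7.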

40
00:03:37,080 --> 00:03:44,430
And from here we can now dive into the code and implement this mixup data augmentation strategy.

41
00:03:44,700 --> 00:03:52,950
Recall we said that Lambda was to be drawn from a beta distribution, but if you come to the documentation

42
00:03:52,950 --> 00:04:01,260
we've been using so far, where we have the tensorflow.org API docs, you won't really find this

43
00:04:01,260 --> 00:04:02,850
beta distribution.

44
00:04:02,850 --> 00:04:07,020
So you are advised to look in this one here.

45
00:04:07,020 --> 00:04:11,130
So go to tensorflow.org/probability.

46
00:04:11,130 --> 00:04:14,340
Instead, it's here that you'll find this distribution.

47
00:04:14,340 --> 00:04:24,750
So here we have TensorFlow Probability, and then you have tfp.distributions.

48
00:04:24,750 --> 00:04:31,080
When you click here, you could find many probability distributions, including this beta distribution,

49
00:04:31,080 --> 00:04:31,980
which we want to work with.

50
00:04:31,980 --> 00:04:35,490
And now, right here, you can see the beta distribution.

51
00:04:35,490 --> 00:04:41,640
You have the definition, and then notice how we have these two parameters which we must pass.

52
00:04:41,640 --> 00:04:47,040
So the beta distribution is defined over the interval zero to one, using the parameters concentration

53
00:04:47,040 --> 00:04:50,730
one, a.k.a. alpha, and concentration zero, a.k.a. beta.

54
00:04:50,730 --> 00:04:54,300
So we have the parameters, alpha and beta to pass right here.

55
00:04:55,410 --> 00:04:59,220
Now, if you look in the mixup paper, you will see that the

56
00:04:59,640 --> 00:05:02,580
value of alpha they use, like here for

57
00:05:02,580 --> 00:05:08,280
these experiments, is 0.2, and then sometimes they use 0.4.

58
00:05:08,280 --> 00:05:11,040
But most times the tests are done with 0.2.

59
00:05:11,040 --> 00:05:16,650
So it's this value we'll be using. Now let's get back here.

60
00:05:17,490 --> 00:05:19,740
All we need to do now is copy this out.

61
00:05:19,740 --> 00:05:27,750
So we copy this, get back to our code, and then we have the mixup section down here.

62
00:05:27,750 --> 00:05:32,340
So here we have this mixup, mixup data augmentation.

63
00:05:32,340 --> 00:05:40,320
Let's collapse this, and then we collapse this part, so we won't run these cells because we're not going

64
00:05:40,320 --> 00:05:41,850
to make use of this now.

65
00:05:41,850 --> 00:05:43,890
So here, we have this mixup.

66
00:05:43,890 --> 00:05:49,950
Let's test this out. We run it and we get a name error:

67
00:05:49,950 --> 00:05:51,360
a name used in the copied code is not defined.

68
00:05:51,560 --> 00:05:57,420
Anyway, we could get back here and then import TensorFlow probability.

69
00:05:57,420 --> 00:06:05,190
So let's import TensorFlow Probability and alias it as tfp.

70
00:06:05,580 --> 00:06:08,250
We run that and then get back to our mixup.

71
00:06:08,640 --> 00:06:11,040
Okay, so here we have this mixup.

72
00:06:11,040 --> 00:06:14,430
Let's take this off and let's take all this off.

73
00:06:14,430 --> 00:06:17,880
Actually, 0.2 and 0.2.

74
00:06:17,880 --> 00:06:18,390
Okay.

75
00:06:18,480 --> 00:06:20,160
So we got that from the paper.

76
00:06:20,160 --> 00:06:21,120
That's understood.

77
00:06:21,120 --> 00:06:22,920
And then we have lambda.

78
00:06:23,160 --> 00:06:25,140
So here we have lambda.

79
00:06:25,140 --> 00:06:28,080
If we spell it this way, we get the Python keyword lambda.

80
00:06:28,080 --> 00:06:32,880
So let's keep it simple and just spell lambda wrongly on purpose.

81
00:06:32,880 --> 00:06:38,910
And now we'll print out lambda, using lambda dot sample.

82
00:06:38,910 --> 00:06:46,200
So we pick out one sample from that beta distribution, and we run that, and here is what we get.

83
00:06:46,200 --> 00:06:52,230
So you see, if you run this again, you obviously get different outputs, all of them drawn from the

84
00:06:52,230 --> 00:06:55,770
beta distribution and in the range zero to one.
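As a rough illustration of this sampling step, here is a sketch that uses Python's built-in random.betavariate as a stand-in for the TensorFlow Probability beta distribution we use in the notebook. The helper name sample_lambda and the seed are assumptions made for the example. Both parameters are set to 0.2, the value we took from the paper.

```python
import random

ALPHA = 0.2  # value quoted from the mixup paper in this session


def sample_lambda(alpha=ALPHA, rng=random):
    """Draw one mixing factor from a Beta(alpha, alpha) distribution."""
    return rng.betavariate(alpha, alpha)


# Seeded only so that this sketch is repeatable
rng = random.Random(0)
samples = [sample_lambda(rng=rng) for _ in range(5)]
print(samples)
```

Every value printed lies between zero and one, which is what lets us use it directly as the weight lambda.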

85
00:06:55,770 --> 00:07:01,830
So here I'm just going to do this: pick out the item at index zero, and then we have this output.

86
00:07:01,980 --> 00:07:06,450
We could also add dot numpy to get the value.

87
00:07:07,350 --> 00:07:14,100
You see, you now have this, so you no longer have a tensor. But anyway, we prefer to use it as a tensor,

88
00:07:14,100 --> 00:07:20,610
and we'll explain why it's preferable to work with tensors in the function we're trying to build.

89
00:07:21,650 --> 00:07:24,080
From here, we just simply apply the formula.

90
00:07:24,080 --> 00:07:40,370
So we'll have the output image, which is equal to lambda times the image one plus one minus lambda times

91
00:07:40,370 --> 00:07:42,110
the image two.

92
00:07:42,470 --> 00:07:46,190
So that's it for the image. We repeat the same process for the label.

93
00:07:47,030 --> 00:07:53,960
Then, using OpenCV, we'll test this on these two images which we've added here. We have this dog and

94
00:07:53,960 --> 00:07:55,880
this cat image right here.

95
00:07:55,880 --> 00:07:56,960
So we have these two.

96
00:07:56,960 --> 00:07:58,280
We're going to test this on it.

97
00:07:58,820 --> 00:08:01,910
Let's take this off right here.

98
00:08:01,910 --> 00:08:03,710
Here, we're going to read the images.

99
00:08:03,710 --> 00:08:05,870
So we have image one.

100
00:08:06,620 --> 00:08:12,410
We're going to do the reading with cv2.imread, and then we'll do the same for image two.

101
00:08:12,410 --> 00:08:12,770
So.

102
00:08:12,770 --> 00:08:13,010
Right.

103
00:08:13,010 --> 00:08:13,280
Yeah.

104
00:08:13,280 --> 00:08:20,180
We could print out the image shape and the label shape.

105
00:08:20,180 --> 00:08:23,660
Or rather, let's just print out the label, so we have that.

106
00:08:24,560 --> 00:08:26,060
We get this error.

107
00:08:26,690 --> 00:08:29,780
Let's get back up and correct this.

108
00:08:29,780 --> 00:08:34,250
So this is actually lambda. Let's modify it: lambda equals this.

109
00:08:34,250 --> 00:08:35,750
So let's have that out.

110
00:08:36,200 --> 00:08:37,460
So we have that right.

111
00:08:37,460 --> 00:08:46,460
We run this again and get "required broadcastable shapes", which happens because we haven't yet resized

112
00:08:46,460 --> 00:08:48,200
the images.

113
00:08:48,200 --> 00:08:55,520
If we print out the shape of image one and image two before doing this operation, you see there are

114
00:08:55,520 --> 00:08:56,750
two different shapes.

115
00:08:56,750 --> 00:08:58,960
So we have to ensure that they are the same.

116
00:08:58,970 --> 00:09:00,130
You see, the two are different.

117
00:09:00,140 --> 00:09:08,120
Now let's resize this with cv2.resize, and then we specify the shape.

118
00:09:08,120 --> 00:09:13,730
So let's have here im size, im size.

119
00:09:14,180 --> 00:09:15,830
Okay, we have that done.

120
00:09:15,860 --> 00:09:17,330
We just copy this out.

121
00:09:17,840 --> 00:09:24,290
Paste it here: we have im size, and then cv2.resize.

122
00:09:24,620 --> 00:09:28,700
Okay, so we've read, we've resized and that should be fine.

123
00:09:28,700 --> 00:09:32,690
Now we run that again: im size is not defined.

124
00:09:32,690 --> 00:09:34,880
Let's define it here.

125
00:09:36,110 --> 00:09:40,850
We actually defined this previously, but we haven't run those previous cells since we started the notebook.

126
00:09:40,850 --> 00:09:45,080
So that's why it isn't recognizing the size.

127
00:09:45,560 --> 00:09:50,630
Okay, now label one is not defined. This looks great already. For these labels,

128
00:09:50,630 --> 00:09:53,750
let's say we have label one.

129
00:09:54,080 --> 00:10:02,120
So here, we have label one equals zero and then label two equals one.

130
00:10:02,120 --> 00:10:04,490
So we have just these two labels.

131
00:10:05,240 --> 00:10:09,380
Okay, we run that again and this is what we get.

132
00:10:09,380 --> 00:10:17,120
You see, we have this output image, and then we have this final label, which happens to be near zero.

133
00:10:17,140 --> 00:10:19,280
Now, from here,

134
00:10:19,280 --> 00:10:21,200
We could plot this out.

135
00:10:21,500 --> 00:10:27,050
plt.imshow, and then we pass in the image and normalize this.

136
00:10:27,050 --> 00:10:29,720
We run that and this is what we get.

137
00:10:29,720 --> 00:10:39,050
So you see, we have a mixup of these two images. Now that we have succeeded in doing this, let's make

138
00:10:39,050 --> 00:10:48,020
this part of our TensorFlow pipeline, so we could take all this out now and then define this mixup method.

139
00:10:48,020 --> 00:10:50,000
We have this label.

140
00:10:50,000 --> 00:10:51,310
Okay, we'll take this off.

141
00:10:51,320 --> 00:10:51,700
Okay.

142
00:10:51,710 --> 00:10:53,900
We're going to define this method.

143
00:10:53,900 --> 00:10:55,460
Let's call it mixup.

144
00:10:55,460 --> 00:10:59,570
And the way this method works is we're going to take in our data.

145
00:10:59,570 --> 00:11:07,550
So here we have train dataset one and then train dataset two.

146
00:11:07,760 --> 00:11:14,810
So we have these two datasets, which contain the same elements but which have been shuffled, so that we

147
00:11:14,810 --> 00:11:17,660
could have this kind of mixup.

148
00:11:18,200 --> 00:11:25,040
Then we can now make available these datasets, which we'll do the mixup on.

149
00:11:25,040 --> 00:11:32,360
So let's add a code cell, take this up, and then here we have our first train dataset.

150
00:11:32,360 --> 00:11:38,690
We have train data set one, which is actually our train dataset we have built already.

151
00:11:40,010 --> 00:11:42,680
And so we're getting our train data set right from here.

152
00:11:42,680 --> 00:11:50,600
We've run this already, we get back, be careful, we are not running this, but we may make use of

153
00:11:50,600 --> 00:11:52,190
one or two methods from this.

154
00:11:52,190 --> 00:11:55,460
So we have that and then we get back to this.

155
00:11:55,460 --> 00:11:57,080
So here we have the train dataset.

156
00:11:57,080 --> 00:12:06,050
We do some shuffling, we specify the buffer size, and then we also specify that we're going to reshuffle

157
00:12:06,050 --> 00:12:07,880
after each iteration.

158
00:12:08,060 --> 00:12:12,650
We just copy this and paste it here to have our train dataset two.

159
00:12:12,680 --> 00:12:20,730
Now, once we have this train dataset two, we now have our, let's call this mix dataset, or

160
00:12:20,750 --> 00:12:21,200
mixed.

161
00:12:21,410 --> 00:12:22,270
Dataset.

162
00:12:23,090 --> 00:12:28,010
We have our mixed data set and then we make use of the zip method.

163
00:12:28,010 --> 00:12:37,040
And with the zip we are going to pass in the train dataset one and then the train dataset two.

164
00:12:38,120 --> 00:12:44,570
Now, if you remember, we had an error when we passed in two images which had two different shapes.

165
00:12:44,570 --> 00:12:48,820
So we have to ensure that we do some preprocessing before doing the mixup.

166
00:12:48,830 --> 00:12:53,390
So that said, after doing the shuffling, we could do the preprocessing.

167
00:12:53,390 --> 00:13:02,640
So let's say preprocess. Let's get back up to where we defined this preprocessing, and we had here, okay,

168
00:13:02,660 --> 00:13:07,520
so we had this level of data augmentation, we had preprocessing, although we had inserted this in

169
00:13:07,520 --> 00:13:13,220
our augment, but it's practically this resize rescale method here.

170
00:13:13,220 --> 00:13:16,160
So we could run this and we'll be fine.

171
00:13:16,160 --> 00:13:20,450
And then we just do resize rescale, so we won't call it preprocessing.

172
00:13:20,450 --> 00:13:24,980
Again, we just have resize rescale as we've done already.

173
00:13:24,980 --> 00:13:26,750
We do the same mapping here.

174
00:13:26,750 --> 00:13:29,780
We have resize and rescale.

175
00:13:29,810 --> 00:13:37,310
Okay, so now we shuffle, we resize and rescale, and then we have our data, which is now a combination

176
00:13:37,310 --> 00:13:41,180
of dataset one and dataset two.

177
00:13:41,180 --> 00:13:44,620
This gif was gotten from Giphy.com.

178
00:13:44,630 --> 00:13:47,690
Now we have this mixed dataset formed.

179
00:13:47,690 --> 00:13:49,100
We run the cell.

180
00:13:49,100 --> 00:13:50,000
That's fine.

181
00:13:50,000 --> 00:13:52,100
And then in here, let's take this off.

182
00:13:52,100 --> 00:13:55,160
We've run this already, so we have that.

183
00:13:55,160 --> 00:13:58,400
And then yeah, we're going to take in image one.

184
00:13:58,400 --> 00:14:05,090
So image one and label one, label one.

185
00:14:05,090 --> 00:14:11,570
We have these tuples: we have image two and then label two.

186
00:14:11,870 --> 00:14:13,190
Let's close this up.

187
00:14:13,640 --> 00:14:24,170
Okay, so we have that, and then we get this from train dataset one and train

188
00:14:24,170 --> 00:14:26,330
dataset two.

189
00:14:26,420 --> 00:14:27,950
So that's how we get this.

190
00:14:27,950 --> 00:14:31,400
We have image one, label one, image two, label two.

191
00:14:31,400 --> 00:14:37,790
With what we had here, we do this again: we get lambda, we get the image,

192
00:14:37,790 --> 00:14:40,130
we have the label, and then we have our output.

193
00:14:40,130 --> 00:14:42,950
So we return image and label.

194
00:14:42,950 --> 00:14:44,930
So this is all about the mixup.
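To recap the idea of pairing two shuffled copies of the same dataset, here is a plain-Python sketch. It is not the tf.data code from the notebook: ordinary lists and the built-in zip stand in for the zipped datasets, random.betavariate stands in for the TensorFlow Probability beta distribution, and the names make_mixed_dataset and toy_dataset are made up for the example. It also assumes every sample already has the same size, which is what the resize step guarantees for us.

```python
import random


def mixup(sample_1, sample_2, alpha=0.2, rng=random):
    """Mix two (values, label) pairs with a Beta(alpha, alpha) weight."""
    (image_1, label_1), (image_2, label_2) = sample_1, sample_2
    lam = rng.betavariate(alpha, alpha)
    image = [lam * a + (1.0 - lam) * b for a, b in zip(image_1, image_2)]
    label = lam * label_1 + (1.0 - lam) * label_2
    return image, label


def make_mixed_dataset(dataset, alpha=0.2, seed=0):
    """Pair two independently shuffled copies of dataset and mix each pair."""
    rng = random.Random(seed)
    # Two copies with the same elements, each shuffled on its own
    dataset_1 = list(dataset)
    rng.shuffle(dataset_1)
    dataset_2 = list(dataset)
    rng.shuffle(dataset_2)
    # zip pairs element i of one copy with element i of the other
    return [mixup(s1, s2, alpha, rng) for s1, s2 in zip(dataset_1, dataset_2)]


# Hypothetical toy dataset: (flattened image values, integer label)
toy_dataset = [
    ([0.0, 0.0], 0),
    ([1.0, 1.0], 1),
    ([0.5, 0.5], 0),
    ([0.2, 0.8], 1),
]

mixed = make_mixed_dataset(toy_dataset)
for image, label in mixed:
    print(image, label)
```

The mixed dataset has as many elements as the original one, and each label is now a value between zero and one instead of a hard class index.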

195
00:14:44,930 --> 00:14:49,400
Now we run the cell and then we create this other new cell.

196
00:14:49,430 --> 00:14:57,950
Yeah, we have this error, should have this, and then we create this new cell right here.

197
00:14:59,240 --> 00:15:02,180
Then we paste in what we had done already.

198
00:15:02,180 --> 00:15:05,690
And then here we have our mixed data set.

199
00:15:05,690 --> 00:15:07,730
So we now have this mixed dataset.

200
00:15:07,730 --> 00:15:13,340
We shuffle again, we do the mapping with the augment layer, we do batching and prefetching.

201
00:15:14,210 --> 00:15:19,280
But since our augment is no longer the previous augment layer we had, here we now have the mixup.

202
00:15:19,280 --> 00:15:22,490
So here, we replace that with mixup, and that should be fine.

203
00:15:22,490 --> 00:15:26,210
We have the train, we could do the same for the validation.

204
00:15:26,210 --> 00:15:31,190
We get an error, a spelling error: train dataset.

205
00:15:31,190 --> 00:15:35,090
Let's get back to this train dataset.

206
00:15:35,120 --> 00:15:35,690
Okay.

207
00:15:35,690 --> 00:15:43,370
So here, we have the train dataset. We run that, run this again, and we have here: input y of Mul

208
00:15:43,370 --> 00:15:49,400
op has type int64 that does not match type float32 of argument x.

209
00:15:49,400 --> 00:15:53,180
So this is where we're multiplying lambda by the labels.

210
00:15:53,180 --> 00:16:00,920
So clearly we have the lambda which is a float and labels which are ints.

211
00:16:00,920 --> 00:16:03,170
So yeah, we're just going to cast this.

212
00:16:03,170 --> 00:16:11,540
So we have this casting, we specify the dtype float32, and that's fine.

213
00:16:11,540 --> 00:16:20,810
And then right here we do the same casting, specify the dtype again, and that's it.

214
00:16:20,810 --> 00:16:25,700
Let's run this again and see what we get. It works fine.

215
00:16:25,700 --> 00:16:31,040
Also note that here, we were trying to experiment and changed this.

216
00:16:31,040 --> 00:16:34,100
So let's get back to 0.2 and run that again.

217
00:16:34,760 --> 00:16:40,340
Okay, So now we have our training data and let's see here we have the batch and then the image and

218
00:16:40,340 --> 00:16:42,650
then the batch and then the label.

219
00:16:42,650 --> 00:16:43,520
So that's it.

220
00:16:43,520 --> 00:16:49,640
We've now created this data set, which happens to be a mixed dataset.

221
00:16:49,880 --> 00:16:54,710
Then from here we are going to prepare our validation dataset. It's actually the same as what we had already.

222
00:16:54,710 --> 00:16:58,580
So you could just simply run this previous cell right here.

223
00:16:58,580 --> 00:17:05,750
This cell is under data loading. Simply run this and you should be fine. Instead of using

224
00:17:05,750 --> 00:17:06,650
the augment layer,

225
00:17:06,650 --> 00:17:08,660
we're meant to just do resize rescale.

226
00:17:08,660 --> 00:17:11,900
We just have to resize and rescale our validation data.

227
00:17:11,900 --> 00:17:13,730
We don't really need to shuffle.

228
00:17:13,760 --> 00:17:14,930
We could take that off.

229
00:17:14,930 --> 00:17:17,030
We could take the prefetching off, and that's fine.

230
00:17:17,030 --> 00:17:20,840
So we run our validation and then check it out here.

231
00:17:20,930 --> 00:17:21,110
So you.

232
00:17:21,190 --> 00:17:22,370
You have the validation data.

233
00:17:22,390 --> 00:17:25,410
Now, the reason why we have this is because we ran this twice.

234
00:17:25,420 --> 00:17:29,410
So let's re initialize this right here.

235
00:17:29,950 --> 00:17:31,480
Let's get back to the splits.

236
00:17:31,750 --> 00:17:35,800
We create the train data and validation data.

237
00:17:35,920 --> 00:17:37,810
Okay, let's run this again here.

238
00:17:37,840 --> 00:17:39,460
Now, everything should be fine.

239
00:17:39,680 --> 00:17:40,540
Okay, so you have that.

240
00:17:40,540 --> 00:17:44,150
You have the batch dimension, and that's good.

241
00:17:44,170 --> 00:17:47,890
So we now get back to training and make sure everything is okay.

242
00:17:48,910 --> 00:17:52,710
We rerun this and everything is now fine.

243
00:17:53,170 --> 00:17:55,380
So we're now set to train our model.

244
00:17:55,390 --> 00:17:59,580
So we run our sequential API right here.

245
00:17:59,590 --> 00:18:05,800
We run this, we then compile our model, and then we get ready to train the model. We have these

246
00:18:05,800 --> 00:18:07,060
poor results,

247
00:18:07,060 --> 00:18:14,890
the reason being that the mixup data augmentation strategy isn't adapted to the dataset we're working

248
00:18:14,890 --> 00:18:15,120
with.

249
00:18:15,130 --> 00:18:21,160
Even if the mixup data augmentation strategy we've just applied wasn't very helpful for this particular

250
00:18:21,160 --> 00:18:21,910
problem,

251
00:18:21,910 --> 00:18:28,360
it's important to note that this mixup strategy could be used in many other problems. And with that, we

252
00:18:28,360 --> 00:18:29,730
come to the end of this section.

253
00:18:29,740 --> 00:18:33,280
Thank you for getting up to this point and see you next time.