1
00:00:00,150 --> 00:00:06,570
Hello, everyone, and welcome to this new and exciting session in which we are going to see how to

2
00:00:06,570 --> 00:00:09,420
solve the problem of class imbalance.

3
00:00:10,020 --> 00:00:16,230
Before trying to solve this problem, we have to start by understanding what it

4
00:00:16,230 --> 00:00:17,220
is.

5
00:00:17,310 --> 00:00:29,790
So here we have this dataset, our training dataset, made of 1525 data points for angry, 3019 for happy,

6
00:00:29,940 --> 00:00:32,760
and 2255 for sad.

7
00:00:34,020 --> 00:00:41,070
Now, just looking at these numbers, it's clear that any model you want to train on this dataset will

8
00:00:41,070 --> 00:00:44,120
see more of the happy images.

9
00:00:44,130 --> 00:00:53,880
And this automatically creates a bias in which the model will do better on the

10
00:00:53,880 --> 00:00:59,640
happy images as compared to the angry and the sad images.

11
00:01:00,360 --> 00:01:07,680
And also, this model will do better on the sad images as compared to the angry images, so we could put

12
00:01:07,680 --> 00:01:08,580
them in this order:

13
00:01:08,580 --> 00:01:12,880
one, two, three. To confirm this hypothesis,

14
00:01:12,900 --> 00:01:16,350
let's take a look at this confusion matrix right here.

15
00:01:16,650 --> 00:01:26,550
You'll see that this is the actual and this is the predicted. Then here we have correct predictions.

16
00:01:26,550 --> 00:01:29,350
So here, correct predictions.

17
00:01:29,370 --> 00:01:32,620
Correct predictions, and here, correct predictions.

18
00:01:32,640 --> 00:01:36,210
But one thing you will notice is that this is angry.

19
00:01:36,690 --> 00:01:40,190
This is happy and this is sad.

20
00:01:40,200 --> 00:01:42,480
We might just draw this here, actually.

21
00:01:42,480 --> 00:01:44,160
So let's just put this out here.

22
00:01:44,190 --> 00:01:45,240
Here we go.

23
00:01:45,240 --> 00:01:52,200
And as we were saying, one thing you will notice is that the number of cases where the input image

24
00:01:52,200 --> 00:02:04,170
is actually happy and the model predicts angry or sad, that is, 19 and here 59, is going to be fewer

25
00:02:04,170 --> 00:02:13,380
than the number of cases where the model is supposed to predict angry, but it

26
00:02:13,380 --> 00:02:16,070
instead predicts happy or sad.

27
00:02:16,080 --> 00:02:18,450
So here you have 58.

28
00:02:19,290 --> 00:02:21,420
And 78.

29
00:02:21,600 --> 00:02:27,940
And when you add these two up, what you get is 136.

30
00:02:27,960 --> 00:02:36,600
And so this means that the model tends to make more errors when shown angry images as compared to when

31
00:02:36,600 --> 00:02:38,400
shown happy images.

32
00:02:38,610 --> 00:02:44,230
We could also look at the example of the sad images here. For the sad,

33
00:02:44,250 --> 00:02:49,890
what we have here is 47 and 81. 47, 81.

34
00:02:50,160 --> 00:02:57,300
If we add these two up, we have 128 mistakes.

35
00:02:57,900 --> 00:02:58,590
And so in

36
00:02:58,590 --> 00:03:05,700
this section we will use a technique known as class weighting to try to reduce the effect of this bias,

37
00:03:05,700 --> 00:03:13,320
which has been created because of an uneven distribution in the number of samples per class in our dataset.

38
00:03:13,710 --> 00:03:22,080
Before we go on to explain how class weighting can be used to solve this problem of class imbalance,

39
00:03:22,080 --> 00:03:33,320
what we want you to understand is that the very first thing you want to do is build a balanced dataset.

40
00:03:33,330 --> 00:03:42,390
So if you have three classes, for example, angry, happy and sad, then you should gather as much

41
00:03:43,620 --> 00:03:51,790
clean data as possible from those three classes and also ensure that these classes are balanced.

42
00:03:51,810 --> 00:03:57,390
So, for example, if you want to create a much more sophisticated system,

43
00:03:57,390 --> 00:04:01,170
you could gather 100,000 of this.

44
00:04:02,010 --> 00:04:03,090
100,000.

45
00:04:03,090 --> 00:04:03,900
We gather here

46
00:04:03,900 --> 00:04:08,760
100,000 and here 100,000.

47
00:04:08,940 --> 00:04:18,090
So you should initially strive to have this balance. Nonetheless, in some cases or in certain problems,

48
00:04:18,420 --> 00:04:21,480
getting this kind of balance will be very difficult.

49
00:04:22,110 --> 00:04:30,300
And so we'll just have to resort to one of the techniques, that is, class weighting. In

50
00:04:30,300 --> 00:04:32,220
order to understand class weighting,

51
00:04:32,250 --> 00:04:38,340
recall that you have the output label and then you have what the model outputs.

52
00:04:38,340 --> 00:04:40,850
Let's call this y hat.

53
00:04:40,860 --> 00:04:45,740
So the model outputs this and this is what it's supposed to output.

54
00:04:45,750 --> 00:04:54,600
Now to be able to update the weights here, what we do is we compute this difference or this loss here

55
00:04:54,600 --> 00:05:01,110
between these two outputs, that is, between the expected output and what the model outputs.

56
00:05:01,110 --> 00:05:08,940
And the aim here is to minimize this difference between the expected and the actually

57
00:05:08,940 --> 00:05:10,200
predicted output.

58
00:05:11,280 --> 00:05:20,820
Now, since we know that the model has a tendency to favor the happy images, what

59
00:05:20,820 --> 00:05:32,610
we'll do is we are going to penalize the model more when it predicts a wrong output for an angry image.

60
00:05:33,150 --> 00:05:35,750
And so we'll penalize more.

61
00:05:35,760 --> 00:05:38,430
So here we are going to penalize this one,

62
00:05:38,460 --> 00:05:44,010
the errors from here, more than the errors from here.

63
00:05:44,220 --> 00:05:54,930
And by doing so, the model's weights will be readjusted such that it doesn't

64
00:05:54,930 --> 00:05:57,480
favor any one of the classes.

65
00:05:58,920 --> 00:06:05,250
Now, that said, we are going to include the class weight here in this fit method.

66
00:06:05,250 --> 00:06:11,750
So we'll just have this class weights and then we'll have a class weights dictionary.

67
00:06:11,760 --> 00:06:16,740
So we're going to define this class weights dictionary before this one here.

68
00:06:17,370 --> 00:06:20,700
We have three classes, so we have class zero

69
00:06:20,700 --> 00:06:28,500
and we'll have its weight. Next we would have class one and we would have its weight.

70
00:06:28,680 --> 00:06:32,280
Next we would have class two and we would have its weight.

71
00:06:32,280 --> 00:06:38,040
So here we have the angry, the happy and the sad class.

72
00:06:38,160 --> 00:06:45,720
Now, to obtain those weights, we can use a formula: one divided by the number of samples.

73
00:06:46,740 --> 00:06:57,990
And since here we have the number of samples for angry to be 1525, happy 3019, sad 2255, then

74
00:06:57,990 --> 00:07:07,350
when we calculate these weights here, we would have one divided by this. You'll see that if we take one here,

75
00:07:07,350 --> 00:07:12,720
the weight will now be one divided by this number of samples,

76
00:07:13,320 --> 00:07:14,040
zero.

77
00:07:14,040 --> 00:07:15,120
There we go.

78
00:07:15,240 --> 00:07:25,770
And then here we have one divided by number of samples for class one and one divided by number of samples

79
00:07:25,770 --> 00:07:27,120
for class two.

80
00:07:27,240 --> 00:07:35,580
Now, once we have this, we could also normalize these values so that there is not such a great difference

81
00:07:35,580 --> 00:07:40,110
or so that the values are of a similar order of magnitude.

82
00:07:40,110 --> 00:07:44,460
So let's add this code and then print out the class weights.

83
00:07:44,460 --> 00:07:50,040
We run this and we print out our class weights.

84
00:07:52,230 --> 00:07:53,070
There we go.

85
00:07:53,070 --> 00:07:56,430
We have this error: number of samples zero.

86
00:07:56,460 --> 00:08:00,180
Let's run this here and run this.

87
00:08:00,390 --> 00:08:01,800
Okay, so we have this.

88
00:08:01,800 --> 00:08:03,720
We have our class weights.

89
00:08:03,720 --> 00:08:07,140
You see this one, this and this.

90
00:08:07,170 --> 00:08:14,940
Okay, now we're going to multiply each and every value by the total number of data points

91
00:08:14,940 --> 00:08:15,690
we have.

92
00:08:16,680 --> 00:08:27,630
So here, we're going to multiply this by 6799, here 6799, and here 6799.

93
00:08:27,750 --> 00:08:28,590
There we go.

94
00:08:28,590 --> 00:08:30,450
Here, 6799.

95
00:08:30,900 --> 00:08:31,860
We have that.

96
00:08:31,860 --> 00:08:34,500
We run this, get these class weights.

97
00:08:34,500 --> 00:08:35,070
That's it.

98
00:08:35,070 --> 00:08:40,320
4.45, 2.25, 3.01.
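A minimal sketch of the calculation just described, using the per-class counts quoted in this session. The variable names are assumptions made for illustration, and the commented-out `fit` call assumes a Keras-style API that accepts a `class_weight` dictionary; it is not the code shown on screen.
```python
# Per-class training counts quoted in the session (0 = angry, 1 = happy, 2 = sad).
n_samples = {0: 1525, 1: 3019, 2: 2255}
# Total number of data points across the three classes.
total = sum(n_samples.values())
# Weight = (1 / class count) multiplied by the total, so rarer classes weigh more.
class_weights = {label: total / count for label, count in n_samples.items()}
for label, weight in class_weights.items():
    print(label, weight)
# Assumed Keras-style usage (hypothetical names, not taken from the session):
# model.fit(train_dataset, class_weight=class_weights)
```
Up to rounding, these values agree with the 4.45, 2.25 and 3.01 read out here, and the angry class ends up with the largest weight because it has the fewest samples.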

99
00:08:40,320 --> 00:08:50,400
So we see that here the angry class has more weight compared to the happy and

100
00:08:50,400 --> 00:08:51,270
the sad.

101
00:08:51,270 --> 00:09:03,210
And this will now influence the model when updating its parameters since the loss will now punish the

102
00:09:03,210 --> 00:09:13,410
model more for making mistakes with inputs from this class as compared to inputs from this class or

103
00:09:13,410 --> 00:09:15,000
this other class.

104
00:09:15,090 --> 00:09:15,930
So that's it.

105
00:09:15,930 --> 00:09:17,550
We have our class weights.

106
00:09:17,550 --> 00:09:22,500
We can now go ahead and add this to our training.

107
00:09:22,500 --> 00:09:23,850
So we have this.

108
00:09:23,850 --> 00:09:24,810
We've added this already, even.

109
00:09:24,810 --> 00:09:28,800
So we just need to have this and then run this again.

110
00:09:28,800 --> 00:09:32,580
Now we'll start from this pretrained model.

111
00:09:32,580 --> 00:09:39,960
So from here we have our pretrained model. We'll recompile this. There we go.

112
00:09:40,140 --> 00:09:43,410
And then we start training for over 60 epochs.

113
00:09:43,410 --> 00:09:48,960
So here we have trained, and here are the results we get. We go ahead and evaluate the model.

114
00:09:48,960 --> 00:09:51,960
We have 84% accuracy.

115
00:09:52,320 --> 00:09:58,080
We test here. We have these different images which we pass.

116
00:09:58,080 --> 00:10:02,880
We actually make just one error here.

117
00:10:03,000 --> 00:10:04,050
So that's it.

118
00:10:04,050 --> 00:10:07,320
We now get to the confusion matrix, which we have plotted.

119
00:10:07,320 --> 00:10:14,190
And one thing we can notice straight away is the fact that the errors made by the model per class are

120
00:10:14,190 --> 00:10:16,080
more evenly distributed.

121
00:10:16,380 --> 00:10:19,910
So you could see here, let's have this here.

122
00:10:19,920 --> 00:10:30,530
You could see here that the model makes around 118 errors for the angry images, while for the happy

123
00:10:30,540 --> 00:10:36,030
it makes about 100, or rather 97, errors.

124
00:10:36,390 --> 00:10:41,760
And then for this it makes 148 errors.

125
00:10:42,870 --> 00:10:50,580
So although we still have this difference, which is inevitable, this margin has now been reduced.

126
00:10:51,630 --> 00:10:55,980
And that's it for this section on solving the problem of class imbalance.

127
00:10:55,980 --> 00:10:58,080
We have other techniques to

128
00:10:58,350 --> 00:11:04,650
remedy or solve this problem, like oversampling and undersampling.

129
00:11:05,670 --> 00:11:14,430
In oversampling, suppose we have this initial dataset, so let's say angry,

130
00:11:14,430 --> 00:11:16,820
happy and sad.

131
00:11:16,830 --> 00:11:25,590
And so what we are going to do here is we are going to randomly pick some elements from this angry sample

132
00:11:25,590 --> 00:11:32,460
right here and add them to the available data we already have

133
00:11:32,460 --> 00:11:39,250
here. And so this makes up for this remaining gap we have here, and now

134
00:11:39,250 --> 00:11:43,200
it matches up with the number of happy samples.

135
00:11:44,220 --> 00:11:52,140
And then for the sad samples, we again randomly pick some samples from here and add them to this dataset

136
00:11:52,140 --> 00:12:01,080
so that we now have this dataset, which matches up with the number of samples for the happy.

137
00:12:02,130 --> 00:12:08,550
Now, with downsampling, what we are instead going to do is cut this off here.

138
00:12:08,550 --> 00:12:13,770
So we'll cut this off and we'll cut this part off.

139
00:12:13,770 --> 00:12:18,510
So now what we're going to have is this here.

140
00:12:18,540 --> 00:12:21,360
So we're going to take off some parts of our data set.

141
00:12:21,360 --> 00:12:29,670
So now all of these match up instead with this angry class of our dataset.

142
00:12:29,670 --> 00:12:38,490
So now here we have that. We have the happy, see, it matches up, and then we have the sad.

143
00:12:39,700 --> 00:12:43,600
So that's our downsampling.

144
00:12:43,600 --> 00:12:49,450
And since this part which is taken off is taken off randomly, we call this random downsampling.
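A minimal sketch of the random oversampling and random downsampling just described, assuming each class is simply a list of sample IDs. The helper names `random_oversample` and `random_downsample` and the placeholder IDs are hypothetical, introduced only for illustration.
```python
import random
#
def random_oversample(samples, target_size, rng):
    """Pad a class up to target_size by re-drawing existing samples at random."""
    extra = [rng.choice(samples) for _ in range(target_size - len(samples))]
    return samples + extra
#
def random_downsample(samples, target_size, rng):
    """Shrink a class to target_size by keeping a random subset of it."""
    return rng.sample(samples, target_size)
#
# Fixed seed, used only so that the sketch is repeatable.
rng = random.Random(0)
# Placeholder sample IDs standing in for the images of each class.
data = {"angry": list(range(1525)), "happy": list(range(3019)), "sad": list(range(2255))}
largest = max(len(ids) for ids in data.values())
smallest = min(len(ids) for ids in data.values())
# Oversampling: every class is padded up to the size of the largest class (happy).
oversampled = {name: random_oversample(ids, largest, rng) for name, ids in data.items()}
# Downsampling: every class is cut down to the size of the smallest class (angry).
downsampled = {name: random_downsample(ids, smallest, rng) for name, ids in data.items()}
print({name: len(ids) for name, ids in oversampled.items()})
print({name: len(ids) for name, ids in downsampled.items()})
```
Note that oversampling only repeats samples that already exist, and downsampling discards samples outright.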

145
00:12:49,990 --> 00:12:57,340
Nonetheless, it's not advisable to use oversampling or undersampling when solving real world problems

146
00:12:57,340 --> 00:13:06,100
as either you're adding already existing data like this, which is not necessarily a great idea, or

147
00:13:06,100 --> 00:13:10,920
you're even removing data from your dataset, which is also not a great idea.

148
00:13:10,930 --> 00:13:17,410
So if you have a problem of class imbalance, the very first thing which should

149
00:13:17,410 --> 00:13:25,660
come to your mind is trying to balance this data by gathering much more data from a particular

150
00:13:25,660 --> 00:13:26,350
class.