1
00:00:00,390 --> 00:00:02,520
Welcome to this new section.

2
00:00:02,940 --> 00:00:09,810
We shall start by looking at the error function, or loss function, which we'll use in training our model.

3
00:00:10,140 --> 00:00:17,430
This function, or loss function, happens to be the Euclidean loss, which is derived from the Euclidean

4
00:00:17,430 --> 00:00:19,710
distance, and it goes as such.

5
00:00:20,190 --> 00:00:29,970
For every pixel we have in our density map, we compare it with the true density map, that is, the target

6
00:00:29,970 --> 00:00:30,840
density map.

7
00:00:30,840 --> 00:00:38,670
So by doing that we have y true i; this is for each i, where i goes from one to M, where M represents

8
00:00:38,670 --> 00:00:42,150
all the different pixels we could have in the density map.

9
00:00:42,360 --> 00:00:45,900
Now, note that this M doesn't represent the batch size.

10
00:00:45,900 --> 00:00:49,530
It represents the total number of pixels in our density map.

11
00:00:49,530 --> 00:00:56,610
So for each of those pixels we have y target minus y predicted squared, and then we simply add this

12
00:00:56,610 --> 00:00:57,900
or sum this up.

13
00:00:57,900 --> 00:01:01,440
Then at the end, we take the square root of the answer we get.

14
00:01:02,130 --> 00:01:07,140
Now, coming back to the code, all we need to do is to define this custom loss as such.

15
00:01:07,830 --> 00:01:12,850
Once this custom loss is defined, we go ahead to set the learning rate.

16
00:01:12,870 --> 00:01:16,080
We compile the model and we define the checkpoint.

17
00:01:16,110 --> 00:01:19,980
Now, here, instead of this, we should have CSRNet.

18
00:01:19,980 --> 00:01:21,330
So there we go.

19
00:01:21,630 --> 00:01:26,790
We could start with this, run that, and now we're ready to train our model.

20
00:01:26,790 --> 00:01:28,230
So we have this.

21
00:01:29,490 --> 00:01:33,930
We have our train generator, which we defined already.

22
00:01:34,080 --> 00:01:37,470
We have shuffle, the number of epochs, and the callback.

23
00:01:37,470 --> 00:01:38,870
So there we go.

24
00:01:38,880 --> 00:01:43,310
Note that we shall train on just ten images from our dataset.

25
00:01:43,320 --> 00:01:52,440
Then you could simply increase this to, say, 380 or 350.

26
00:01:52,830 --> 00:01:55,530
Then the validation of, say, 50.

27
00:01:57,630 --> 00:01:59,480
Now let's check on the error we have.

28
00:01:59,490 --> 00:02:03,980
We're told that we don't have this sigma.

29
00:02:04,080 --> 00:02:07,050
Let's go up and check right here.

30
00:02:08,190 --> 00:02:09,020
Okay.

31
00:02:09,030 --> 00:02:11,400
We didn't do that, so we should have.

32
00:02:12,420 --> 00:02:14,490
This was the Gaussian radius.

33
00:02:14,490 --> 00:02:16,020
So we should have the Gaussian.

34
00:02:16,020 --> 00:02:19,560
Radius, Gaussian radius.

35
00:02:20,520 --> 00:02:21,690
Run it again.

36
00:02:21,690 --> 00:02:22,680
That's fine.

37
00:02:23,130 --> 00:02:25,610
Our model is now training quite normally.

38
00:02:25,620 --> 00:02:32,250
Now, note that the reason why we decide to work with a relatively low learning rate is because we are

39
00:02:32,250 --> 00:02:37,230
using fine-tuning on this VGG model right here.

40
00:02:37,230 --> 00:02:44,070
So we're trying to avoid distorting the already trained weights of the VGG model.

41
00:02:44,070 --> 00:02:52,320
And also note that the reason why we don't selectively keep aside the batch normalization layers is

42
00:02:52,320 --> 00:02:56,370
because in the VGG model, there are no batch normalization layers.

43
00:02:57,090 --> 00:02:59,250
So with that, we look at our training.

44
00:02:59,730 --> 00:03:01,230
Everything looks fine.

45
00:03:02,310 --> 00:03:10,770
Also, note that the most common metric in these people-counting models isn't accuracy.

46
00:03:10,890 --> 00:03:15,680
We could actually use the mean absolute error or the mean squared error.

47
00:03:15,690 --> 00:03:18,990
So we'll just work with the mean absolute error for now.

48
00:03:18,990 --> 00:03:19,950
So that's it.

49
00:03:21,790 --> 00:03:26,260
Now, while we are training our model, we could go ahead and put out our code for prediction.

50
00:03:26,260 --> 00:03:26,740
So,

51
00:03:26,740 --> 00:03:27,070
right

52
00:03:27,070 --> 00:03:27,280
here.

53
00:03:27,280 --> 00:03:30,910
We have this input image just like what we had done previously.

54
00:03:31,120 --> 00:03:32,560
Let's take this off.

55
00:03:32,860 --> 00:03:43,960
We have the image, we load the image, we normalize it by first dividing by 255, and then we subtract

56
00:03:43,960 --> 00:03:44,350
the mean.

57
00:03:44,350 --> 00:03:50,230
So it's just like having X minus the mean divided by the standard deviation.

58
00:03:50,230 --> 00:03:51,220
So that's it.

59
00:03:51,400 --> 00:03:54,670
So that's kind of like what we're doing.

60
00:03:54,670 --> 00:04:01,300
So we have x equals x minus the mean, divided by the standard deviation.

61
00:04:01,300 --> 00:04:04,270
So for i, we do this for each of the channels.

62
00:04:04,270 --> 00:04:10,120
That's why you have this right here, zero, one, and two, because our image has three channels.

63
00:04:10,120 --> 00:04:16,810
So for each channel we take each value minus the mean of the channel divided by the standard deviation

64
00:04:16,810 --> 00:04:17,410
of that channel.

65
00:04:17,410 --> 00:04:21,850
So that's it. We take off the NumPy array since this is going to generate a tensor.

66
00:04:21,850 --> 00:04:25,180
So we get back to our training.

67
00:04:27,710 --> 00:04:29,120
Let's pause this for now.

68
00:04:29,120 --> 00:04:31,370
So we stop this kernel for now.

69
00:04:31,370 --> 00:04:35,810
And then we try to predict using this image from our dataset.

70
00:04:35,960 --> 00:04:36,860
So there we go.

71
00:04:36,860 --> 00:04:45,440
We see that this looks quite similar to the actual density map, which we had seen previously, which

72
00:04:45,440 --> 00:04:48,740
was... no, this was 245.

73
00:04:48,740 --> 00:04:50,810
So let's pick image one.

74
00:04:51,380 --> 00:04:52,160
So that is it.

75
00:04:52,160 --> 00:04:57,980
So it looks quite similar to this density map we just generated right here, the predicted density

76
00:04:57,980 --> 00:04:58,570
map.

77
00:04:58,580 --> 00:05:05,750
But then once this counting is done, that is, once we have the sum of the outputs, we obtain 186

78
00:05:05,750 --> 00:05:07,160
instead of 233.

79
00:05:07,160 --> 00:05:13,580
So it means we need to continue training, and then, to come up with a model which doesn't overfit on

80
00:05:13,580 --> 00:05:15,440
these few images,

81
00:05:15,440 --> 00:05:20,210
we need to increase our dataset to take in many more images.

82
00:05:20,210 --> 00:05:28,280
We also continue training again up to this loss of 0.77, and then running this, we obtain this result.

83
00:05:28,280 --> 00:05:29,960
So we see that it's getting better.

84
00:05:30,140 --> 00:05:32,810
Now, to get better results,

85
00:05:32,810 --> 00:05:38,270
we obviously just need to continue training, but this will always overfit on the few images we've trained on,

86
00:05:38,270 --> 00:05:46,250
so we still have to include many more images and then use data augmentation techniques to increase our

87
00:05:46,250 --> 00:05:46,520
data.
