1
00:00:00,060 --> 00:00:00,570
Hi, guys.

2
00:00:00,600 --> 00:00:01,950
Welcome back to the course.

3
00:00:02,460 --> 00:00:06,540
This is going to be our first object detection lesson.

4
00:00:07,230 --> 00:00:08,400
So let's get started.

5
00:00:08,490 --> 00:00:15,270
So the first model we're going to train as an object detector is Scaled-YOLOv4, and this one is

6
00:00:15,270 --> 00:00:16,040
a very cool one.

7
00:00:16,240 --> 00:00:18,510
I started with a pretty cool project, in my opinion.

8
00:00:19,020 --> 00:00:22,440
This one is a gun and pistol detector.

9
00:00:23,070 --> 00:00:24,660
So let's take a look.

10
00:00:24,870 --> 00:00:30,900
So open up Notebook 42, which is highlighted here, and that'll bring up this window.

11
00:00:31,560 --> 00:00:33,230
So this is the lesson here.

12
00:00:33,240 --> 00:00:41,280
So firstly, as I've mentioned before, all of these notebooks are adapted from Roboflow. They

13
00:00:41,280 --> 00:00:46,140
may not be the latest notebooks that are available right now on the Roboflow site because they're constantly

14
00:00:46,470 --> 00:00:49,350
tweaking and making little changes to these notebooks.

15
00:00:49,800 --> 00:00:54,480
So what I did, this was back in 2021, I believe November:

16
00:00:55,290 --> 00:01:01,020
I captured those notebooks, and I may have made some minor changes because some

17
00:01:01,020 --> 00:01:03,440
of their notebooks are broken, unfortunately.

18
00:01:03,450 --> 00:01:07,110
They do get around to fixing them quite quickly.

19
00:01:07,140 --> 00:01:12,450
So if you do encounter one of the newer ones that is broken, you can just send them a message and

20
00:01:12,450 --> 00:01:14,220
they'll probably update it quite quickly.

21
00:01:15,600 --> 00:01:24,000
So this one here is Scaled-YOLOv4, and it's going to be trained on the Pistols dataset.

22
00:01:24,660 --> 00:01:29,790
As you can see here, this is an example of how the dataset looks, and we have pictures of guns where

23
00:01:29,790 --> 00:01:32,580
we have the bounding boxes annotated around them.

24
00:01:32,670 --> 00:01:33,820
This is our training data.

25
00:01:33,840 --> 00:01:38,910
This is what we'll use to create our very own gun and pistol detector.

26
00:01:39,540 --> 00:01:44,340
So you can see all of these, the guns here; this footage here is of a robbery.

27
00:01:45,090 --> 00:01:48,990
Most of these look like studio or movie shots or stock images.

28
00:01:49,500 --> 00:01:54,570
Nevertheless, that's still useful because they all have guns in them, and you're not going to encounter

29
00:01:54,570 --> 00:01:59,040
a real-life gun dataset unless someone puts in the work to build one.

30
00:01:59,160 --> 00:02:00,030
It's possible.

31
00:02:00,030 --> 00:02:07,410
There's a lot of footage on YouTube of CCTV robberies, especially in the US, where there is no gun

32
00:02:07,410 --> 00:02:07,980
control.

33
00:02:09,150 --> 00:02:10,890
That's not a political opinion.

34
00:02:10,890 --> 00:02:12,180
That's just a fact.

35
00:02:12,810 --> 00:02:14,640
So let's start this notebook.

36
00:02:15,660 --> 00:02:17,820
So one thing first about these notebooks:

37
00:02:18,330 --> 00:02:20,110
they're a bit heavy.

38
00:02:20,130 --> 00:02:26,820
There's a lot of setup going on because YOLOv4, YOLOv5, all of these object detectors,

39
00:02:27,300 --> 00:02:32,160
they're not actually meant to be run in notebooks because they've got a lot of things going on in the background.

40
00:02:32,640 --> 00:02:39,180
A lot of parts that need to be built, a lot of different configurations, so it looks a bit messy when

41
00:02:39,180 --> 00:02:40,110
it's in a notebook.

42
00:02:40,200 --> 00:02:46,890
But nevertheless, this is perhaps the easiest way to get started with these object detector notebooks.

43
00:02:47,520 --> 00:02:51,720
So I'm not going to run this code here right now because I've run these before.

44
00:02:52,320 --> 00:02:57,900
I will just say, of these few cells of code,

45
00:02:58,770 --> 00:03:00,960
some of them do take maybe a couple of minutes to run.

46
00:03:01,650 --> 00:03:02,610
Shouldn't be that long.

47
00:03:02,620 --> 00:03:05,100
The longest one might be this one, I believe.

48
00:03:05,730 --> 00:03:08,050
But you can see this is the output here.

49
00:03:08,070 --> 00:03:09,510
This is what you should be seeing.

50
00:03:10,050 --> 00:03:11,570
We have to install PI.

51
00:03:11,760 --> 00:03:12,150
No.

52
00:03:12,870 --> 00:03:15,280
Then we navigate to the directory.

53
00:03:15,840 --> 00:03:17,550
Then we have to download the data set.

54
00:03:17,560 --> 00:03:19,590
So we have the instructions here.

55
00:03:20,040 --> 00:03:25,920
So the format we're going to be using is the YOLOv5 PyTorch export format, which I've already

56
00:03:25,920 --> 00:03:26,670
done here.

57
00:03:27,000 --> 00:03:31,470
This is the... well, this part doesn't really work anymore, actually.

58
00:03:31,740 --> 00:03:33,040
Let's just leave this.

59
00:03:33,780 --> 00:03:39,150
So I've actually uploaded the dataset to Google Drive, in case the datasets go down from Roboflow

60
00:03:39,150 --> 00:03:42,000
or they change the format of their hosting.

61
00:03:42,510 --> 00:03:46,860
So we get the zip file of the pistols here and we unzip it.

62
00:03:47,580 --> 00:03:56,640
Then we just set up the YAML file, and then we just inspect the architecture by doing cat.

63
00:03:56,930 --> 00:04:01,170
This displays the architecture of the Scaled-YOLOv4 model.

64
00:04:01,710 --> 00:04:03,390
You can see it's here.

65
00:04:03,420 --> 00:04:05,610
You can see the backbone, the anchors.

66
00:04:06,180 --> 00:04:09,060
This is the number of classes that we're training on here.

67
00:04:09,090 --> 00:04:09,270
Well,

68
00:04:09,510 --> 00:04:11,250
that's what it was pre-trained on.

69
00:04:11,250 --> 00:04:15,270
Actually, we're going to be training on one class with the pistols dataset.

70
00:04:15,960 --> 00:04:17,700
And this is the head of YOLOv4.

71
00:04:17,700 --> 00:04:23,130
This uses the CSP Darknet backbone, I believe.

72
00:04:23,310 --> 00:04:29,310
And now here's where we train the YOLO model, so you can see we can pass a number of these input

73
00:04:29,310 --> 00:04:33,740
arguments here using the argparse style.

74
00:04:34,290 --> 00:04:40,920
So you can see we run the Python file like this, with an exclamation mark so that we can run a Python file.

75
00:04:41,880 --> 00:04:43,740
Time actually just times its performance.

76
00:04:43,740 --> 00:04:47,040
Pretty useful metric to look at when you're training models.

77
00:04:47,580 --> 00:04:52,560
So we set the image size, how many images we're going to use per batch,

78
00:04:52,560 --> 00:04:55,130
so sixteen, then the number of epochs.

79
00:04:55,140 --> 00:04:59,520
Then we set the directory of the YAML data file.

80
00:05:00,300 --> 00:05:03,930
Then we set the model, the sort of the model configuration.

81
00:05:04,620 --> 00:05:08,310
Then we set the weights, which, we're not going to use pretrained weights; we're training from scratch.

82
00:05:08,850 --> 00:05:14,720
And then we just put the results name here, and cache means that we're going to put the data in memory.

83
00:05:15,810 --> 00:05:18,870
So this is the GPU we'll be running it on.

84
00:05:20,290 --> 00:05:22,250
And now here we go.

85
00:05:22,260 --> 00:05:26,880
So it's training, so you can see it takes quite some time.

86
00:05:27,150 --> 00:05:35,130
I've been training this for maybe 15, 20 minutes now, and you can see it's already reached the 18th epoch,

87
00:05:35,130 --> 00:05:38,520
and that is with using the GPU in Colab.

88
00:05:38,520 --> 00:05:44,790
And I have Colab Pro, which may give me a slightly faster GPU compared to what you might be using,

89
00:05:44,790 --> 00:05:46,080
although I can't say for sure.

90
00:05:46,770 --> 00:05:49,560
So you can see it's taking roughly a minute per epoch.

91
00:05:49,560 --> 00:05:50,250
Not that bad.

92
00:05:51,660 --> 00:05:56,520
So this is what you can expect after you finish training, so you can actually launch TensorBoard

93
00:05:56,970 --> 00:06:03,810
and measure its training performance. It's quite useful to use. If you want to make it persistent so

94
00:06:03,810 --> 00:06:05,310
that you will retain the results,

95
00:06:05,760 --> 00:06:11,400
I would encourage you to use something like Weights and Biases, which is a really good library and

96
00:06:11,400 --> 00:06:13,950
tool to keep storing your training results.

97
00:06:13,950 --> 00:06:17,700
It works seamlessly with YOLOv5 for multiple epochs.

98
00:06:18,240 --> 00:06:20,340
So I would encourage you to use that.

99
00:06:20,430 --> 00:06:24,450
because this TensorBoard is not going to remain after we exit the notebook.

100
00:06:24,460 --> 00:06:28,230
I mean, it will be there, but the actual explicit results wouldn't be there.

101
00:06:28,680 --> 00:06:32,940
And then it'll be difficult for you to compare experiments when you have two different runs.

102
00:06:33,100 --> 00:06:33,390
Yeah.

103
00:06:35,480 --> 00:06:37,460
So you can see this is another metric.

104
00:06:37,490 --> 00:06:40,630
This is from the actual YOLOv4 results directory.

105
00:06:40,630 --> 00:06:45,470
You can look at the IoU, the objectness, which I'm not even sure what that is.

106
00:06:45,980 --> 00:06:48,170
You can look at precision and recall, always good.

107
00:06:48,290 --> 00:06:54,410
The mAP scores are very good to look at as well, because these take into consideration bounding boxes

108
00:06:54,410 --> 00:06:56,120
and classification scores.

109
00:06:56,540 --> 00:07:01,910
Precision and recall again, as I did mention, they're always very useful to look at.

110
00:07:02,060 --> 00:07:08,810
So the four boxes on the right here are perhaps the most useful metrics to look at when training these

111
00:07:08,810 --> 00:07:09,290
models.

112
00:07:09,920 --> 00:07:15,500
And I agree, because you can directly compare mAP, especially these two mAP scores, with other models

113
00:07:15,500 --> 00:07:21,170
as well, because they tend to all use mAP at 0.5 and mAP over the range 0.5 to 0.95

114
00:07:21,170 --> 00:07:22,550
as well.

115
00:07:23,720 --> 00:07:25,490
So let's take a look at this.

116
00:07:26,030 --> 00:07:28,490
All right, let's visualize some of our training data here.

117
00:07:28,520 --> 00:07:30,410
So this is the data we saw previously.

118
00:07:30,980 --> 00:07:31,400
Now,

119
00:07:31,430 --> 00:07:37,670
let's take a look at some of the actual results from training our model.

120
00:07:38,180 --> 00:07:40,460
You can see it doesn't have a class name like gun or pistol.

121
00:07:40,460 --> 00:07:42,260
It just has class zero, which is OK.

122
00:07:42,290 --> 00:07:48,560
You can always rename those things later on and you can see it's getting the gun in almost every photo

123
00:07:48,560 --> 00:07:48,750
here.

124
00:07:48,770 --> 00:07:49,940
This is actually quite good.

125
00:07:49,940 --> 00:07:52,550
I'm actually quite impressed with this performance.

126
00:07:53,330 --> 00:07:59,600
It definitely has learnt the shape of a gun in all of these photos, and the bounding boxes are quite

127
00:07:59,600 --> 00:08:00,170
good as well.

128
00:08:00,200 --> 00:08:02,000
However, actually, this isn't the only one here.

129
00:08:02,360 --> 00:08:04,260
It didn't get these two rifles here.

130
00:08:04,400 --> 00:08:05,800
That's a bit surprising.

131
00:08:05,820 --> 00:08:11,150
Maybe it doesn't have enough training data, or maybe it's just trained on pistols and not rifles.

132
00:08:11,500 --> 00:08:12,950
I actually haven't checked.

133
00:08:14,060 --> 00:08:19,810
I believe it missed part of this gun here because this is a cropped image right now, and you can see

134
00:08:19,810 --> 00:08:22,670
it doesn't get the gun here, but that's still not too bad.

135
00:08:23,450 --> 00:08:28,700
So if you want to run inference with those trained weights, these are the weights that we've trained so

136
00:08:28,700 --> 00:08:29,030
far.

137
00:08:29,030 --> 00:08:34,820
You can see it stores the weights after every epoch here, and this basically is probably the last

138
00:08:34,940 --> 00:08:35,500
weights, too.

139
00:08:35,810 --> 00:08:42,690
And this is the one you probably want to use, the best weights. And you can see, just to run the

140
00:08:42,920 --> 00:08:43,550
detection,

141
00:08:44,000 --> 00:08:47,360
all you have to do is store some images in a test folder here.

142
00:08:48,150 --> 00:08:49,220
Set these parameters.

143
00:08:49,220 --> 00:08:50,660
This is the confidence threshold.

144
00:08:51,020 --> 00:08:55,760
If you use a low confidence threshold like 0.1, you're probably going to get a lot of

145
00:08:55,760 --> 00:08:56,720
false positives.

146
00:08:56,720 --> 00:09:02,720
However, if you set it too high, at say 0.7 or 0.6, you're going to miss a lot of guns and hence

147
00:09:02,720 --> 00:09:03,890
have bad recall.

148
00:09:04,520 --> 00:09:06,560
So let's say, give or take, 0.4.

149
00:09:06,920 --> 00:09:11,390
I would say 0.3 to 0.6 is usually a good range to use.

150
00:09:11,900 --> 00:09:18,020
More often than not, I tend to use lower confidence thresholds because it depends on the application.

151
00:09:18,020 --> 00:09:23,630
But sometimes you just want to try to get as many of the detections as possible, and you don't care too much

152
00:09:23,630 --> 00:09:24,740
about false positives.

153
00:09:24,740 --> 00:09:28,190
But that really does depend on your application.

154
00:09:29,390 --> 00:09:36,170
And you can see we are loading the best model weights here and we're running detect, so detect.py is

155
00:09:36,170 --> 00:09:42,230
a Python file in the YOLOv4 model that runs detections on these images here.

156
00:09:42,680 --> 00:09:46,220
So we can see it runs on all of these images in the test folder here.

157
00:09:46,220 --> 00:09:50,360
You can see there's quite a bit of output, and it runs pretty quick, to be honest.

158
00:09:50,660 --> 00:09:53,810
At 0.02 or 0.03 seconds, that's almost real time, in my opinion.

159
00:09:53,990 --> 00:09:56,930
Almost 20 frames a second, I believe, actually, it would be.

160
00:09:58,130 --> 00:10:03,290
And then you can see you can display some of the outputs here, so you can take a look at some of the

161
00:10:03,290 --> 00:10:04,000
images here.

162
00:10:04,090 --> 00:10:08,570
You can see it's getting the guns correctly, which is quite good.

163
00:10:11,450 --> 00:10:15,680
Granted, these are pictures that don't look too difficult.

164
00:10:16,400 --> 00:10:22,130
Maybe there will be some more difficult ones lower down, hopefully, because it does look a bit repetitive

165
00:10:22,130 --> 00:10:22,520
here.

166
00:10:24,770 --> 00:10:28,910
But nevertheless, we are getting the guns down great.

167
00:10:29,600 --> 00:10:37,700
So that concludes this lesson on the Scaled-YOLOv4 object detector trained on the guns and arms

168
00:10:37,700 --> 00:10:40,440
dataset. If you wanted to export your weights to download,

169
00:10:41,000 --> 00:10:43,220
you can just run this line here as well.

170
00:10:44,420 --> 00:10:45,730
Always good to

171
00:10:45,740 --> 00:10:50,990
keep track of your training data here, your training progress, I should say, so you can see what's

172
00:10:50,990 --> 00:10:54,970
going on. You know, it's at 17 epochs, and as for things to look for,

173
00:10:54,980 --> 00:10:57,990
you can see the scores, the different metrics here.

174
00:10:58,010 --> 00:11:06,140
mAP at 0.5, mAP at 0.5 to 0.95, precision, recall, targets, classes.

175
00:11:06,660 --> 00:11:07,970
All this good stuff here.

176
00:11:08,060 --> 00:11:12,920
All the metrics actually are right here right now, and you can measure it, so you can just take a look

177
00:11:12,920 --> 00:11:15,050
and see if these metrics are getting better all the time.

178
00:11:15,530 --> 00:11:18,230
And you can see you can look at the mAP at 0.5 score.

179
00:11:18,740 --> 00:11:20,420
You can see it's at 0.55 now.

180
00:11:20,900 --> 00:11:26,510
And initially we started at, where is it, 0.502, I guess.

181
00:11:26,660 --> 00:11:28,760
Yeah, it looks like it could be boxier.

182
00:11:29,570 --> 00:11:31,370
So that's it for this lesson.

183
00:11:31,370 --> 00:11:32,270
I hope you enjoyed it.

184
00:11:32,690 --> 00:11:39,170
In the next lesson, we'll take a look at mask detection using the TensorFlow Object Detection library.

185
00:11:39,800 --> 00:11:46,400
So mask detection, meaning not what I was initially thinking in my head, but actually masks, like those

186
00:11:46,760 --> 00:11:48,830
face masks we wear because of the pandemic.

187
00:11:49,430 --> 00:11:53,120
Well, we're going to do a mask detector next, so stay tuned for that.

188
00:11:53,300 --> 00:11:53,720
Thank you.