1
00:00:00,180 --> 00:00:06,180
So now that we have completed your introduction to regularization and you've seen how well it works

2
00:00:06,180 --> 00:00:12,540
in reducing overfitting and improving generalization, let's take a look at some guidelines on when

3
00:00:12,540 --> 00:00:14,430
and how to use regularization.

4
00:00:14,610 --> 00:00:15,690
So let's get started.

5
00:00:16,530 --> 00:00:22,890
So, to begin, it's always good practice to train without regularization first, and I'll tell you

6
00:00:22,890 --> 00:00:23,250
why.

7
00:00:23,850 --> 00:00:29,460
Firstly, sometimes regularization techniques or combinations of them can have detrimental effects

8
00:00:29,460 --> 00:00:30,630
on a model's performance.

9
00:00:30,990 --> 00:00:35,610
For example, if you use some bad parameter settings, you can actually make the model's performance worse

10
00:00:35,610 --> 00:00:36,510
on the test data set.

11
00:00:36,990 --> 00:00:43,920
So generally, it's always, always good to have a baseline model and then introduce different

12
00:00:43,920 --> 00:00:46,500
regularization techniques to assess the impact.

13
00:00:47,250 --> 00:00:51,810
Also, it's never a good idea to start with dropout and batch norm when you're trying to experiment

14
00:00:51,810 --> 00:00:54,390
because this increases the convergence time.

15
00:00:54,390 --> 00:00:59,100
So it's going to basically slow down your own training and experimentation process.

16
00:01:00,330 --> 00:01:04,710
So here are some tips and warnings when using regularization methods.

17
00:01:05,240 --> 00:01:05,810
Dropout.

18
00:01:05,970 --> 00:01:09,910
Firstly, don't use it before the final softmax layer. You can use it

19
00:01:10,150 --> 00:01:17,100
at the other points that I showed you previously, which is after the conv layers or the max pool layers. So

20
00:01:17,730 --> 00:01:22,770
when training with L2, dropout, or data augmentation, you need more epochs to achieve the same performance

21
00:01:22,770 --> 00:01:23,280
generally.

22
00:01:23,700 --> 00:01:28,890
So you always have to add more epochs when introducing these methods.

23
00:01:29,490 --> 00:01:35,220
Now here's a note that I've mentioned before: things like data augmentation, dropout, batch norm,

24
00:01:35,670 --> 00:01:36,730
they're quite slow.

25
00:01:36,750 --> 00:01:41,940
They add extra computational processing during the training process, and as such, they will slow it

26
00:01:41,940 --> 00:01:42,210
down.

27
00:01:42,480 --> 00:01:45,090
So it's a double impact anyway.

28
00:01:45,900 --> 00:01:50,250
Now, underfitting is possible too if your L2 weights are set too high.

29
00:01:50,460 --> 00:01:51,400
That's another concern.

30
00:01:51,420 --> 00:01:53,760
So be careful with how you set these parameters.

31
00:01:54,240 --> 00:02:01,440
We'll take a look at using some of these in the next section, where we start using regularization

32
00:02:01,440 --> 00:02:03,660
techniques in Keras and PyTorch.

33
00:02:04,170 --> 00:02:05,400
So stay tuned for that.

34
00:02:05,820 --> 00:02:11,970
So in the next lesson, we'll take a look at actually implementing some of these regularization techniques

35
00:02:11,970 --> 00:02:12,660
in code.

36
00:02:12,930 --> 00:02:14,430
So I'll see you in the next lesson.

37
00:02:14,610 --> 00:02:15,090
Thank you.