1
00:00:00,000 --> 00:00:03,060
Image augmentation
and data augmentation

2
00:00:03,060 --> 00:00:04,950
is one of the most
widely used tools in

3
00:00:04,950 --> 00:00:06,720
deep learning to increase

4
00:00:06,720 --> 00:00:08,010
your dataset size and make

5
00:00:08,010 --> 00:00:09,855
your neural networks
perform better.

6
00:00:09,855 --> 00:00:11,970
This week, you'll
learn how to use

7
00:00:11,970 --> 00:00:15,780
the easy-to-use tools in
TensorFlow to implement this.

8
00:00:15,780 --> 00:00:17,130
Yes. So like last week when we

9
00:00:17,130 --> 00:00:18,420
looked at cats versus dogs,

10
00:00:18,420 --> 00:00:20,160
we had 25,000 images.

11
00:00:20,160 --> 00:00:21,540
That was a nice big dataset,

12
00:00:21,540 --> 00:00:23,280
but we don't always
have access to that.

13
00:00:23,280 --> 00:00:26,790
In fact, sometimes, 25,000
images isn't enough.

14
00:00:26,790 --> 00:00:29,370
Exactly. One of
the nice things with

15
00:00:29,370 --> 00:00:32,400
being able to do image
augmentation is that we can then,

16
00:00:32,400 --> 00:00:34,710
I think you just used
the term "create new data,"

17
00:00:34,710 --> 00:00:36,405
which is effectively
what we're doing.

18
00:00:36,405 --> 00:00:39,320
So for example, if we have
a cat and our cats in

19
00:00:39,320 --> 00:00:40,580
our training dataset are

20
00:00:40,580 --> 00:00:42,280
always upright and
their ears are like this,

21
00:00:42,280 --> 00:00:44,795
we may not spot a cat
that's lying down.

22
00:00:44,795 --> 00:00:47,330
But with augmentation, being
able to rotate the image,

23
00:00:47,330 --> 00:00:48,590
or being able to skew the image,

24
00:00:48,590 --> 00:00:50,810
or maybe some other
transforms, we would be able

25
00:00:50,810 --> 00:00:53,140
to effectively generate
that data to train off.

26
00:00:53,140 --> 00:00:54,240
So you skew the image and

27
00:00:54,240 --> 00:00:55,905
just toss that into
the training set.

28
00:00:55,905 --> 00:00:58,250
But there's an important trick
to how you do this in

29
00:00:58,250 --> 00:01:00,755
TensorFlow as well, so you
don't take an image,

30
00:01:00,755 --> 00:01:02,075
warp it, skew it,

31
00:01:02,075 --> 00:01:04,160
and then blow up the
memory requirements.

32
00:01:04,160 --> 00:01:06,290
So TensorFlow makes it
really easy to do this.

33
00:01:06,290 --> 00:01:07,790
Yes. So you will
learn a lot about

34
00:01:07,790 --> 00:01:10,355
the image generator,
the ImageDataGenerator,

35
00:01:10,355 --> 00:01:12,350
where the idea is that
you're not going to

36
00:01:12,350 --> 00:01:14,525
edit the images
directly on the drive.

37
00:01:14,525 --> 00:01:16,540
As they get flowed
from the directory,

38
00:01:16,540 --> 00:01:18,470
then the augmentation
will take place in

39
00:01:18,470 --> 00:01:19,640
memory as they're being loaded

40
00:01:19,640 --> 00:01:21,230
into the neural network
for training.

41
00:01:21,230 --> 00:01:23,420
So if you're dealing with
a dataset and you want

42
00:01:23,420 --> 00:01:25,520
to experiment with
different augmentations,

43
00:01:25,520 --> 00:01:26,905
you're not overwriting the data.

44
00:01:26,905 --> 00:01:29,000
So the image data
generator library lets

45
00:01:29,000 --> 00:01:31,715
you load it into memory
and just in memory,

46
00:01:31,715 --> 00:01:34,190
process the images and
then stream that to

47
00:01:34,190 --> 00:01:35,810
the training set for
the neural network

48
00:01:35,810 --> 00:01:37,295
that will ultimately learn from it.

49
00:01:37,295 --> 00:01:39,350
This is one of
the most important tricks

50
00:01:39,350 --> 00:01:41,585
that the deep learning
[inaudible] realizes,

51
00:01:41,585 --> 00:01:43,220
really the preferred way

52
00:01:43,220 --> 00:01:45,455
these days to do
image augmentation.

53
00:01:45,455 --> 00:01:47,030
Yeah, and I think it's

54
00:01:47,030 --> 00:01:48,560
for the main reason that it's

55
00:01:48,560 --> 00:01:50,090
not impacting your data, right,

56
00:01:50,090 --> 00:01:52,490
you're not overwriting your data
because you may need to

57
00:01:52,490 --> 00:01:55,400
experiment with that data again
and that kind of thing.

58
00:01:55,400 --> 00:01:56,830
It's also nice and fast.

59
00:01:56,830 --> 00:01:58,620
It doesn't blow up
your memory requirements.

60
00:01:58,620 --> 00:01:59,630
You can take one image and

61
00:01:59,630 --> 00:02:01,460
create a lot of
other images from it,

62
00:02:01,460 --> 00:02:02,480
but you don't want to save all

63
00:02:02,480 --> 00:02:04,430
those other images onto disk.

64
00:02:04,430 --> 00:02:06,140
Remember, we had a conversation

65
00:02:06,140 --> 00:02:07,520
recently about the lack of,

66
00:02:07,520 --> 00:02:09,530
there's not a lot of literature on

67
00:02:09,530 --> 00:02:12,170
this topic so there's
opportunity to learn.

68
00:02:12,170 --> 00:02:14,915
Yeah. One of the things
about data augmentation and

69
00:02:14,915 --> 00:02:17,600
image augmentation is
that so many people do it,

70
00:02:17,600 --> 00:02:19,160
it's such an important part

71
00:02:19,160 --> 00:02:20,805
of how we train neural networks.

72
00:02:20,805 --> 00:02:23,795
Yet, at least today, the academic
literature on it is

73
00:02:23,795 --> 00:02:26,810
thinner relative to
what one might guess,

74
00:02:26,810 --> 00:02:28,670
given this importance,
but this is

75
00:02:28,670 --> 00:02:31,580
definitely one of the techniques
you should learn.

76
00:02:31,580 --> 00:02:34,670
So please dive into
this week's materials

77
00:02:34,670 --> 00:02:38,760
to learn about image augmentation
and data augmentation.