1
00:00:00,000 --> 00:00:02,430
In the previous video,
you saw how to build

2
00:00:02,430 --> 00:00:04,120
a convolutional
neural network that

3
00:00:04,120 --> 00:00:06,240
classified horses against humans.

4
00:00:06,240 --> 00:00:08,130
When you were done, you then ran

5
00:00:08,130 --> 00:00:09,450
a few tests using images

6
00:00:09,450 --> 00:00:10,950
that you downloaded from the web.

7
00:00:10,950 --> 00:00:12,810
In this video, you'll see how you

8
00:00:12,810 --> 00:00:14,400
can build validation into

9
00:00:14,400 --> 00:00:15,750
the training loop by

10
00:00:15,750 --> 00:00:18,210
specifying a set of
validation images,

11
00:00:18,210 --> 00:00:19,620
and then have TensorFlow do

12
00:00:19,620 --> 00:00:21,060
the heavy lifting of measuring

13
00:00:21,060 --> 00:00:23,085
its effectiveness with that set.

14
00:00:23,085 --> 00:00:25,619
As before, we download
the dataset,

15
00:00:25,619 --> 00:00:27,150
but now we'll also download

16
00:00:27,150 --> 00:00:29,265
the separate validation dataset.

17
00:00:29,265 --> 00:00:31,440
We'll unzip into
two separate folders,

18
00:00:31,440 --> 00:00:33,675
one for training,
one for validation.

19
00:00:33,675 --> 00:00:35,880
We'll create some
variables that point to

20
00:00:35,880 --> 00:00:38,195
our training and
validation subdirectories,

21
00:00:38,195 --> 00:00:40,220
and we can check
out the filenames.

22
00:00:40,220 --> 00:00:42,140
Remember that
the filenames may not

23
00:00:42,140 --> 00:00:44,045
always be reliable for labels.

24
00:00:44,045 --> 00:00:46,730
For example, here
the validation horse images

25
00:00:46,730 --> 00:00:49,550
aren't named as such
while the human ones are.

26
00:00:49,550 --> 00:00:51,380
We can also do a quick check

27
00:00:51,380 --> 00:00:53,140
on whether we got all the data,

28
00:00:53,140 --> 00:00:56,365
and it looks good, so we
think we can proceed.

29
00:00:56,365 --> 00:00:58,680
We can display some of
the training images

30
00:00:58,680 --> 00:00:59,735
as we did before,

31
00:00:59,735 --> 00:01:02,450
and let's just go
straight to our model.

32
00:01:02,450 --> 00:01:04,700
Here we can import TensorFlow,

33
00:01:04,700 --> 00:01:07,220
and here we define
the layers in our model.

34
00:01:07,220 --> 00:01:09,245
It's exactly the
same as last time.

35
00:01:09,245 --> 00:01:11,510
We'll then print
the summary of our model,

36
00:01:11,510 --> 00:01:13,730
and you can see that it
hasn't changed either.

37
00:01:13,730 --> 00:01:15,650
Then we'll compile the model with

38
00:01:15,650 --> 00:01:17,890
the same parameters as before.

39
00:01:17,890 --> 00:01:20,600
Now, here's where we
can make some changes.

40
00:01:20,600 --> 00:01:23,629
As well as an image generator
for the training data,

41
00:01:23,629 --> 00:01:26,615
we now create a second one
for the validation data.

42
00:01:26,615 --> 00:01:28,505
It's pretty much the same flow.

43
00:01:28,505 --> 00:01:30,665
We create a validation generator

44
00:01:30,665 --> 00:01:32,770
as an instance of
image generator,

45
00:01:32,770 --> 00:01:34,765
rescale it to normalize,

46
00:01:34,765 --> 00:01:37,505
and then point it at
the validation directory.

47
00:01:37,505 --> 00:01:39,740
When we run it, we
see that it picks up

48
00:01:39,740 --> 00:01:42,950
the images and the classes
from that directory.

49
00:01:42,950 --> 00:01:45,185
So now let's train the network.

50
00:01:45,185 --> 00:01:47,000
Note the extra parameters to

51
00:01:47,000 --> 00:01:49,205
let it know about
the validation data.

52
00:01:49,205 --> 00:01:52,250
Now, at the end of
every epoch, as well as

53
00:01:52,250 --> 00:01:55,010
reporting the loss and
accuracy on the training set,

54
00:01:55,010 --> 00:01:56,990
it also checks the validation

55
00:01:56,990 --> 00:01:59,710
set to give us loss
and accuracy there.

56
00:01:59,710 --> 00:02:01,570
As the epochs progress,

57
00:02:01,570 --> 00:02:04,460
you should see the accuracy
steadily increasing, with

58
00:02:04,460 --> 00:02:06,170
the validation accuracy being

59
00:02:06,170 --> 00:02:08,225
slightly less than the training.

60
00:02:08,225 --> 00:02:11,540
It should just take about
another two minutes.

61
00:02:11,540 --> 00:02:15,125
Okay. Now that we've
reached epoch 15,

62
00:02:15,125 --> 00:02:16,970
we can see that our accuracy is

63
00:02:16,970 --> 00:02:19,475
about 97 percent on
the training data,

64
00:02:19,475 --> 00:02:22,595
and about 85 percent
on the validation set,

65
00:02:22,595 --> 00:02:24,275
and this is as expected.

66
00:02:24,275 --> 00:02:26,270
The validation set is data that

67
00:02:26,270 --> 00:02:29,030
the neural network
hasn't previously seen,

68
00:02:29,030 --> 00:02:32,300
so you would expect it to
perform a little worse on it.

69
00:02:32,300 --> 00:02:33,860
But let's try some more images

70
00:02:33,860 --> 00:02:36,330
starting with this white horse.

71
00:02:42,500 --> 00:02:45,935
We can see that it was
misclassified as a human.

72
00:02:45,935 --> 00:02:48,840
Okay, let's try this
really cute one.

73
00:02:54,770 --> 00:02:58,535
We can see that's correctly
classified as a horse.

74
00:02:58,535 --> 00:03:01,050
Okay, let's try some people.

75
00:03:04,120 --> 00:03:07,115
Let's try this woman
in a blue dress.

76
00:03:07,115 --> 00:03:08,810
This is a really
interesting picture

77
00:03:08,810 --> 00:03:10,820
because she has her back turned,

78
00:03:10,820 --> 00:03:12,965
and her legs are
obscured by the dress,

79
00:03:12,965 --> 00:03:16,480
but she's correctly
classified as a human.

80
00:03:16,480 --> 00:03:18,495
Okay, here's a tricky one.

81
00:03:18,495 --> 00:03:20,590
To our eyes she's human,

82
00:03:20,590 --> 00:03:24,040
but will the wings confuse
the neural network?

83
00:03:28,490 --> 00:03:31,905
And they do. She is
mistaken for a horse.

84
00:03:31,905 --> 00:03:34,220
It's understandable,
though, particularly as

85
00:03:34,220 --> 00:03:35,780
the training set has a lot of

86
00:03:35,780 --> 00:03:38,760
white horses against
a grassy background.

87
00:03:40,850 --> 00:03:43,250
How about this one? It has

88
00:03:43,250 --> 00:03:46,140
both a horse and a human in it,

89
00:03:48,800 --> 00:03:51,720
but it gets classified
as a horse.

90
00:03:51,720 --> 00:03:53,540
We can see the dominant features

91
00:03:53,540 --> 00:03:54,740
in the image are from the horse,

92
00:03:54,740 --> 00:03:56,540
so it's not really surprising.

93
00:03:56,540 --> 00:03:59,390
Also there are many white horses
in the training set,

94
00:03:59,390 --> 00:04:01,340
so it might be matching on them.

95
00:04:01,340 --> 00:04:06,650
Okay, one last one. I couldn't

96
00:04:06,650 --> 00:04:09,660
resist this image as
it's so adorable,

97
00:04:11,590 --> 00:04:15,390
and thankfully it's
classified as a horse.

98
00:04:16,310 --> 00:04:20,210
So, now we've seen training
with a validation set,

99
00:04:20,210 --> 00:04:22,580
and how we can get a good estimate
of the accuracy of

100
00:04:22,580 --> 00:04:24,380
the classifier by looking at

101
00:04:24,380 --> 00:04:26,450
the results with
a validation set.

102
00:04:26,450 --> 00:04:29,090
Using these results and
understanding where

103
00:04:29,090 --> 00:04:31,460
and why some inferences fail

104
00:04:31,460 --> 00:04:33,680
can help you understand
how to modify

105
00:04:33,680 --> 00:04:36,535
your training data to
prevent errors like that.

106
00:04:36,535 --> 00:04:38,730
But let's switch gears
in the next video.

107
00:04:38,730 --> 00:04:40,220
We'll take a look
at the impact of

108
00:04:40,220 --> 00:04:43,590
compacting your data to
make training quicker.