1
00:00:00,000 --> 00:00:06,780
The Fashion MNIST data set was
created by [inaudible] and [inaudible].

2
00:00:06,780 --> 00:00:10,215
I think it's really cool
that you're already able to

3
00:00:10,215 --> 00:00:12,030
implement a neural network to do

4
00:00:12,030 --> 00:00:14,250
this fashion classification task.

5
00:00:14,250 --> 00:00:15,660
It's just amazing that

6
00:00:15,660 --> 00:00:17,610
large data sets like
this are readily

7
00:00:17,610 --> 00:00:19,230
available to students
so that they can

8
00:00:19,230 --> 00:00:21,710
learn, and it makes it
really easy to learn.

9
00:00:21,710 --> 00:00:24,090
And in this case we saw with
just a few lines of code,

10
00:00:24,090 --> 00:00:26,310
we were able to build a DNN that

11
00:00:26,310 --> 00:00:28,620
allowed you to do
this classification of clothing

12
00:00:28,620 --> 00:00:30,840
and we got reasonable
accuracy with it but it

13
00:00:30,840 --> 00:00:31,980
was a little bit of a naive

14
00:00:31,980 --> 00:00:33,705
algorithm that we used, right?

15
00:00:33,705 --> 00:00:36,150
We're looking at every
pixel in every image,

16
00:00:36,150 --> 00:00:37,470
but maybe there's ways
that we can make it

17
00:00:37,470 --> 00:00:39,030
better, by maybe
looking at features of

18
00:00:39,030 --> 00:00:40,500
what makes a shoe a
shoe and what makes

19
00:00:40,500 --> 00:00:42,465
a handbag a handbag.
What do you think?

20
00:00:42,465 --> 00:00:44,525
Yeah. So one of
the ideas that make

21
00:00:44,525 --> 00:00:46,820
these neural networks
work much better is to

22
00:00:46,820 --> 00:00:49,100
use convolutional
neural networks,

23
00:00:49,100 --> 00:00:51,740
where instead of looking at
every single pixel and saying,

24
00:00:51,740 --> 00:00:53,900
"Oh, that pixel has value 87,

25
00:00:53,900 --> 00:00:55,520
that has value 127."

26
00:00:55,520 --> 00:00:57,830
So is this a shoe or
is this a handbag?

27
00:00:57,830 --> 00:00:58,625
I don't know.

28
00:00:58,625 --> 00:01:00,170
But instead you can look
at a picture and say,

29
00:01:00,170 --> 00:01:02,030
"Oh, I see shoelaces and a sole."

30
00:01:02,030 --> 00:01:03,410
Then it's probably a shoe. Or say,

31
00:01:03,410 --> 00:01:07,340
"I see a handle and
a rectangular bag beneath that."

32
00:01:07,340 --> 00:01:10,040
Probably a handbag. So
convolutions, hopefully,

33
00:01:10,040 --> 00:01:11,855
will let the students do this.

34
00:01:11,855 --> 00:01:14,150
Sure, what's really
interesting about

35
00:01:14,150 --> 00:01:15,380
convolutions is they sound

36
00:01:15,380 --> 00:01:17,180
complicated but they're
actually quite straightforward,

37
00:01:17,180 --> 00:01:17,390
right?

38
00:01:17,390 --> 00:01:19,250
It's a filter that you pass over

39
00:01:19,250 --> 00:01:21,320
an image in the same way as
if you're doing sharpening,

40
00:01:21,320 --> 00:01:22,700
if you've ever done
image processing.

41
00:01:22,700 --> 00:01:24,170
It can spot features

42
00:01:24,170 --> 00:01:26,065
within the image as
you've mentioned.

43
00:01:26,065 --> 00:01:29,620
With the same paradigm
of just data and labels,

44
00:01:29,620 --> 00:01:30,950
we can let a neural network

45
00:01:30,950 --> 00:01:32,955
figure out for itself
that it should look for

46
00:01:32,955 --> 00:01:35,420
shoelaces and soles
or handles and bags, and

47
00:01:35,420 --> 00:01:38,020
just learn how to detect
these things by itself.

48
00:01:38,020 --> 00:01:39,380
So shall we see what impact that

49
00:01:39,380 --> 00:01:40,730
would have on Fashion MNIST?

50
00:01:40,730 --> 00:01:43,580
So in the next video,
you'll learn about

51
00:01:43,580 --> 00:01:46,010
convolutional neural networks
and get to use them

52
00:01:46,010 --> 00:01:49,560
to build a much better
fashion classifier.