1
00:00:11,600 --> 00:00:16,580
In this section of the course, we are going to introduce the idea of artificial neural networks or

2
00:00:16,580 --> 00:00:17,660
ANNs for short.

3
00:00:18,500 --> 00:00:23,690
Specifically, we're going to talk about a certain kind of artificial neural network called a feedforward

4
00:00:23,690 --> 00:00:24,440
neural network.

5
00:00:25,010 --> 00:00:27,190
This is the most basic kind of neural network.

6
00:00:27,200 --> 00:00:33,110
But as you'll see, the concepts involved go quite deep, and they also form the basis for other kinds

7
00:00:33,110 --> 00:00:37,670
of neural networks, such as convolutional neural networks and recurrent neural networks.

8
00:00:42,860 --> 00:00:48,110
To begin this discussion, let's start with why and how neural networks came to be in the first place.

9
00:00:49,650 --> 00:00:53,940
This is something that a lot of teachers skip over, but it's something that I find quite interesting

10
00:00:53,940 --> 00:00:54,990
and inspirational.

11
00:00:55,500 --> 00:01:00,750
If you think neural networks are a cool new model to help you pick stocks or play Mario Kart, then

12
00:01:00,750 --> 00:01:02,640
you are still looking for the small fish.

13
00:01:03,030 --> 00:01:05,790
In fact, neural networks are way more interesting than that.

14
00:01:10,870 --> 00:01:15,970
As you may have realized, the name artificial neural network means that we are trying to artificially

15
00:01:15,970 --> 00:01:18,250
create a neural network in a computer.

16
00:01:18,970 --> 00:01:23,290
OK, so then what's a real neural network, as in a non-artificial neural network?

17
00:01:24,010 --> 00:01:29,110
The name neural network is derived from neurons, which are the cells in your brain and extend throughout

18
00:01:29,110 --> 00:01:30,010
your nervous system.

19
00:01:30,730 --> 00:01:34,510
Now, this is probably too obvious for most of you, but let's state it anyway.

20
00:01:34,510 --> 00:01:40,990
Just in case: your brain is what you use to think. Neurons in your brain are connected to each other

21
00:01:40,990 --> 00:01:45,100
and can communicate with each other via electrical and chemical signals.

22
00:01:46,510 --> 00:01:53,140
Amazingly, this simple physical and chemical system is what makes you you: all of your thoughts and

23
00:01:53,140 --> 00:01:57,250
aspirations, your emotions and every action you take throughout the day.

24
00:01:57,580 --> 00:02:03,910
It's all driven by your neurons, which in turn are just sending electrical and chemical signals around

25
00:02:03,910 --> 00:02:05,050
amongst themselves.

26
00:02:10,250 --> 00:02:15,530
Once scientists realized what the brain was doing and what it was responsible for, the next question

27
00:02:15,530 --> 00:02:19,190
was almost obvious in hindsight: can we build a brain?

28
00:02:19,880 --> 00:02:25,970
I mean, if the brain is just a network of neurons and we can simulate neurons in a computer, then if we connect

29
00:02:25,970 --> 00:02:31,280
a bunch of neurons through a computer simulation, it seems that it might be possible to create some

30
00:02:31,280 --> 00:02:35,080
form of intelligence, an artificial intelligence, you might say.

31
00:02:40,190 --> 00:02:43,670
So let's take our model of a single neuron: logistic regression.

32
00:02:44,510 --> 00:02:49,910
Now let's imagine that we have multiple neurons all taking in the same inputs, but computing something

33
00:02:49,910 --> 00:02:50,360
different.

34
00:02:51,170 --> 00:02:52,550
Now let's do it again.

35
00:02:53,180 --> 00:02:55,640
Now we have multiple logistic regressions.

36
00:02:56,840 --> 00:03:00,650
Now let's imagine that all these neurons are connected to more neurons.

37
00:03:00,920 --> 00:03:06,530
So we just repeat the process, pretending that the new layer of neurons are actually inputs to more

38
00:03:06,530 --> 00:03:07,160
neurons.

39
00:03:07,820 --> 00:03:12,470
That's basically a very tiny model of the brain: neurons connected to neurons.

40
00:03:13,160 --> 00:03:15,500
Of course, this is necessarily simplistic.

41
00:03:16,100 --> 00:03:18,710
One side is the input and one side is the output.

42
00:03:19,400 --> 00:03:22,140
Of course, the actual brain is much more complex.

43
00:03:22,190 --> 00:03:25,850
There are many inputs and many outputs. In the middle,

44
00:03:25,850 --> 00:03:31,340
wires can crisscross. If we have a later neuron connecting back to an earlier neuron,

45
00:03:31,580 --> 00:03:33,290
we call that a recurrent connection.

46
00:03:34,010 --> 00:03:39,050
The neural networks we are about to discuss in this section contain no such complexities.

47
00:03:39,800 --> 00:03:45,410
Instead, because the input is on one side and the output is on the other side and we go from input

48
00:03:45,410 --> 00:03:50,150
to output in a layer-wise fashion, we call this a feedforward neural network.
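
To make that picture concrete, here is a minimal sketch in plain Python of neurons feeding into more neurons, layer by layer. It assumes each neuron is a logistic regression with a sigmoid output, as described above. The layer sizes, the input values, and the helper name random_layer are made up for illustration, and the weights are random rather than learned.

```python
import math
import random


def sigmoid(z):
    """Logistic function, the nonlinearity inside logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """One neuron: a logistic regression over its inputs."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def layer(inputs, weight_rows, biases):
    """Several neurons that all see the same inputs but use different weights."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def feedforward(inputs, layers):
    """Pass values through each layer in order; nothing ever flows backwards."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
# Assumed sizes: 3 inputs -> 4 hidden neurons -> 1 output neuron.
network = [random_layer(3, 4, rng), random_layer(4, 1, rng)]
output = feedforward([0.5, -1.2, 2.0], network)
print(output)
```

The output of each layer simply becomes the input of the next one, which is all that "feedforward" means here.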

49
00:03:55,350 --> 00:04:00,210
In the rest of this lecture, we are going to outline what we will discuss in this section of the course.

50
00:04:00,930 --> 00:04:04,860
First, we are going to start out again by discussing the model architecture.

51
00:04:05,460 --> 00:04:10,110
As you know, the model we'll be discussing in this section is the feedforward neural network.

52
00:04:10,860 --> 00:04:16,140
The next step, after discussing the model architecture, will be to go back to the geometric picture.

53
00:04:16,740 --> 00:04:18,519
If you recall, my motto goes:

54
00:04:18,540 --> 00:04:21,360
Machine learning is nothing but a geometry problem.

55
00:04:22,019 --> 00:04:28,440
So how do neural networks extend the capabilities of a basic linear model in terms of solving this geometry

56
00:04:28,440 --> 00:04:28,890
problem?

57
00:04:29,940 --> 00:04:33,870
Next, we're going to go more in-depth and discuss activation functions.

58
00:04:35,280 --> 00:04:38,400
Activation functions are very important in neural networks.

59
00:04:38,490 --> 00:04:43,890
They are what make neural networks more expressive than the simple, linear models you saw in the previous

60
00:04:43,890 --> 00:04:44,370
section.
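
As a rough illustration of why activation functions matter, the sketch below stacks two layers with no activation in between and shows that the result is still just one linear model. It uses single numbers instead of vectors to keep things small. The weights and the choice of tanh are arbitrary assumptions for the example, not something specific to this course.

```python
import math


def two_linear_layers(x, w1=2.0, b1=1.0, w2=-3.0, b2=0.5):
    """Two stacked layers with no activation function in between."""
    hidden = w1 * x + b1
    return w2 * hidden + b2


def one_linear_layer(x, w=-6.0, b=-2.5):
    """A single linear model with w = w2 * w1 and b = w2 * b1 + b2."""
    return w * x + b


def two_layers_with_tanh(x, w1=2.0, b1=1.0, w2=-3.0, b2=0.5):
    """The same two layers, but with a tanh activation on the hidden value."""
    hidden = math.tanh(w1 * x + b1)
    return w2 * hidden + b2


xs = [-2.0, -1.0, 0.0, 0.5, 3.0]
# Without an activation, the two-layer model matches a one-layer model everywhere.
collapses = all(abs(two_linear_layers(x) - one_linear_layer(x)) < 1e-12 for x in xs)

# A straight line satisfies f(0) + f(2) == 2 * f(1); the tanh version does not.
gap = abs(two_layers_with_tanh(0.0) + two_layers_with_tanh(2.0)
          - 2 * two_layers_with_tanh(1.0))
print(collapses, gap)
```

So without a nonlinearity, adding layers buys nothing over a linear model. With one, the network can represent functions that are not straight lines.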

61
00:04:45,540 --> 00:04:51,060
After that, we're going to discuss how to do multiclass classification using neural networks.

62
00:04:51,720 --> 00:04:55,980
If you recall, in the previous section we only discussed binary classification.

63
00:04:56,550 --> 00:05:02,760
This works if we only have two classes: dog or cat, fraud or no fraud, purchase or leave the store,

64
00:05:02,760 --> 00:05:03,450
and so on.

65
00:05:04,050 --> 00:05:05,730
But what if we have more classes?

66
00:05:06,240 --> 00:05:11,640
For example, we might be working on a self-driving car that needs to be able to recognize multiple

67
00:05:11,640 --> 00:05:13,170
kinds of objects on the road.

68
00:05:13,890 --> 00:05:19,440
In this case, binary classification is not good enough, and we need multiclass classification.

69
00:05:24,190 --> 00:05:27,640
Now you'll notice that most of what we just discussed was theory.

70
00:05:28,270 --> 00:05:32,410
This is basically optional if you don't care about how neural networks actually work.

71
00:05:33,130 --> 00:05:38,620
After this, we'll move back to the practical world and look at how to do multiclass classification

72
00:05:38,740 --> 00:05:39,730
in TensorFlow.

73
00:05:40,450 --> 00:05:45,580
You'll see that it's pretty similar to the previous section, with just a few small changes in syntax.

74
00:05:46,330 --> 00:05:48,490
But the next part of this section is critical.

75
00:05:49,180 --> 00:05:54,580
As you recall, I said that at some point we would stop using TF-IDF and start using embeddings,

76
00:05:55,000 --> 00:05:57,100
which are more common in the field of deep learning.

77
00:05:57,940 --> 00:06:02,320
We'll look at what an embedding actually is and how we can use them in TensorFlow.

78
00:06:03,430 --> 00:06:08,500
Unfortunately, we won't really get to use embeddings since they only come into play with CNNs and

79
00:06:08,710 --> 00:06:10,990
RNNs, which come later in the course.

80
00:06:11,800 --> 00:06:15,760
However, we will take this opportunity to do an advanced exercise.

81
00:06:16,390 --> 00:06:22,480
In particular, you'll recall that word2vec is a technique which has been used to pretrain word embeddings.

82
00:06:23,110 --> 00:06:28,750
As you recall, word2vec is basically a neural network, which you now understand having been through

83
00:06:28,750 --> 00:06:29,980
this section of the course.

84
00:06:30,580 --> 00:06:34,840
Therefore, you have everything you need in order to implement word2vec.

85
00:06:35,530 --> 00:06:40,780
Now, that doesn't mean it will be easy since, as you'll see, there are some complications that arise,

86
00:06:41,710 --> 00:06:43,630
but it doesn't require new knowledge.

87
00:06:43,930 --> 00:06:46,780
Just a more advanced application of existing knowledge.

88
00:06:47,380 --> 00:06:52,900
As such, this exercise will be designated as advanced, and it will be a nice way to cap off what you've

89
00:06:52,900 --> 00:06:54,190
learned about ANNs.