﻿1
00:00:02,840 --> 00:00:05,790
‫Now, let us look at another CNN architecture.

2
00:00:06,510 --> 00:00:09,680
‫GoogLeNet was the winner of the 2014 ImageNet challenge (ILSVRC).

3
00:00:10,550 --> 00:00:18,620
‫GoogLeNet introduced one new concept: the Inception module. An Inception module

4
00:00:18,770 --> 00:00:20,000
‫looks something like this.

5
00:00:22,370 --> 00:00:27,200
‫The input to the Inception module is given to four different layers.

6
00:00:27,860 --> 00:00:31,450
‫Three of these layers are convolutional layers.

7
00:00:31,940 --> 00:00:33,500
‫And the fourth one is a max pool layer.

8
00:00:35,750 --> 00:00:40,420
‫If you look at these convolutional layers, they all use one-by-one kernels.

9
00:00:41,120 --> 00:00:44,210
‫That is, the window is the size of a single pixel.

10
00:00:45,140 --> 00:00:50,210
‫So far we have usually used convolutional layers with two-by-two or three-by-three windows.

11
00:00:50,990 --> 00:00:56,570
‫But in the Inception module, you can see that windows of size one by one are used.
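
To make the one-by-one window concrete, here is a minimal sketch in plain Python. It assumes the image is a nested list of height x width x channels, and the names `conv1x1`, `image` and `weights` are made up for this illustration. Each output pixel is just a weighted sum of the channels of the same input pixel, so the height and width stay the same and only the number of channels changes.

```python
def conv1x1(image, weights):
    """Apply a 1x1 convolution with stride 1 and no bias.

    image:   H x W x C_in nested lists
    weights: C_out x C_in nested lists (one row of weights per output channel)
    returns: H x W x C_out nested lists
    """
    return [
        [
            # Each filter looks at a single pixel and mixes its channels.
            [sum(w * x for w, x in zip(filt, pixel)) for filt in weights]
            for pixel in row
        ]
        for row in image
    ]


# A tiny 2 x 2 image with 3 channels per pixel (illustrative values).
image = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [0, 1, 0]],
]

# Two hypothetical filters: the first copies channel 0, the second sums all channels.
weights = [
    [1, 0, 0],
    [1, 1, 1],
]

output = conv1x1(image, weights)
print(output)  # 2 x 2 image, now with 2 channels per pixel
```

In this sketch the 3-channel input becomes a 2-channel output, which is one common reason for using one-by-one windows: they change the depth without touching the spatial size.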

12
00:00:57,530 --> 00:00:59,490
‫This 1 represents the stride.

13
00:00:59,660 --> 00:01:04,240
‫So it has a stride of one. The output of these two convolutional layers

14
00:01:04,850 --> 00:01:09,080
‫and this max pooling layer then goes into three different convolutional layers.

15
00:01:10,100 --> 00:01:15,340
‫The output of these four is then put into a depth concatenator.

16
00:01:16,250 --> 00:01:18,500
‫We will not be discussing the depth concatenator here.

17
00:01:19,880 --> 00:01:22,730
‫This whole thing is called an inception module.
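
As a rough sketch of how the four branches combine, the plain Python below tracks only the channel counts and the number of parameters of one Inception module. The kernel sizes of the second-stage convolutions (3x3 and 5x5) and the channel counts in the example call are assumptions, not values stated in this lecture, and the function names are made up for illustration. Because every layer uses a stride of one with "same" padding, the height and width do not change, so the depth concatenation simply adds up the channels of the four branches.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + c_out


def inception_module(c_in, n1x1, n3x3_reduce, n3x3, n5x5_reduce, n5x5, pool_proj):
    """Return (output_channels, parameter_count) for one Inception module.

    Height and width are unchanged (stride 1, 'same' padding assumed),
    so only the channel dimension is tracked here.
    """
    branches = [
        # Branch 1: a single 1x1 convolution.
        (n1x1, conv_params(1, c_in, n1x1)),
        # Branch 2: 1x1 convolution, then an (assumed) 3x3 convolution.
        (n3x3, conv_params(1, c_in, n3x3_reduce) + conv_params(3, n3x3_reduce, n3x3)),
        # Branch 3: 1x1 convolution, then an (assumed) 5x5 convolution.
        (n5x5, conv_params(1, c_in, n5x5_reduce) + conv_params(5, n5x5_reduce, n5x5)),
        # Branch 4: max pooling (no parameters), then a 1x1 convolution.
        (pool_proj, conv_params(1, c_in, pool_proj)),
    ]
    out_channels = sum(channels for channels, _ in branches)  # depth concatenation
    params = sum(count for _, count in branches)
    return out_channels, params


# Illustrative channel counts for a module with a 192-channel input.
out_channels, params = inception_module(192, 64, 96, 128, 16, 32, 32)
print(out_channels, params)
```

The one-by-one convolutions in branches 2 and 3 shrink the number of channels before the larger windows are applied, which is what keeps the parameter count of the whole module small.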

18
00:01:24,380 --> 00:01:32,660
‫And the actual architecture of GoogLeNet was something like this. The input images are fed in through here.

19
00:01:32,810 --> 00:01:39,500
‫Then there was a convolutional layer, a max pool layer, a local response normalization layer, two convolutional layers

20
00:01:39,500 --> 00:01:40,070
‫and so on.

21
00:01:40,850 --> 00:01:44,090
‫All of these are stacked inception modules.

22
00:01:44,330 --> 00:01:45,980
‫So this is the Inception layer.

23
00:01:46,100 --> 00:01:47,420
‫This is also an Inception layer.

24
00:01:48,530 --> 00:01:55,010
‫There are many of these Inception layers, and the output of one set goes into another set of Inception layers.

25
00:01:55,490 --> 00:01:58,840
‫And then we finally have a fully connected neural network.

26
00:02:01,740 --> 00:02:10,380
‫So if we look at it, it is a very complex network, although it has relatively few trainable parameters.

27
00:02:10,380 --> 00:02:16,140
‫For most working professionals or students in the field of data science and machine learning,

28
00:02:16,590 --> 00:02:21,660
‫creating and training such architectures on their own machines is not practical.

29
00:02:22,650 --> 00:02:31,620
‫So one of the good things that comes with the Keras library is that we are able to use these

30
00:02:31,630 --> 00:02:35,440
‫pre-trained models for our own problems.

31
00:02:38,760 --> 00:02:43,350
‫These architectures were created to solve one particular problem.

32
00:02:44,300 --> 00:02:49,360
‫But the library allows us to use these architectures for other problems

33
00:02:49,380 --> 00:02:51,660
‫as well. Let us see how.

