WEBVTT

1
00:00:00.000 --> 00:00:02.880
<v ->Hey, I'm gonna walk you through what ChatGPT is.</v>

2
00:00:02.880 --> 00:00:06.690
This is the original AI model that everyone knows and loves.

3
00:00:06.690 --> 00:00:07.770
So you've probably heard of it,

4
00:00:07.770 --> 00:00:09.480
but we're gonna walk you through it anyway.

5
00:00:09.480 --> 00:00:11.910
What would be the next word in this sentence?

6
00:00:11.910 --> 00:00:15.090
The students open their what?

7
00:00:15.090 --> 00:00:16.110
It could be books.

8
00:00:16.110 --> 00:00:17.370
They could open their laptops,

9
00:00:17.370 --> 00:00:19.350
they could open their exams, their minds.

10
00:00:19.350 --> 00:00:20.610
There's a lot of different words

11
00:00:20.610 --> 00:00:23.430
that could come next in that sentence.

12
00:00:23.430 --> 00:00:26.250
And we can predict what word might come next

13
00:00:26.250 --> 00:00:29.670
based on the context in which we're seeing the sentence,

14
00:00:29.670 --> 00:00:31.560
and so can ChatGPT,

15
00:00:31.560 --> 00:00:34.290
and that is essentially how these models work.

16
00:00:34.290 --> 00:00:36.630
They have figured out the probability

17
00:00:36.630 --> 00:00:40.440
of what word comes next based on all of the different words

18
00:00:40.440 --> 00:00:42.360
that they've seen on the internet.

19
00:00:42.360 --> 00:00:44.850
And you know, without getting too technical,

20
00:00:44.850 --> 00:00:47.550
they have those probabilities baked into the model

21
00:00:47.550 --> 00:00:50.040
and they can decide based on the context,

22
00:00:50.040 --> 00:00:51.960
based on the prompt that you give them,

23
00:00:51.960 --> 00:00:53.250
what word comes next.
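
NOTE
A rough illustration of the next-word idea described above, not how ChatGPT is actually built.
It counts which word follows a context in a tiny made-up corpus and turns the counts into probabilities.
The corpus and the function name next_word_probabilities are assumptions for this sketch.
```python
from collections import Counter
# Toy corpus, made up for illustration; a real model learns from far more text.
corpus = [
    "the students open their books",
    "the students open their laptops",
    "the students open their books",
    "the students open their exams",
]
def next_word_probabilities(context: str, sentences: list[str]) -> dict[str, float]:
    """Estimate P(next word | context) by counting what follows the context."""
    context_words = context.split()
    n = len(context_words)
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - n):
            if words[i:i + n] == context_words:
                counts[words[i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}
probs = next_word_probabilities("the students open their", corpus)
print(probs)
```
With this toy corpus, "books" should come out as the most likely next word.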

24
00:00:53.250 --> 00:00:56.070
And they specifically use tokens rather than words.

25
00:00:56.070 --> 00:00:59.130
A token is about three-fourths of a word on average.

26
00:00:59.130 --> 00:01:01.350
There are some differences in how they split those words up.
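
NOTE
A quick worked example of the three-fourths rule of thumb mentioned above.
This is only an estimate from a whitespace word count, not a real tokenizer.
The name estimate_tokens and the constant WORDS_PER_TOKEN are assumptions for this sketch.
```python
# Rule of thumb from the talk: one token is about 3/4 of a word on average.
WORDS_PER_TOKEN = 0.75
def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count."""
    word_count = len(text.split())
    return round(word_count / WORDS_PER_TOKEN)
print(estimate_tokens("The students open their books"))
```
Real tokenizers split text differently, so actual counts can be higher or lower than this estimate.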

27
00:01:01.350 --> 00:01:03.090
But the model is trained

28
00:01:03.090 --> 00:01:05.310
until it can predict the next token.

29
00:01:05.310 --> 00:01:08.790
And then when it knows on average what the right tokens are

30
00:01:08.790 --> 00:01:10.980
to come next in the sequence,

31
00:01:10.980 --> 00:01:12.390
then you can give it a new sequence

32
00:01:12.390 --> 00:01:14.760
and figure out what comes next after that.

33
00:01:14.760 --> 00:01:16.650
And through understanding that,

34
00:01:16.650 --> 00:01:18.540
through being able to predict the next token,

35
00:01:18.540 --> 00:01:21.480
you have to understand quite a lot about how the world works

36
00:01:21.480 --> 00:01:23.760
because you need to know how things are associated

37
00:01:23.760 --> 00:01:25.620
and therefore we get the magic

38
00:01:25.620 --> 00:01:27.150
that is a large language model.

39
00:01:27.150 --> 00:01:29.970
ChatGPT specifically is based on GPT-4,

40
00:01:29.970 --> 00:01:32.250
which is a transformer model.

41
00:01:32.250 --> 00:01:33.450
Transformer models

42
00:01:33.450 --> 00:01:35.970
come from the "Attention Is All You Need" paper,

43
00:01:35.970 --> 00:01:37.080
which is a famous paper,

44
00:01:37.080 --> 00:01:39.240
but I'm not gonna run you through too much

45
00:01:39.240 --> 00:01:40.560
of the jargon here.

46
00:01:40.560 --> 00:01:41.817
It won't mean too much to you.

47
00:01:41.817 --> 00:01:44.437
What you need to know is that this is,

48
00:01:44.437 --> 00:01:46.950
you know, the current kind of state of the art

49
00:01:46.950 --> 00:01:48.480
of this type of model.

50
00:01:48.480 --> 00:01:50.580
We don't have another type of model

51
00:01:50.580 --> 00:01:53.460
other than transformer models that is doing as well.

52
00:01:53.460 --> 00:01:55.110
But you know, scientists are working

53
00:01:55.110 --> 00:01:57.060
on other types of architecture as well.

54
00:01:57.060 --> 00:01:58.740
So how did they train ChatGPT?

55
00:01:58.740 --> 00:02:00.510
Well, it wasn't just the transformer model.

56
00:02:00.510 --> 00:02:02.340
That was pretty good on its own

57
00:02:02.340 --> 00:02:04.110
and that's where we had GPT-3.

58
00:02:04.110 --> 00:02:07.650
But to get ChatGPT to actually kind of make it behave

59
00:02:07.650 --> 00:02:09.900
like an assistant, we had to have additional steps.

60
00:02:09.900 --> 00:02:12.750
So after training the initial, you know,

61
00:02:12.750 --> 00:02:15.960
predict the next token model, the foundation model,

62
00:02:15.960 --> 00:02:19.110
then they had to do some fine tuning on top.

63
00:02:19.110 --> 00:02:21.570
They would have humans figure out

64
00:02:21.570 --> 00:02:25.650
which outputs were best and worst from different questions,

65
00:02:25.650 --> 00:02:27.810
and then they would use that to reward the model.

66
00:02:27.810 --> 00:02:30.720
So they would predict tokens that, you know,

67
00:02:30.720 --> 00:02:33.360
humans thought were good answers to questions.

68
00:02:33.360 --> 00:02:37.530
Then they had to add a reward model and a policy optimization step,

69
00:02:37.530 --> 00:02:39.300
PPO reinforcement learning.

70
00:02:39.300 --> 00:02:41.190
And what that would allow you to do

71
00:02:41.190 --> 00:02:43.260
is take a prompt or a sample from the dataset,

72
00:02:43.260 --> 00:02:46.950
use that PPO model to generate the output,

73
00:02:46.950 --> 00:02:50.370
and then the reward model calculates a reward for the output

74
00:02:50.370 --> 00:02:53.490
and the reward is used to update the policy using PPO.
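
NOTE
A minimal sketch of the kind of update signal PPO uses, assuming the standard clipped surrogate objective.
It is not OpenAI's training code. The probabilities and the advantage value are made up.
Here the advantage stands in for how much better the reward was than expected.
```python
def ppo_clipped_objective(new_prob: float, old_prob: float,
                          advantage: float, clip_eps: float = 0.2) -> float:
    """Clipped surrogate objective for one sampled output."""
    # How much more (or less) likely the updated policy makes this output.
    ratio = new_prob / old_prob
    # Keep the ratio within [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    # Take the more pessimistic of the two estimates.
    return min(ratio * advantage, clipped_ratio * advantage)
print(ppo_clipped_objective(new_prob=0.6, old_prob=0.4, advantage=1.0))
print(ppo_clipped_objective(new_prob=0.6, old_prob=0.4, advantage=-1.0))
```
With a positive advantage, the gain is capped once the ratio leaves the clip range, which keeps each policy update small.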

75
00:02:53.490 --> 00:02:55.500
So this is kind of an additional set

76
00:02:55.500 --> 00:02:57.600
of reinforcement learning on top.

77
00:02:57.600 --> 00:03:00.480
Again, technical jargon, but all that means really

78
00:03:00.480 --> 00:03:03.120
is that you have the transformer model

79
00:03:03.120 --> 00:03:04.650
that's trained on the internet,

80
00:03:04.650 --> 00:03:08.520
then you fine tune it based on what people find helpful

81
00:03:08.520 --> 00:03:10.980
and useful in terms of responses,

82
00:03:10.980 --> 00:03:13.320
and then you optimize it based on policies

83
00:03:13.320 --> 00:03:15.450
so that when you get the results,

84
00:03:15.450 --> 00:03:18.840
you can be sure that they're much better on average

85
00:03:18.840 --> 00:03:20.010
than what you would get

86
00:03:20.010 --> 00:03:22.020
if you just used the foundation model.

87
00:03:22.020 --> 00:03:23.730
It was created by OpenAI.

88
00:03:23.730 --> 00:03:27.810
You can access it at chat.openai.com or chatgpt.com.

89
00:03:27.810 --> 00:03:30.270
They have a free version that you can try out

90
00:03:30.270 --> 00:03:32.940
and the free version is pretty comprehensive and generous.

91
00:03:32.940 --> 00:03:35.040
But you can also buy the Plus or Pro version.

92
00:03:35.040 --> 00:03:37.380
The Pro version's $200 a month currently,

93
00:03:37.380 --> 00:03:39.840
and you get some additional powerful models for that.

94
00:03:39.840 --> 00:03:42.150
The features are numerous,

95
00:03:42.150 --> 00:03:43.950
and we're not gonna cover all of them here,

96
00:03:43.950 --> 00:03:47.370
but the main things that we find are useful day-to-day

97
00:03:47.370 --> 00:03:48.480
are the chat history.

98
00:03:48.480 --> 00:03:50.640
So being able to see what are the, you know,

99
00:03:50.640 --> 00:03:52.590
previous chats that you've had,

100
00:03:52.590 --> 00:03:54.030
just the ability to generate text

101
00:03:54.030 --> 00:03:56.850
so you can ask it for a blog post

102
00:03:56.850 --> 00:03:58.890
or explain how the solar system was made

103
00:03:58.890 --> 00:04:00.960
or whatever it is you're trying to generate.

104
00:04:00.960 --> 00:04:05.280
And then also image generation is built into ChatGPT

105
00:04:05.280 --> 00:04:06.113
through DALL-E 3.

106
00:04:06.113 --> 00:04:07.620
It has the ability to browse the web

107
00:04:07.620 --> 00:04:09.420
so you can get up-to-date information,

108
00:04:09.420 --> 00:04:10.920
which is really powerful.

109
00:04:10.920 --> 00:04:13.260
The models themselves are a little bit confusing.

110
00:04:13.260 --> 00:04:14.370
There's lots of different models

111
00:04:14.370 --> 00:04:15.930
and they change all the time.

112
00:04:15.930 --> 00:04:18.690
But broadly speaking, we have GPT-4o

113
00:04:18.690 --> 00:04:19.890
which is the workhorse.

114
00:04:19.890 --> 00:04:23.160
You know, the really good, useful and scalable model

115
00:04:23.160 --> 00:04:25.290
that most people use for most questions.

116
00:04:25.290 --> 00:04:27.600
We have some of the research preview-type models

117
00:04:27.600 --> 00:04:30.540
like GPT-4.5, which is a little bit more creative.

118
00:04:30.540 --> 00:04:32.250
And then you have some of the reasoning models

119
00:04:32.250 --> 00:04:35.160
like o1, o3-mini, o1 pro mode,

120
00:04:35.160 --> 00:04:36.930
and these are the ones where they think first

121
00:04:36.930 --> 00:04:38.520
before giving you a response,

122
00:04:38.520 --> 00:04:41.130
so they're pretty good for mathematical questions

123
00:04:41.130 --> 00:04:43.050
or things where you need a lot of logic.

124
00:04:43.050 --> 00:04:44.460
There's also custom GPTs,

125
00:04:44.460 --> 00:04:46.350
and you can think of these as custom prompts

126
00:04:46.350 --> 00:04:47.850
where they've given

127
00:04:47.850 --> 00:04:51.510
these versions of ChatGPT a set of tools they can use.

128
00:04:51.510 --> 00:04:53.520
Maybe with the Expedia one, you could book a trip,

129
00:04:53.520 --> 00:04:56.520
but also they have a certain set of instructions

130
00:04:56.520 --> 00:04:57.900
that tells 'em how to behave.

131
00:04:57.900 --> 00:04:59.970
Things I use ChatGPT for most,

132
00:04:59.970 --> 00:05:01.410
I would say creating blog posts.

133
00:05:01.410 --> 00:05:04.770
They have a Canvas editor now, which is quite useful.

134
00:05:04.770 --> 00:05:06.000
Creating social media posts,

135
00:05:06.000 --> 00:05:08.370
particularly if you wanna create them in a specific style.

136
00:05:08.370 --> 00:05:11.040
It's very good at adhering to specific styles,

137
00:05:11.040 --> 00:05:12.900
and then automating tasks as well.

138
00:05:12.900 --> 00:05:14.760
Whenever I need to do something formal,

139
00:05:14.760 --> 00:05:17.400
fill in documentation, or do my taxes

140
00:05:17.400 --> 00:05:19.050
or appeal a parking ticket,

141
00:05:19.050 --> 00:05:22.053
then I tend to use ChatGPT for that.