WEBVTT

1
00:00:01.070 --> 00:00:03.850
Finally now have all the building blocks ready

2
00:00:03.850 --> 00:00:07.610
in order to do the search in the

3
00:00:07.610 --> 00:00:07.890
RAG.

4
00:00:09.230 --> 00:00:11.350
So what we're going to do here is

5
00:00:11.350 --> 00:00:14.050
we're going to build a chat loop where

6
00:00:14.050 --> 00:00:15.490
we can actually use the RAG.

7
00:00:16.170 --> 00:00:17.930
So in order to do that we make

8
00:00:17.930 --> 00:00:21.850
the normal connection, we define the embedding generator

9
00:00:22.590 --> 00:00:25.790
and we need that because when we search

10
00:00:25.790 --> 00:00:27.950
we also need the embedding generator not only

11
00:00:27.950 --> 00:00:31.610
when we ingest data because when we send

12
00:00:31.610 --> 00:00:33.690
it we need to send it as a

13
00:00:33.690 --> 00:00:35.990
vector instead of as text.

14
00:00:37.670 --> 00:00:39.890
The vector store will take care of that

15
00:00:39.890 --> 00:00:43.270
but just so we know and in real

16
00:00:43.270 --> 00:00:46.050
life things like these where you embed make

17
00:00:46.050 --> 00:00:47.850
the generator and so on you'll put in

18
00:00:47.850 --> 00:00:51.410
dependency injection or some helper tool and not

19
00:00:51.410 --> 00:00:52.990
do it all over the place.

20
00:00:53.250 --> 00:00:55.350
This is just so it's easier for you

21
00:00:55.350 --> 00:00:56.910
to understand in this example.

22
00:00:57.850 --> 00:01:01.410
So we define our embedding generator, we make

23
00:01:01.410 --> 00:01:03.930
our vector store which will just connect to

24
00:01:03.930 --> 00:01:07.790
the existing one we have and then we

25
00:01:07.790 --> 00:01:09.290
are gonna get our collection.

26
00:01:11.710 --> 00:01:14.870
So we have our collection to search against

27
00:01:14.870 --> 00:01:16.890
because we're not searching against the vector store

28
00:01:16.890 --> 00:01:22.310
we're always searching against a specific table or

29
00:01:22.310 --> 00:01:25.230
index depending on what it is you have

30
00:01:25.230 --> 00:01:29.050
and then we're making an agent so we're

31
00:01:29.050 --> 00:01:32.130
back to agents and not just vectors and

32
00:01:32.130 --> 00:01:34.890
this is an expert in the company's internal

33
00:01:34.890 --> 00:01:35.670
non-space.

34
00:01:37.790 --> 00:01:39.830
And then we start our normal chat loop.

35
00:01:40.590 --> 00:01:43.030
So now it's waiting for me to answer

36
00:01:43.030 --> 00:01:45.970
set a question and let's ask it in

37
00:01:45.970 --> 00:01:48.070
a slightly different way so we can see

38
00:01:48.070 --> 00:01:48.790
what's going on.

39
00:01:48.890 --> 00:01:56.230
So let's say what is the password to

40
00:01:56.230 --> 00:01:57.530
the internet.

41
00:01:59.460 --> 00:02:02.320
It could be that the specific user don't

42
00:02:02.320 --> 00:02:04.300
know that it's called a wi-fi so

43
00:02:04.300 --> 00:02:05.920
they just want to get onto the internet

44
00:02:05.920 --> 00:02:07.419
and it's asking for a password.

45
00:02:09.400 --> 00:02:17.180
If we do this let's press the right

46
00:02:17.180 --> 00:02:25.200
buttons here and we get our input and

47
00:02:25.200 --> 00:02:26.840
what we're going to do is we're going

48
00:02:26.840 --> 00:02:31.440
to make a similarity search against the collection

49
00:02:31.440 --> 00:02:33.720
and the way we do that is we

50
00:02:33.720 --> 00:02:37.340
tell our collection to do a search async

51
00:02:38.070 --> 00:02:40.000
with the input in our case.

52
00:02:40.720 --> 00:02:43.740
We could have made the AI enhance the

53
00:02:43.740 --> 00:02:47.500
question turn it into keywords and stuff like

54
00:02:47.500 --> 00:02:47.800
that.

55
00:02:48.540 --> 00:02:52.720
That's all the parts of optimising and figuring

56
00:02:52.720 --> 00:02:54.880
out how the best way to search but

57
00:02:54.880 --> 00:02:56.600
in our case we're just doing the crude

58
00:02:56.600 --> 00:03:00.940
and actually giving the vector to find something

59
00:03:00.940 --> 00:03:02.880
similar to what the user have asked.

60
00:03:04.400 --> 00:03:07.080
And what we do is we say among

61
00:03:07.080 --> 00:03:09.940
all the records give us back the top

62
00:03:09.940 --> 00:03:12.060
three most relevant.

63
00:03:13.440 --> 00:03:15.300
In our case when we only have 10

64
00:03:15.300 --> 00:03:17.540
that might be a little much but in

65
00:03:17.540 --> 00:03:20.060
other cases where you have thousand it could

66
00:03:20.060 --> 00:03:24.000
be that you need to get the top

67
00:03:24.000 --> 00:03:25.460
20 or something like that.

68
00:03:26.480 --> 00:03:29.220
But in our case we make the search

69
00:03:29.220 --> 00:03:31.220
so behind the scenes when we do this

70
00:03:31.220 --> 00:03:34.860
search it will actually take this sentence what

71
00:03:34.860 --> 00:03:37.500
is the password to the internet turn it

72
00:03:37.500 --> 00:03:40.160
into a vector and then send that to

73
00:03:40.160 --> 00:03:42.640
the vector store for doing the similarity search

74
00:03:42.640 --> 00:03:45.560
a little like we did in the first

75
00:03:45.560 --> 00:03:47.240
lecture of this.

76
00:03:48.640 --> 00:03:52.460
So it will come back and the first

77
00:03:52.460 --> 00:03:56.660
record is luckily for us the question of

78
00:03:56.660 --> 00:03:59.300
what is the wi-fi password for the

79
00:03:59.300 --> 00:04:03.400
office and what is the guest and we

80
00:04:03.400 --> 00:04:07.180
will see that the score is quite low

81
00:04:07.180 --> 00:04:09.940
and i will explain that in a little

82
00:04:09.940 --> 00:04:10.360
while.

83
00:04:15.860 --> 00:04:18.360
Because before we saw that our score should

84
00:04:18.360 --> 00:04:22.180
be close to one when it was when

85
00:04:22.180 --> 00:04:25.340
it was high and then we get another

86
00:04:25.340 --> 00:04:28.220
question this question is something about logging into

87
00:04:28.220 --> 00:04:31.140
my office account and the reason why this

88
00:04:31.140 --> 00:04:34.020
gets a score that is relevant is because

89
00:04:34.020 --> 00:04:36.660
the answer contains the word password and we

90
00:04:36.660 --> 00:04:38.360
use password in this.

91
00:04:41.020 --> 00:04:43.800
This one have a score that is higher

92
00:04:44.500 --> 00:04:46.600
and i will again explain that when we

93
00:04:46.600 --> 00:04:48.040
see the final result.

94
00:04:50.460 --> 00:04:55.680
This one is a question about who is

95
00:04:55.680 --> 00:04:59.420
in charge of support and my guess is

96
00:04:59.420 --> 00:05:01.700
why this is relevant is because it's showing

97
00:05:01.700 --> 00:05:04.520
an email and email and internet are close

98
00:05:04.520 --> 00:05:06.720
to each other so for that reason it

99
00:05:06.720 --> 00:05:07.360
gets a score.

100
00:05:10.160 --> 00:05:12.540
And that is our tree so we get

101
00:05:12.540 --> 00:05:17.140
our report back here and we can actually

102
00:05:17.140 --> 00:05:18.260
just see it on screen.

103
00:05:19.060 --> 00:05:21.560
So what you can see is we get

104
00:05:21.560 --> 00:05:24.600
a higher and higher score and this is

105
00:05:24.600 --> 00:05:27.320
an important thing for you to know because

106
00:05:27.320 --> 00:05:31.020
when i showed you our C-sharp similarity

107
00:05:31.020 --> 00:05:33.720
scores we talked about one being the best

108
00:05:33.720 --> 00:05:36.780
and zero being the worst.

109
00:05:37.520 --> 00:05:40.640
But in terms of a SQLite they have

110
00:05:40.640 --> 00:05:42.340
chosen a different metric.

111
00:05:43.000 --> 00:05:45.800
They have chosen that the closer to zero

112
00:05:45.800 --> 00:05:46.560
the better.

113
00:05:47.900 --> 00:05:51.340
So if you begin to set some limits

114
00:05:51.340 --> 00:05:53.960
on i will only i would like to

115
00:05:53.960 --> 00:05:56.480
get 10 records back but only if they

116
00:05:56.480 --> 00:05:59.160
are above this score you would actually get

117
00:05:59.160 --> 00:06:03.120
a worse thing here because it would be

118
00:06:03.120 --> 00:06:04.300
the least score.

119
00:06:04.440 --> 00:06:06.840
So you need to know that your specific

120
00:06:06.840 --> 00:06:12.460
way of getting scores back is low to

121
00:06:12.460 --> 00:06:14.960
high in this case while others it's high

122
00:06:14.960 --> 00:06:18.720
to low while sometimes it's negative as well

123
00:06:18.720 --> 00:06:24.720
all depending on what similarity formula they're using.

124
00:06:25.520 --> 00:06:28.480
You can also just not care because this

125
00:06:28.480 --> 00:06:30.720
search will always give the most relevant back

126
00:06:30.720 --> 00:06:32.780
first no matter what.

127
00:06:33.040 --> 00:06:34.620
So that's the reason why we get this

128
00:06:34.620 --> 00:06:34.920
one.

129
00:06:36.560 --> 00:06:39.440
So we have our scores now and what

130
00:06:39.440 --> 00:06:41.640
we're gonna do is we're gonna put that

131
00:06:41.640 --> 00:06:45.340
into a message together with the real question

132
00:06:45.340 --> 00:06:47.400
and we are going to say here's the

133
00:06:47.400 --> 00:06:49.660
most relevant knowledge base information.

134
00:06:50.500 --> 00:06:53.340
So we are telling among all the 10

135
00:06:53.340 --> 00:06:56.760
which it's not getting it's only getting these

136
00:06:56.760 --> 00:06:59.600
three and among those figure out what the

137
00:06:59.600 --> 00:07:00.720
answer is.

138
00:07:01.620 --> 00:07:06.420
So here we have our two messages we're

139
00:07:06.420 --> 00:07:06.900
sending in.

140
00:07:07.200 --> 00:07:09.060
Before we have only sent one in here

141
00:07:09.060 --> 00:07:11.700
but now we can actually send multiple in

142
00:07:11.700 --> 00:07:15.880
if we wish to like this and then

143
00:07:15.880 --> 00:07:17.700
we get our response back.

144
00:07:18.620 --> 00:07:23.040
So if we see and it is good

145
00:07:23.040 --> 00:07:26.280
enough to know that among these three the

146
00:07:26.280 --> 00:07:28.300
right one is the first one.

147
00:07:28.660 --> 00:07:32.200
So even if some kind of similarity had

148
00:07:32.200 --> 00:07:34.740
hit this one to be the most relevant

149
00:07:34.740 --> 00:07:37.520
the AI would still know oh that has

150
00:07:37.520 --> 00:07:40.320
nothing to do with Wi-Fi passwords so

151
00:07:40.320 --> 00:07:41.580
it must be the second best.

152
00:07:42.340 --> 00:07:44.100
So that's the reason why we only we

153
00:07:44.100 --> 00:07:46.240
don't only send the one best back.

154
00:07:46.580 --> 00:07:49.360
Sometimes there can be a race condition of

155
00:07:49.360 --> 00:07:51.420
what is the most similarity.

156
00:07:52.420 --> 00:07:54.820
So we sent like three or ten or

157
00:07:54.820 --> 00:07:59.140
that's also a science to figure out should

158
00:07:59.140 --> 00:08:02.160
we send because the more we send the

159
00:08:02.160 --> 00:08:03.320
more we give us input.

160
00:08:03.500 --> 00:08:06.600
All this will cost input tokens as well.

161
00:08:07.100 --> 00:08:09.980
So had we sent 100 it would have

162
00:08:09.980 --> 00:08:12.220
cost more to answer this question.

163
00:08:12.480 --> 00:08:14.900
Do we send only one it will cost

164
00:08:14.900 --> 00:08:17.860
less but with the chance that it was

165
00:08:17.860 --> 00:08:21.240
not the real answer because then it suddenly

166
00:08:21.240 --> 00:08:24.420
goes from being kind of smart to absolutely

167
00:08:24.420 --> 00:08:26.700
dumb because it didn't get the information in

168
00:08:26.700 --> 00:08:27.400
the first place.

169
00:08:30.720 --> 00:08:33.200
So if we just try and run it

170
00:08:33.200 --> 00:08:37.120
again without the breakpoints just so we can

171
00:08:37.120 --> 00:08:37.960
see the experience.

172
00:08:42.460 --> 00:08:47.400
What is the Wi-Fi password?

173
00:08:50.800 --> 00:08:53.480
Again it will come as the top one.

174
00:08:53.860 --> 00:08:57.360
While there was another question about Christmas.

175
00:08:57.860 --> 00:09:03.280
So is Xmas a day off?

176
00:09:06.140 --> 00:09:09.960
It will now come back with other search

177
00:09:09.960 --> 00:09:10.460
results.

178
00:09:11.100 --> 00:09:13.160
So again it found it because it's the

179
00:09:13.160 --> 00:09:16.600
only one mentioning eve and so on.

180
00:09:17.600 --> 00:09:20.080
But if we do something else let's say

181
00:09:20.080 --> 00:09:20.420
hello.

182
00:09:22.420 --> 00:09:25.200
It will still search for something and that

183
00:09:25.200 --> 00:09:27.280
is actually what we will cover in the

184
00:09:27.280 --> 00:09:31.400
next lecture because we don't really want to

185
00:09:31.400 --> 00:09:33.800
have it search anything at this point because

186
00:09:33.800 --> 00:09:35.640
we're just saying hello to the AI.

187
00:09:36.960 --> 00:09:38.580
But that's for the next lecture.
