WEBVTT

1
00:00:00.000 --> 00:00:05.000
Hi, and welcome to this C# video on the Microsoft Agent Framework.

2
00:00:05.000 --> 00:00:10.000
In this video, I'm going to show you a neat trick about filtering data

3
00:00:10.000 --> 00:00:14.000
without passing the data to an LLM.

4
00:00:14.000 --> 00:00:17.000
And the reason why you might not want to pass the data

5
00:00:17.000 --> 00:00:22.000
is that perhaps it's a large amount of data, so it will cost a lot of time and tokens,

6
00:00:22.000 --> 00:00:29.000
or it could be sensitive data, so you don't want it to be up in the cloud.

7
00:00:29.000 --> 00:00:35.000
So there are ways around this, but first, let's see what the problem is

8
00:00:35.000 --> 00:00:40.000
in terms of speed and token count.

9
00:00:40.000 --> 00:00:48.000
So in here, I have a sample, and I have, if we look into the samples,

10
00:00:48.000 --> 00:00:53.000
it's the one under other topics, "AI as data filter".

11
00:00:53.000 --> 00:01:00.000
So what I have is a bunch of books, namely 100 of them,

12
00:01:00.000 --> 00:01:05.000
which are just some JSON, and we can see that each book

13
00:01:05.000 --> 00:01:10.000
has a title, a year of release, an author, a genre, and a synopsis.

14
00:01:10.000 --> 00:01:15.000
So I turn these books into data in code here,

15
00:01:15.000 --> 00:01:22.000
and I put a ToString on these so I get it out as some XML

16
00:01:22.000 --> 00:01:31.000
because AI models are better at XML than at JSON.

17
00:01:31.000 --> 00:01:34.000
Then I make a client, and in this case,

18
00:01:34.000 --> 00:01:39.000
I'm using ChatGPT 4.1 mini as the first model.

19
00:01:39.000 --> 00:01:42.000
And I'm just telling it that it's a librarian,

20
00:01:42.000 --> 00:01:47.000
and then I'm giving all the data in the prompt,

21
00:01:47.000 --> 00:01:54.000
the prompt being a very, very long string of text.

22
00:01:54.000 --> 00:01:58.000
And the reason I do this is what we're about to do

23
00:01:58.000 --> 00:02:01.000
is not really something RAG can do

24
00:02:01.000 --> 00:02:04.000
because we are going to ask some questions,

25
00:02:04.000 --> 00:02:10.000
like, for example, how many of these books were released in 1980?

26
00:02:10.000 --> 00:02:15.000
How many have an author that ends with the letter S

27
00:02:16.000 --> 00:02:19.000
in their name, and so on and so forth.

28
00:02:19.000 --> 00:02:22.000
So things a user could ask,

29
00:02:22.000 --> 00:02:29.000
but we can't really use anything in terms of RAG

30
00:02:29.000 --> 00:02:32.000
because that would be a similarity search

31
00:02:32.000 --> 00:02:36.000
that has nothing to do with these kinds of filters

32
00:02:36.000 --> 00:02:39.000
or what you can call it.

33
00:02:39.000 --> 00:02:42.000
So I'm going to ask three questions.

34
00:02:42.000 --> 00:02:46.000
So how many of them were released in 1980?

35
00:02:46.000 --> 00:02:49.000
Then give how many and list the books

36
00:02:49.000 --> 00:02:54.000
where the authors have an S as the last letter in their name.

37
00:02:54.000 --> 00:02:57.000
And finally, a bit more advanced,

38
00:02:57.000 --> 00:03:01.000
where the authors have S as the last letter in their name,

39
00:03:01.000 --> 00:03:03.000
the genre is not fantasy,

40
00:03:03.000 --> 00:03:07.000
and was released between 1800 and 1900.

41
00:03:07.000 --> 00:03:13.000
So progressively more advanced questions to learn.

42
00:03:13.000 --> 00:03:21.000
So let's first see how ChatGPT 4.1 mini handles question 1.

43
00:03:21.000 --> 00:03:26.000
So we ask the question by giving the book data and the question.

44
00:03:26.000 --> 00:03:30.000
And we can see it came back fairly quickly, 1.6 seconds,

45
00:03:30.000 --> 00:03:35.000
and it says that these two books are the right ones.

46
00:03:35.000 --> 00:03:40.000
It used 4,185 input tokens

47
00:03:40.000 --> 00:03:43.000
because we needed to give all the books,

48
00:03:43.000 --> 00:03:46.000
and output tokens were very low

49
00:03:46.000 --> 00:03:49.000
because it just needed to write this simple text out.

50
00:03:49.000 --> 00:03:52.000
And the real answer is actually 2.

51
00:03:52.000 --> 00:03:55.000
But if we try again.

52
00:03:59.000 --> 00:04:02.000
In this case, it's still correct.

53
00:04:02.000 --> 00:04:04.000
I've seen it give back 3

54
00:04:04.000 --> 00:04:06.000
because it thought there were three books.

55
00:04:06.000 --> 00:04:10.000
That's not the most important thing here.

56
00:04:10.000 --> 00:04:14.000
So let's go to some more advanced things

57
00:04:14.000 --> 00:04:19.000
where the end of the author's name should be S.

58
00:04:23.000 --> 00:04:26.000
This will begin to take a little longer.

59
00:04:26.000 --> 00:04:32.000
But it says it has 27 books with the author ending in S.

60
00:04:32.000 --> 00:04:34.000
And we can already see that's wrong

61
00:04:34.000 --> 00:04:37.000
because Lewis Carroll is not ending in S.

62
00:04:37.000 --> 00:04:39.000
Mark Twain is not ending in S.

63
00:04:39.000 --> 00:04:41.000
It does not even have an S in it.

64
00:04:41.000 --> 00:04:45.000
So it's definitely wrong with this,

65
00:04:45.000 --> 00:04:49.000
and the real answer is actually 15

66
00:04:49.000 --> 00:04:53.000
because I'm just using a little LINQ here to check.
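
NOTE
Editor's sketch, not the presenter's code: the check mentioned here is a LINQ query in the video's C# sample, which the transcript does not show.
A rough Python analogue, assuming each book is a record with title, author, genre, and year fields, might look like the following. The records are illustrative only.
```python
# Illustrative records only; the video's sample loads 100 books from JSON.
books = [
    {"title": "David Copperfield", "author": "Charles Dickens", "genre": "Novel", "year": 1850},
    {"title": "Alice's Adventures in Wonderland", "author": "Lewis Carroll", "genre": "Fantasy", "year": 1865},
    {"title": "Adventures of Huckleberry Finn", "author": "Mark Twain", "genre": "Adventure", "year": 1884},
    {"title": "Crime and Punishment", "author": "Fyodor Dostoevsky", "genre": "Novel", "year": 1866},
    {"title": "The Lion, the Witch and the Wardrobe", "author": "C. S. Lewis", "genre": "Fantasy", "year": 1950},
]
# Deterministic ground truth: keep the books whose author name ends in "s".
def author_ends_with_s(book: dict) -> bool:
    return book["author"].strip().lower().endswith("s")
ends_with_s = [book["title"] for book in books if author_ends_with_s(book)]
print(len(ends_with_s), ends_with_s)
```
Because this is ordinary code, it gives the same answer on every run, which is what makes it usable as the reference for the model's answers.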

67
00:04:53.000 --> 00:04:55.000
So it definitely got it wrong,

68
00:04:55.000 --> 00:04:59.000
and despite that, it still used a lot of our tokens.

69
00:04:59.000 --> 00:05:01.000
And it goes even further wrong

70
00:05:01.000 --> 00:05:04.000
if we give it a much more advanced question here,

71
00:05:04.000 --> 00:05:09.000
like the author's name should end in S,

72
00:05:09.000 --> 00:05:11.000
it should not be a genre of fantasy,

73
00:05:11.000 --> 00:05:14.000
and it's released between 1800 and 1900.

74
00:05:14.000 --> 00:05:17.000
In this case, it says it's 14 books.

75
00:05:17.000 --> 00:05:19.000
Again, it goes wrong.

76
00:05:19.000 --> 00:05:21.000
David Copperfield is...

77
00:05:21.000 --> 00:05:23.000
Let's re-find it.

78
00:05:27.000 --> 00:05:31.000
Dostoevsky is not ending in S, for example.

79
00:05:31.000 --> 00:05:35.000
And the real answer is actually 7.

80
00:05:35.000 --> 00:05:40.000
So sometimes a model like this gets it right.

81
00:05:40.000 --> 00:05:43.000
Sometimes it doesn't.

82
00:05:43.000 --> 00:05:48.000
So let's move this all the way back up here

83
00:05:48.000 --> 00:05:52.000
and try with another model.

84
00:05:53.000 --> 00:05:56.000
So now we are going in and hot reloading,

85
00:05:56.000 --> 00:06:00.000
so we use ChatGPT 5 mini instead

86
00:06:00.000 --> 00:06:02.000
in order to do the same.

87
00:06:06.000 --> 00:06:08.000
I'm going to get our three questions,

88
00:06:08.000 --> 00:06:11.000
and we get the first answer back.

89
00:06:11.000 --> 00:06:12.000
Now it takes longer

90
00:06:12.000 --> 00:06:15.000
because now we have a reasoning model,

91
00:06:15.000 --> 00:06:19.000
and it will actually reason over what this is.

92
00:06:19.000 --> 00:06:24.000
And it comes back again with the same two books, which is good,

93
00:06:24.000 --> 00:06:28.000
whereas the other one, as you saw here, could handle it,

94
00:06:28.000 --> 00:06:31.000
but in some cases it can't.

95
00:06:31.000 --> 00:06:39.000
A model like ChatGPT 5 mini is pretty good at this, so it's okay,

96
00:06:39.000 --> 00:06:41.000
and if we went to one of the even bigger models,

97
00:06:41.000 --> 00:06:42.000
it would be even better.

98
00:06:42.000 --> 00:06:44.000
But we are still using a lot of tokens,

99
00:06:44.000 --> 00:06:49.000
and now we are even using more in output due to reasoning.

100
00:06:50.000 --> 00:06:55.000
If we go to question two,

101
00:06:55.000 --> 00:06:58.000
it now becomes much, much harder for it

102
00:06:58.000 --> 00:07:03.000
because now it needs to really reason over if this is right.

103
00:07:03.000 --> 00:07:10.000
So I've seen these take up to half a minute to come back,

104
00:07:10.000 --> 00:07:15.000
despite it being a fairly simple LINQ query,

105
00:07:15.000 --> 00:07:19.000
if you knew what you were doing, of course.

106
00:07:19.000 --> 00:07:26.000
But it will think about it quite a while,

107
00:07:26.000 --> 00:07:31.000
and hopefully come back in one second.

108
00:07:32.000 --> 00:07:39.000
Yeah, it's a difficult question among 100 books,

109
00:07:39.000 --> 00:07:41.000
but now it comes back,

110
00:07:41.000 --> 00:07:44.000
and it comes back with the answer of 15,

111
00:07:44.000 --> 00:07:46.000
which is actually the right answer.

112
00:07:46.000 --> 00:07:49.000
So reasoning does help with things like this,

113
00:07:49.000 --> 00:07:52.000
and it's quite good at it.

114
00:07:52.000 --> 00:07:56.000
But we again use a lot of tokens, and now you can see

115
00:07:56.000 --> 00:08:00.000
we're really using a lot of reasoning tokens.

116
00:08:01.000 --> 00:08:05.000
But the real answer is 15. We got it right.

117
00:08:06.000 --> 00:08:12.000
But now let's go and do the advanced question

118
00:08:12.000 --> 00:08:15.000
with both the letter S, not fantasy,

119
00:08:15.000 --> 00:08:19.000
and release between 1800 and 1900.

120
00:08:19.000 --> 00:08:22.000
I will just pause the video here when it's back

121
00:08:22.000 --> 00:08:25.000
because this will take a while.

122
00:08:25.000 --> 00:08:27.000
And we are back.

123
00:08:27.000 --> 00:08:31.000
You see it thought for 32 seconds in order to do this.

124
00:08:31.000 --> 00:08:36.000
It got the answer right, which is 7, which we'll see in one second.

125
00:08:36.000 --> 00:08:41.000
But it used a lot of tokens again in order to do this.

126
00:08:41.000 --> 00:08:45.000
But at least it gets it right.

127
00:08:45.000 --> 00:08:49.000
But what I want to show you now is that there is a better way.

128
00:08:49.000 --> 00:08:54.000
And we can even go up here.

129
00:08:57.000 --> 00:08:59.000
Switch back to the simple model

130
00:08:59.000 --> 00:09:01.000
because even the simple model can do it

131
00:09:01.000 --> 00:09:05.000
in the other way I want to show you.

132
00:09:05.000 --> 00:09:09.000
So we're going to load up the model,

133
00:09:09.000 --> 00:09:12.000
and we're going to load up the three questions.

134
00:09:12.000 --> 00:09:18.000
But then I'm going to jump down to some new code down here.

135
00:09:18.000 --> 00:09:20.000
And what we're going to do here is

136
00:09:20.000 --> 00:09:25.000
instead of actually giving the model all the books,

137
00:09:25.000 --> 00:09:28.000
meaning the 100 books in XML format,

138
00:09:28.000 --> 00:09:32.000
we're instead going to ask it to make a filter for those books.

139
00:09:32.000 --> 00:09:37.000
And the book filter is just we want to have the fields,

140
00:09:37.000 --> 00:09:42.000
the comparison operation, and the value back.

141
00:09:42.000 --> 00:09:45.000
So we have among title, year of release,

142
00:09:45.000 --> 00:09:48.000
author, genre, and synopsis.

143
00:09:48.000 --> 00:09:51.000
We have equals, not equals, starts with, ends with,

144
00:09:51.000 --> 00:09:54.000
contains, greater than, greater than or equal,

145
00:09:54.000 --> 00:09:57.000
less than, less than or equal, and regex.

146
00:09:57.000 --> 00:10:01.000
And then a value, just represented as a string.

147
00:10:01.000 --> 00:10:04.000
But we could have value as integer, value as string.

148
00:10:04.000 --> 00:10:09.000
But it's fairly OK to do it here.
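
NOTE
Editor's sketch, not the presenter's code: the filter shape described above (field, comparison operation, value) could be modelled as follows.
The video uses C# types that the transcript does not show, so this is a Python analogue. All identifiers, and the JSON shape of the model's reply, are assumptions.
```python
import json
from dataclasses import dataclass
from enum import Enum
# Field and operator names follow the lists read out in the video; the exact spellings are assumptions.
class Field(str, Enum):
    TITLE = "Title"
    YEAR_OF_RELEASE = "YearOfRelease"
    AUTHOR = "Author"
    GENRE = "Genre"
    SYNOPSIS = "Synopsis"
class Op(str, Enum):
    EQUALS = "Equals"
    NOT_EQUALS = "NotEquals"
    STARTS_WITH = "StartsWith"
    ENDS_WITH = "EndsWith"
    CONTAINS = "Contains"
    GREATER_THAN = "GreaterThan"
    GREATER_THAN_OR_EQUAL = "GreaterThanOrEqual"
    LESS_THAN = "LessThan"
    LESS_THAN_OR_EQUAL = "LessThanOrEqual"
    REGEX = "Regex"
@dataclass(frozen=True)
class BookFilter:
    field: Field
    op: Op
    value: str  # kept as a string, as in the video; numeric fields are parsed when the filter is applied
def parse_filters(payload: str) -> list[BookFilter]:
    """Turn the model's JSON reply into filters, raising ValueError on an unknown field or operator."""
    return [BookFilter(Field(f["field"]), Op(f["op"]), str(f["value"])) for f in json.loads(payload)]
# Hypothetical reply for "how many books were released in 1980?"
reply = '[{"field": "YearOfRelease", "op": "Equals", "value": "1980"}]'
filters = parse_filters(reply)
print(filters)
```
Restricting fields and operators to closed sets means a reply outside the schema fails loudly instead of being applied silently.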

149
00:10:09.000 --> 00:10:14.000
So see what happens now when I just say, hey, LLM,

150
00:10:14.000 --> 00:10:20.000
make a filter for the following query and then the question.

151
00:10:20.000 --> 00:10:24.000
It is almost instant that it can make a filter.

152
00:10:24.000 --> 00:10:31.000
And we get back year of release equals 1980.

153
00:10:31.000 --> 00:10:33.000
Because it's an easy question now.

154
00:10:33.000 --> 00:10:36.000
It doesn't have all the token count.

155
00:10:36.000 --> 00:10:40.000
And this response, if we look at it,

156
00:10:40.000 --> 00:10:44.000
it only used 154 tokens.

157
00:10:44.000 --> 00:10:47.000
So very, very simple to do.

158
00:10:47.000 --> 00:10:53.000
And then, of course, it's our job to filter the books now,

159
00:10:53.000 --> 00:10:57.000
which I do in this code, and the result is two.

160
00:10:57.000 --> 00:10:59.000
And this is not fun code to write.

161
00:10:59.000 --> 00:11:04.000
But I think it's worth getting rid of 30 seconds of wait time

162
00:11:04.000 --> 00:11:11.000
and 7,000 tokens by sitting down and writing this code.

163
00:11:11.000 --> 00:11:15.000
It can be more advanced if you have more advanced samples, of course.

164
00:11:15.000 --> 00:11:18.000
But in this case, I'm just going through the filters

165
00:11:18.000 --> 00:11:23.000
and slowly applying the filter based on what it gave back.
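
NOTE
Editor's sketch, not the presenter's code: going through the filters and applying them one by one might look roughly like this in Python.
The record fields, operator names, and the (field, op, value) tuple shape are assumptions; the video's C# implementation is not shown in the transcript.
```python
import re
# Illustrative records only; the video's sample uses 100 books.
books = [
    {"title": "David Copperfield", "author": "Charles Dickens", "genre": "Novel", "year": 1850},
    {"title": "Alice's Adventures in Wonderland", "author": "Lewis Carroll", "genre": "Fantasy", "year": 1865},
    {"title": "The Name of the Rose", "author": "Umberto Eco", "genre": "Mystery", "year": 1980},
]
# Operators that make sense for integer fields such as the year of release.
NUMBER_OPS = {
    "equals": lambda a, v: a == v,
    "not_equals": lambda a, v: a != v,
    "greater_than": lambda a, v: a > v,
    "greater_than_or_equal": lambda a, v: a >= v,
    "less_than": lambda a, v: a < v,
    "less_than_or_equal": lambda a, v: a <= v,
}
# Operators for text fields; comparisons ignore case.
TEXT_OPS = {
    "equals": lambda a, v: a.casefold() == v.casefold(),
    "not_equals": lambda a, v: a.casefold() != v.casefold(),
    "starts_with": lambda a, v: a.casefold().startswith(v.casefold()),
    "ends_with": lambda a, v: a.casefold().endswith(v.casefold()),
    "contains": lambda a, v: v.casefold() in a.casefold(),
    "regex": lambda a, v: re.search(v, a, re.IGNORECASE) is not None,
}
def matches(book: dict, field: str, op: str, value: str) -> bool:
    actual = book[field]
    if isinstance(actual, int):
        return NUMBER_OPS[op](actual, int(value))  # the filter value arrives as a string
    return TEXT_OPS[op](actual, value)
def apply_filters(books: list, filters: list) -> list:
    # Narrow the list one filter at a time, so every filter must match (logical AND).
    result = books
    for field, op, value in filters:
        result = [book for book in result if matches(book, field, op, value)]
    return result
print([b["title"] for b in apply_filters(books, [("year", "equals", "1980")])])
```
An operator that does not fit the field type (for example starts_with on the year) raises KeyError here; a real implementation would decide how to report that.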

166
00:11:27.000 --> 00:11:30.000
I'll let it run to here.

167
00:11:30.000 --> 00:11:32.000
We can ask the second question.

168
00:11:32.000 --> 00:11:35.000
And again, it can easily make that filter

169
00:11:35.000 --> 00:11:39.000
because it's really, really good at understanding

170
00:11:39.000 --> 00:11:41.000
what we are talking about.

171
00:11:41.000 --> 00:11:44.000
Because it doesn't have all the context to work with.

172
00:11:44.000 --> 00:11:49.000
So author ends with S. Simple to do.

173
00:11:49.000 --> 00:11:50.000
And again, we get the books back.

174
00:11:50.000 --> 00:11:54.000
And we have the 15.

175
00:11:54.000 --> 00:11:58.000
And the final one, which is a much more advanced question,

176
00:11:58.000 --> 00:12:01.000
it actually goes fairly quick as well

177
00:12:01.000 --> 00:12:04.000
and turns it into a filter of four different things.

178
00:12:04.000 --> 00:12:07.000
Author ends with S.

179
00:12:07.000 --> 00:12:11.000
Genre not equal fantasy.

180
00:12:11.000 --> 00:12:14.000
Year of release greater than or equal to 1800.

181
00:12:14.000 --> 00:12:21.000
And year of release less than or equal to 1900.
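
NOTE
Editor's sketch, not the presenter's code: the four conditions just listed, written out as data, next to the hand-written predicate they correspond to.
The JSON key names, operator spellings, and sample records are assumptions made for illustration.
```python
import json
# Assumed JSON shape for the model's reply to the third question.
reply = """[
  {"field": "author", "op": "ends_with", "value": "s"},
  {"field": "genre", "op": "not_equals", "value": "Fantasy"},
  {"field": "year", "op": "greater_than_or_equal", "value": "1800"},
  {"field": "year", "op": "less_than_or_equal", "value": "1900"}
]"""
filters = json.loads(reply)
# Illustrative records only.
books = [
    {"title": "David Copperfield", "author": "Charles Dickens", "genre": "Novel", "year": 1850},
    {"title": "Alice's Adventures in Wonderland", "author": "Lewis Carroll", "genre": "Fantasy", "year": 1865},
    {"title": "Adventures of Huckleberry Finn", "author": "Mark Twain", "genre": "Adventure", "year": 1884},
    {"title": "The Lion, the Witch and the Wardrobe", "author": "C. S. Lewis", "genre": "Fantasy", "year": 1950},
]
# Equivalent direct predicate: all four conditions must hold at once (logical AND).
def keep(book: dict) -> bool:
    return (
        book["author"].lower().endswith("s")
        and book["genre"].lower() != "fantasy"
        and 1800 <= book["year"] <= 1900
    )
kept = [book["title"] for book in books if keep(book)]
print(len(filters), kept)
```
The two year bounds are separate filters because each filter carries a single operator and a single value.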

182
00:12:21.000 --> 00:12:23.000
And again, we get the books.

183
00:12:23.000 --> 00:12:28.000
And it has the seven books.

184
00:12:28.000 --> 00:12:34.000
So this is a much more advanced way.

185
00:12:34.000 --> 00:12:37.000
It saves a lot of time.

186
00:12:37.000 --> 00:12:40.000
And it gives the right books back.

187
00:12:40.000 --> 00:12:43.000
It can tend to end up being a slightly more wooden answer

188
00:12:43.000 --> 00:12:46.000
because now you just have the books

189
00:12:46.000 --> 00:12:49.000
and you need to show the answer.

190
00:12:49.000 --> 00:12:51.000
So it doesn't show the normal

191
00:12:51.000 --> 00:12:56.000
"I looked at it, and it's this and this number of books" and so on.

192
00:12:56.000 --> 00:13:00.000
But often you want to have it in a structured format anyway.

193
00:13:00.000 --> 00:13:05.000
So giving it the books like we did up here

194
00:13:05.000 --> 00:13:10.000
is just a waste of time, in my opinion, in certain scenarios.

195
00:13:10.000 --> 00:13:14.000
And letting it build filters like these down here,

196
00:13:14.000 --> 00:13:16.000
which is fairly simple to do.

197
00:13:16.000 --> 00:13:21.000
Then we, of course, have the annoying part of taking the filter

198
00:13:21.000 --> 00:13:24.000
and turning it and applying it.

199
00:13:24.000 --> 00:13:26.000
But it's fairly simple code.

200
00:13:26.000 --> 00:13:29.000
It took me like five, ten minutes to write this code.

201
00:13:29.000 --> 00:13:33.000
So I definitely think it's worth doing filters

202
00:13:33.000 --> 00:13:36.000
instead of the system.

203
00:13:36.000 --> 00:13:40.000
And again, if the data had been very sensitive,

204
00:13:40.000 --> 00:13:43.000
we could still have used an LLM to filter it,

205
00:13:43.000 --> 00:13:46.000
which is really, really cool.

206
00:13:46.000 --> 00:13:50.000
So that is how we can do it.

207
00:13:50.000 --> 00:13:52.000
And thank you for seeing this video.

208
00:13:52.000 --> 00:13:54.000
See you in the next one.