WEBVTT

1
00:00:00.650 --> 00:00:02.790
Hi, and welcome to this AI in C#

2
00:00:02.790 --> 00:00:05.050
video on the Microsoft Agent Framework.

3
00:00:05.530 --> 00:00:07.330
In this video, we're going to look at

4
00:00:07.330 --> 00:00:10.190
a very cool concept for adding user memory

5
00:00:10.190 --> 00:00:11.130
to your agents.

6
00:00:11.770 --> 00:00:15.930
It's called AIContextProvider, and we're using a custom

7
00:00:15.930 --> 00:00:17.510
version of it in this video.

8
00:00:19.470 --> 00:00:22.970
And to let you see what it is,

9
00:00:23.390 --> 00:00:25.870
I will actually start by running it a

10
00:00:25.870 --> 00:00:28.110
few times and then show you the code.

11
00:00:28.110 --> 00:00:31.689
Because here I have a normal agent, but

12
00:00:31.689 --> 00:00:33.150
I have added memory to it.

13
00:00:33.570 --> 00:00:35.790
But if I ask it, for example, what

14
00:00:35.790 --> 00:00:44.690
is my name, it doesn't know.

15
00:00:45.290 --> 00:00:48.030
That is quite normal for an agent.

16
00:00:48.290 --> 00:00:49.790
Of course, I need to tell it in

17
00:00:49.790 --> 00:00:52.350
a conversation thread, and then it will begin

18
00:00:52.350 --> 00:00:52.930
to understand.

19
00:00:54.210 --> 00:00:57.230
So I could, for example, say, my name

20
00:00:57.230 --> 00:00:58.230
is Rasmus.

21
00:00:58.230 --> 00:01:03.959
And now it knows, so if I say,

22
00:01:04.300 --> 00:01:09.400
what is my name, it now knows.

23
00:01:10.220 --> 00:01:13.060
But we all know that when we run

24
00:01:13.060 --> 00:01:15.580
these and then just close it and start

25
00:01:15.580 --> 00:01:21.080
it again, it normally ends up not knowing

26
00:01:21.080 --> 00:01:25.500
again, because we don't have any persistence in

27
00:01:25.500 --> 00:01:25.840
any way.

28
00:01:25.840 --> 00:01:29.640
We could persist the entire conversation, but let's

29
00:01:29.640 --> 00:01:32.520
say I just wanted to talk with the

30
00:01:32.520 --> 00:01:34.200
agent again for something new.

31
00:01:34.620 --> 00:01:37.200
But now if I say, what is my

32
00:01:37.200 --> 00:01:41.460
name, it actually knows.

33
00:01:42.980 --> 00:01:45.160
And I can also give it other information.

34
00:01:45.500 --> 00:01:55.090
For example, my favourite colour is blue,

35
00:01:55.530 --> 00:02:00.700
and I live in Denmark.

36
00:02:05.460 --> 00:02:10.180
And again, if I stop the code and

37
00:02:10.180 --> 00:02:15.150
I start it again, I can now ask,

38
00:02:15.790 --> 00:02:21.030
what do you know about me?

39
00:02:24.730 --> 00:02:27.030
It now knows three facts, that my name

40
00:02:27.030 --> 00:02:29.710
is Rasmus, my favourite colour is blue, and

41
00:02:29.710 --> 00:02:31.150
I live in Denmark.

42
00:02:31.150 --> 00:02:37.250
And I could say, forget where I live,

43
00:02:38.070 --> 00:02:44.960
and stop and start again.

44
00:02:55.940 --> 00:03:00.910
And now we can see it only knows

45
00:03:00.910 --> 00:03:01.510
two things.

46
00:03:02.610 --> 00:03:03.810
So how is this achieved?

47
00:03:04.610 --> 00:03:05.910
Let's go into the code.

48
00:03:09.800 --> 00:03:12.080
The sample is up here in an AI

49
00:03:12.080 --> 00:03:17.120
context provider factory, where we use the custom

50
00:03:17.120 --> 00:03:17.520
version.

51
00:03:18.380 --> 00:03:20.080
There are other versions, but we will save them

52
00:03:20.080 --> 00:03:21.220
for other videos.

53
00:03:22.120 --> 00:03:25.060
So what I have done here is, I

54
00:03:25.060 --> 00:03:26.540
have made user memory.

55
00:03:27.380 --> 00:03:29.760
And in order to do user memory in

56
00:03:29.760 --> 00:03:32.000
my book, we need a user ID.

57
00:03:32.000 --> 00:03:35.080
So this is me simulating that I have

58
00:03:35.080 --> 00:03:38.320
logged in with this user, rwj1234.

59
00:03:40.480 --> 00:03:44.120
Then I create a normal OpenAI client.

60
00:03:44.980 --> 00:03:47.000
And then I make something called a memory

61
00:03:47.000 --> 00:03:49.100
extractor agent.

62
00:03:49.480 --> 00:03:51.680
And I use the cheapest model possible, because

63
00:03:51.680 --> 00:03:53.420
it will run fairly often.

64
00:03:54.640 --> 00:03:57.380
And I've given it an instruction that says,

65
00:03:57.380 --> 00:04:00.280
look at the user's message and extract any

66
00:04:00.280 --> 00:04:04.240
memory that we do not already know, or

67
00:04:04.240 --> 00:04:07.060
none if there aren't any memories to store.

68
00:04:08.260 --> 00:04:10.860
So this agent is by itself, and we're

69
00:04:10.860 --> 00:04:12.720
not really calling it here.

70
00:04:12.940 --> 00:04:16.200
Instead, we are adding our agent with custom

71
00:04:16.200 --> 00:04:16.740
memory.

72
00:04:18.300 --> 00:04:21.019
And we are just making the agent like

73
00:04:21.019 --> 00:04:25.480
normal, and giving instructions, but that's just "You are a

74
00:04:25.480 --> 00:04:26.100
nice AI".

75
00:04:26.100 --> 00:04:30.140
So we're not telling it this user memory upfront

76
00:04:30.140 --> 00:04:30.720
or anything.

77
00:04:31.640 --> 00:04:34.160
But what we are providing is this AI

78
00:04:34.160 --> 00:04:37.560
context factory, which sits on the chat client

79
00:04:37.560 --> 00:04:38.620
agent options.

80
00:04:39.960 --> 00:04:42.020
And if we give one of these, we

81
00:04:42.020 --> 00:04:46.440
can make one of them ourselves, where we

82
00:04:46.440 --> 00:04:49.920
then give the memory extractor agent and the

83
00:04:49.920 --> 00:04:51.080
user for it.

84
00:04:52.320 --> 00:04:53.820
And before we go and look at that,

85
00:04:53.820 --> 00:04:56.080
we can see that otherwise we do a

86
00:04:56.080 --> 00:04:58.960
normal thread, and we just ask the user

87
00:04:58.960 --> 00:04:59.720
again and again.

88
00:04:59.880 --> 00:05:01.740
So we're not putting in, we're not saving

89
00:05:01.740 --> 00:05:05.780
anything in the normal code.

90
00:05:07.000 --> 00:05:11.040
But this one is special in that it

91
00:05:11.040 --> 00:05:16.260
receives an agent, and it receives the user

92
00:05:16.260 --> 00:05:16.540
ID.

93
00:05:16.540 --> 00:05:20.880
And what I do here is I say

94
00:05:20.880 --> 00:05:26.960
that this user's memory is in this specific

95
00:05:26.960 --> 00:05:28.360
file in my temp folder.

96
00:05:29.200 --> 00:05:30.580
So if I look in my temp folder,

97
00:05:30.760 --> 00:05:34.180
there's actually a file here with my memory,

98
00:05:34.640 --> 00:05:36.160
which is the two things we had.

99
00:05:37.560 --> 00:05:39.740
And this is just, of course, simulating some

100
00:05:39.740 --> 00:05:40.700
kind of data store.

101
00:05:40.700 --> 00:05:44.780
It could have been a database, a blob

102
00:05:44.780 --> 00:05:47.580
storage, or whatever we wanted to save this

103
00:05:47.580 --> 00:05:47.780
in.

104
00:05:48.440 --> 00:05:50.640
But in our case, if we find a

105
00:05:50.640 --> 00:05:54.380
file, we add that as user facts to

106
00:05:54.380 --> 00:05:56.900
this custom context provider.
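
NOTE
A minimal Python sketch of the file-backed store described here, not the code from the video (which is C#).
The function names and the "<user id>_memory.json" file name are assumptions made for illustration.
```python
import json
import tempfile
from pathlib import Path
def memory_path(user_id: str) -> Path:
    # One file per user in the OS temp folder; the file name is an assumption.
    return Path(tempfile.gettempdir()) / f"{user_id}_memory.json"
def load_user_facts(user_id: str) -> list[str]:
    path = memory_path(user_id)
    if not path.exists():
        return []  # First run for this user: nothing remembered yet.
    return json.loads(path.read_text(encoding="utf-8"))
def save_user_facts(user_id: str, facts: list[str]) -> None:
    # Overwrite the whole file with the current list of facts.
    memory_path(user_id).write_text(json.dumps(facts, indent=2), encoding="utf-8")
```
Loading happens once when the provider is created; saving happens after each update.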

107
00:05:59.660 --> 00:06:02.560
And then we override two things here.

108
00:06:03.060 --> 00:06:07.780
We override InvokingAsync and

109
00:06:07.780 --> 00:06:08.360
InvokedAsync.

110
00:06:08.360 --> 00:06:12.380
So InvokingAsync is actually a call that

111
00:06:12.380 --> 00:06:15.840
it will do just before, when we run

112
00:06:15.840 --> 00:06:18.860
this, before it actually goes to the LLM,

113
00:06:19.280 --> 00:06:21.180
we can give further instructions.

114
00:06:21.780 --> 00:06:25.140
So if we set a breakpoint here and

115
00:06:25.140 --> 00:06:31.460
run again, and also set a breakpoint here

116
00:06:31.460 --> 00:06:33.180
in order to see it.

117
00:06:34.260 --> 00:06:38.100
So here, because when we new up the

118
00:06:38.100 --> 00:06:39.740
agent, it will go to this.

119
00:06:40.260 --> 00:06:41.880
We'll see our user ID is here.

120
00:06:42.600 --> 00:06:45.120
And it will go in and fetch the

121
00:06:45.120 --> 00:06:46.680
memory we have already.

122
00:06:47.480 --> 00:06:50.080
So our user facts are now up in

123
00:06:50.080 --> 00:06:51.560
memory for this agent.

124
00:06:55.220 --> 00:06:57.520
And then it's done creating the agent.

125
00:06:58.540 --> 00:07:03.900
And then when we ask a question, what

126
00:07:03.900 --> 00:07:06.560
is my name?

127
00:07:09.450 --> 00:07:15.130
This InvokingAsync call will happen as part

128
00:07:15.130 --> 00:07:17.090
of the RunAsync here.

129
00:07:17.270 --> 00:07:19.230
So this part has not happened yet.

130
00:07:20.070 --> 00:07:22.090
The AI has not been asked what is

131
00:07:22.090 --> 00:07:25.090
my name, but it's been given on the

132
00:07:25.090 --> 00:07:27.270
fly here some more instructions.

133
00:07:27.630 --> 00:07:28.990
And in my case, it's just the user

134
00:07:28.990 --> 00:07:29.350
facts that

135
00:07:29.350 --> 00:07:33.350
I string concatenate together, so it has these

136
00:07:33.350 --> 00:07:34.510
two strings.

137
00:07:35.390 --> 00:07:37.850
You could give it in XML, in JSON,

138
00:07:38.070 --> 00:07:39.050
whatever you want.

139
00:07:39.210 --> 00:07:41.690
I'm just giving it simple facts here.
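
NOTE
A minimal Python sketch of the "before the LLM call" step described here, not the code from the video (which is C#).
The function name and the heading text are assumptions; only the idea of concatenating the stored facts into extra instructions comes from the video.
```python
def build_memory_instructions(user_facts: list[str]) -> str:
    # Runs before every model request; the result is appended to the agent's instructions.
    if not user_facts:
        return ""  # Nothing known yet, so add nothing.
    return "Here are facts about the user:\n" + "\n".join(f"- {fact}" for fact in user_facts)
```
The same list could just as well be rendered as XML or JSON; the model only needs to see the facts.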

140
00:07:44.060 --> 00:07:47.440
So whenever this happens, and let me also

141
00:07:47.440 --> 00:07:51.940
set a breakpoint down here, now the agent

142
00:07:51.940 --> 00:07:54.320
has actually responded already.

143
00:07:54.660 --> 00:07:58.220
We can't see it yet, but just after

144
00:07:58.220 --> 00:08:01.540
it's finished doing its thing, and it hands

145
00:08:01.540 --> 00:08:04.500
control back to us here, we get the

146
00:08:04.500 --> 00:08:09.620
invoked, meaning now the request and the response

147
00:08:09.620 --> 00:08:10.360
have happened.

148
00:08:11.160 --> 00:08:12.560
And we can see that in the context

149
00:08:12.560 --> 00:08:15.160
here, where we can see that the request

150
00:08:15.160 --> 00:08:18.320
message was what is my name.

151
00:08:24.030 --> 00:08:28.070
And we can see in response messages.

152
00:08:29.450 --> 00:08:31.970
I'm pretty sure we can see those as

153
00:08:31.970 --> 00:08:32.130
well.

154
00:08:32.210 --> 00:08:33.570
Yeah, response messages are here.

155
00:08:33.570 --> 00:08:36.070
And it said, your name is Rasmus.

156
00:08:37.230 --> 00:08:39.490
So what we're doing is we're just extracting

157
00:08:39.490 --> 00:08:41.870
what it was the user actually said.

158
00:08:42.350 --> 00:08:43.910
In this case, what is my name?

159
00:08:45.070 --> 00:08:48.850
And we go and run the agent that

160
00:08:48.850 --> 00:08:51.730
is our memory extractor.

161
00:08:51.890 --> 00:08:57.390
And we tell it that we already know the following about the

162
00:08:57.390 --> 00:08:57.730
user.

163
00:08:58.030 --> 00:09:02.970
We know what their name is, and we

164
00:09:02.970 --> 00:09:04.570
know that their favourite colour is blue.

165
00:09:05.670 --> 00:09:08.090
But based on the message we have right

166
00:09:08.090 --> 00:09:11.650
now, what is my name, extract further memory

167
00:09:11.650 --> 00:09:14.810
either to add or to remove.
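
NOTE
A minimal Python sketch of the extractor prompt described here, not the code from the video (which is C#).
MemoryUpdate, build_extractor_prompt, parse_memory_update and the JSON reply shape are assumptions made for illustration.
```python
import json
from dataclasses import dataclass, field
@dataclass
class MemoryUpdate:
    """Hypothetical structured reply from the extractor agent."""
    to_add: list[str] = field(default_factory=list)
    to_remove: list[str] = field(default_factory=list)
def build_extractor_prompt(known_facts: list[str], user_message: str) -> str:
    # Tell the extractor what is already stored so it only reports changes.
    known = "\n".join(f"- {fact}" for fact in known_facts) or "- (nothing yet)"
    return (
        "We already know the following about the user:\n"
        f"{known}\n"
        f"Based on this message: {user_message!r}\n"
        'Reply with JSON {"to_add": [...], "to_remove": [...]}; use empty lists if nothing changes.'
    )
def parse_memory_update(reply_json: str) -> MemoryUpdate:
    # Missing keys are treated as "nothing to add" or "nothing to remove".
    data = json.loads(reply_json)
    return MemoryUpdate(list(data.get("to_add", [])), list(data.get("to_remove", [])))
```
For a plain question such as "what is my name", the expected reply would carry two empty lists.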

168
00:09:15.990 --> 00:09:18.990
In this case, it will come back with

169
00:09:18.990 --> 00:09:22.370
a response where there's nothing to add, nothing

170
00:09:22.370 --> 00:09:25.210
to remove, because our message is just what

171
00:09:25.210 --> 00:09:25.890
is my name.

172
00:09:27.390 --> 00:09:29.910
So it will do nothing special here.

173
00:09:29.910 --> 00:09:33.970
It will just update live the user's memory

174
00:09:33.970 --> 00:09:38.110
with what we already had, and we get

175
00:09:38.110 --> 00:09:39.370
our response back.

176
00:09:41.890 --> 00:09:44.670
But if I go into the agent now

177
00:09:44.670 --> 00:09:49.250
and tell it something more, I will give

178
00:09:49.250 --> 00:09:50.130
it some information

179
00:09:50.370 --> 00:09:55.990
again: I live in Denmark.

180
00:10:00.450 --> 00:10:02.830
Now InvokingAsync will still run, because

181
00:10:02.830 --> 00:10:05.170
it doesn't know if it needs to give

182
00:10:05.170 --> 00:10:06.770
the facts or not, so it will always

183
00:10:06.770 --> 00:10:11.660
be given up front, which is not important

184
00:10:11.660 --> 00:10:11.960
here.

185
00:10:12.440 --> 00:10:14.900
But the important thing here is that the

186
00:10:14.900 --> 00:10:18.920
message was, I live in Denmark, and the

187
00:10:18.920 --> 00:10:23.120
memory extractor will now see that, okay, based

188
00:10:23.120 --> 00:10:28.620
on that, I probably should make a memory

189
00:10:28.620 --> 00:10:31.500
update here that says, I need to add

190
00:10:31.500 --> 00:10:35.740
one memory, I live in Denmark, and nothing

191
00:10:35.740 --> 00:10:38.900
to remove, because I didn't ask to remove.
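
NOTE
A minimal Python sketch of applying such an add/remove update to the stored facts, not the code from the video (which is C#).
The function name is hypothetical, and it assumes the extractor returns a fact to remove in the same wording as it was stored (compared case-insensitively).
```python
def apply_memory_update(facts: list[str], to_add: list[str], to_remove: list[str]) -> list[str]:
    # Drop removed facts first, then append new ones that are not already stored.
    removed = {fact.casefold() for fact in to_remove}
    updated = [fact for fact in facts if fact.casefold() not in removed]
    seen = {fact.casefold() for fact in updated}
    for fact in to_add:
        if fact.casefold() not in seen:
            updated.append(fact)
            seen.add(fact.casefold())
    return updated
```
The updated list is what would then be written back to the file.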

192
00:10:42.060 --> 00:10:43.860
So it now adds that memory.

193
00:10:45.460 --> 00:10:48.400
So we now have three memories in the

194
00:10:48.400 --> 00:10:50.600
file, and we can see that by just

195
00:10:50.600 --> 00:10:53.140
opening it and seeing we have three entries.

196
00:10:56.720 --> 00:10:59.540
And it will just confirm that it has

197
00:10:59.540 --> 00:11:00.280
saved that.

198
00:11:01.840 --> 00:11:06.620
So when I stop this and

199
00:11:06.620 --> 00:11:09.580
start it again, it now will get the

200
00:11:09.580 --> 00:11:10.480
three pieces of information.

201
00:11:12.000 --> 00:11:14.620
In this case, it's going in and grabbing

202
00:11:14.620 --> 00:11:16.480
them now, and it will have the three

203
00:11:16.480 --> 00:11:17.000
pieces of information.

204
00:11:19.140 --> 00:11:23.820
And when I say, what do you know

205
00:11:23.820 --> 00:11:25.700
about me?

206
00:11:28.100 --> 00:11:30.300
Again, it will go in on the fly,

207
00:11:30.440 --> 00:11:32.960
give the three user facts up front.

208
00:11:34.900 --> 00:11:37.400
And again, this could be done by making

209
00:11:37.400 --> 00:11:39.060
the agent and grabbing this.

210
00:11:39.140 --> 00:11:41.260
But this is a nice way of making

211
00:11:41.260 --> 00:11:43.840
something that works for every agent about the

212
00:11:43.840 --> 00:11:44.900
user facts, for example.

213
00:11:46.280 --> 00:11:48.920
And again, in this case, it will not

214
00:11:48.920 --> 00:11:53.960
grab anything new out of memory, because it

215
00:11:53.960 --> 00:11:57.960
was just a question and not something to

216
00:11:57.960 --> 00:11:59.080
remember or forget.

217
00:12:01.620 --> 00:12:04.840
So now it knows this, and I can

218
00:12:04.840 --> 00:12:10.360
say, forget my favourite colour.

219
00:12:18.300 --> 00:12:22.560
It will go in, extract that I want

220
00:12:22.560 --> 00:12:26.040
to remove the fact that my favourite colour is blue.

221
00:12:29.120 --> 00:12:31.220
It removes that from the list, so now we only

222
00:12:31.220 --> 00:12:34.760
have the two things again, and come back

223
00:12:34.760 --> 00:12:35.260
to us.

224
00:12:37.060 --> 00:12:40.360
So again, all of this could have been

225
00:12:40.360 --> 00:12:44.080
done by giving instructions here again and again

226
00:12:44.080 --> 00:12:47.680
based on the facts, and after the response,

227
00:12:48.120 --> 00:12:51.140
the normal response down here, we could have

228
00:12:51.140 --> 00:12:52.640
extracted and so on.

229
00:12:53.040 --> 00:12:54.940
But this is just a nice way of

230
00:12:54.940 --> 00:12:57.920
putting it sort of in process.

231
00:12:58.420 --> 00:13:01.320
So now this custom thing can be used

232
00:13:01.320 --> 00:13:05.420
across multiple agents and just be reused, because

233
00:13:05.420 --> 00:13:08.320
now I've built a system that can store

234
00:13:08.320 --> 00:13:11.620
memories, and I can plug that into any

235
00:13:11.620 --> 00:13:14.480
agent across my system if I need to.

236
00:13:15.120 --> 00:13:17.140
Of course, I probably wouldn't save it in

237
00:13:17.140 --> 00:13:20.220
a temp folder, but rather in a database,

238
00:13:20.360 --> 00:13:20.840
and so on.
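
NOTE
A minimal Python sketch of why this is reusable: the storage and the extractor are passed in, so the same provider can sit on any agent.
This is not the code from the video (which is C#); MemoryStore, InMemoryStore, UserMemoryProvider, before_call and after_call are all hypothetical names.
```python
from typing import Callable, Protocol
class MemoryStore(Protocol):
    def load(self, user_id: str) -> list[str]: ...
    def save(self, user_id: str, facts: list[str]) -> None: ...
class InMemoryStore:
    """Stand-in for a database or blob storage."""
    def __init__(self) -> None:
        self._data: dict[str, list[str]] = {}
    def load(self, user_id: str) -> list[str]:
        return list(self._data.get(user_id, []))
    def save(self, user_id: str, facts: list[str]) -> None:
        self._data[user_id] = list(facts)
# The extractor takes (known facts, user message) and returns (facts to add, facts to remove).
Extractor = Callable[[list[str], str], tuple[list[str], list[str]]]
class UserMemoryProvider:
    def __init__(self, store: MemoryStore, extractor: Extractor, user_id: str) -> None:
        self.store, self.extractor, self.user_id = store, extractor, user_id
        self.facts = store.load(user_id)  # Fetch existing memory when the provider is created.
    def before_call(self) -> str:
        # Extra instructions handed to the model before each request.
        return "\n".join(self.facts)
    def after_call(self, user_message: str) -> None:
        # Update and persist the facts after each response.
        to_add, to_remove = self.extractor(self.facts, user_message)
        kept = [fact for fact in self.facts if fact not in to_remove]
        self.facts = kept + [fact for fact in to_add if fact not in kept]
        self.store.save(self.user_id, self.facts)
```
Swapping InMemoryStore for a database-backed class would not change the provider or the agents that use it.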

239
00:13:21.800 --> 00:13:24.720
So this is what an AI context provider can do.

240
00:13:24.840 --> 00:13:29.500
There are other AI context providers that Microsoft has built

241
00:13:29.500 --> 00:13:30.740
itself.

242
00:13:30.740 --> 00:13:33.900
They have a chat history memory that uses

243
00:13:33.900 --> 00:13:37.640
vector stores for this data, and they have

244
00:13:37.640 --> 00:13:41.060
a Mem0 provider that does something similar.

245
00:13:41.860 --> 00:13:45.300
But I like this fact that you can

246
00:13:45.300 --> 00:13:47.900
control it yourself, how it can be done,

247
00:13:48.780 --> 00:13:52.360
that you can forget stuff again, or whatever

248
00:13:52.360 --> 00:13:54.380
you want to use this for, because it's

249
00:13:54.380 --> 00:13:59.580
essentially a way to do something just before and after

250
00:13:59.580 --> 00:14:00.700
any LLM call.

251
00:14:02.940 --> 00:14:04.400
So that is actually everything.

252
00:14:04.900 --> 00:14:06.000
See you in the next one.
