WEBVTT

1
00:00:00.000 --> 00:00:05.000
Hi, and welcome to this AI and C# video on the Microsoft Agent Framework.

2
00:00:05.199 --> 00:00:10.199
Today we're going to review the harness features

3
00:00:10.399 --> 00:00:13.800
that have just been released in Agent Framework.

4
00:00:14.000 --> 00:00:21.000
So let's get into a little about what it is and then see some code.

5
00:00:21.200 --> 00:00:24.799
So what is an agent harness?

6
00:00:25.799 --> 00:00:30.799
A harness is just a fancy word for taking an LLM

7
00:00:31.000 --> 00:00:34.400
and putting instructions and tools on top of it.

8
00:00:34.599 --> 00:00:36.799
And especially an agent harness,

9
00:00:37.000 --> 00:00:41.200
you might know them from things like GitHub Copilot CLI.

10
00:00:41.400 --> 00:00:45.799
This is actually an agent harness where you have some instructions,

11
00:00:46.000 --> 00:00:50.000
meaning the system information on how you need to write code,

12
00:00:50.200 --> 00:00:52.799
and you have some tools for reading files,

13
00:00:52.799 --> 00:00:55.200
writing files, finding files, and so on.

14
00:00:55.400 --> 00:00:58.400
And similarly, you could choose Codex,

15
00:00:58.599 --> 00:01:04.000
or you could use the GitHub Copilot inside Visual Studio Code.

16
00:01:04.199 --> 00:01:08.199
Claude Code is an agent harness.

17
00:01:08.400 --> 00:01:16.000
And this is what the Agent Framework team have put into their system.

18
00:01:16.199 --> 00:01:22.199
So let's talk a little about the code and the three samples they have.

19
00:01:22.400 --> 00:01:24.599
Because I'm going to use their samples,

20
00:01:24.800 --> 00:01:27.000
because this is a lot of code,

21
00:01:27.199 --> 00:01:32.800
and I'm also a bit unsure why we need this sample in the first place.

22
00:01:33.000 --> 00:01:36.800
But let's talk a little about the code first.

23
00:01:37.000 --> 00:01:40.000
So inside, we can go in here,

24
00:01:40.199 --> 00:01:45.800
inside the Microsoft Agent Framework's own code,

25
00:01:46.000 --> 00:01:48.199
they also have a bunch of samples.

26
00:01:48.800 --> 00:01:52.800
So this is the thing that I'm looking at right now.

27
00:01:53.000 --> 00:01:56.199
So they have three samples, they have a research sample,

28
00:01:56.400 --> 00:02:02.000
they have some research with sub-agents that go in and do some stock analysis,

29
00:02:02.199 --> 00:02:06.199
and some data processing of some sales data.

30
00:02:06.400 --> 00:02:09.399
So an open agent tool, and then two other tools.

31
00:02:09.600 --> 00:02:13.600
The second tool uses sub-agents.

32
00:02:13.800 --> 00:02:19.199
And besides those three, there's a bunch of extra code up here.

33
00:02:19.399 --> 00:02:24.800
And I mean a bunch, because inside these three is what is needed

34
00:02:25.000 --> 00:02:29.000
in order to build something like GitHub Copilot.

35
00:02:29.199 --> 00:02:33.199
Not as fancy, but of course it's just a sample.

36
00:02:33.399 --> 00:02:37.399
But there's over 4,000 lines of code up here.

37
00:02:38.399 --> 00:02:43.399
And I don't want to take and recreate that,

38
00:02:43.600 --> 00:02:47.600
so that's the reason why I'm going with their sample instead.

39
00:02:48.800 --> 00:02:52.800
Besides the sample data, which is the bulk of all this,

40
00:02:53.000 --> 00:02:58.000
there are some new things down in the Microsoft.Agents.AI package,

41
00:02:58.199 --> 00:03:02.800
which is under a Harness namespace,

42
00:03:03.000 --> 00:03:07.000
where there's a bunch of new AI context providers.

43
00:03:07.199 --> 00:03:09.199
And a little middleware.

44
00:03:09.399 --> 00:03:13.800
So there's things for setting an agent mode, plan mode, or execution mode,

45
00:03:14.000 --> 00:03:17.000
like you might know in coding agents.

46
00:03:17.199 --> 00:03:20.199
There are some tools for file access,

47
00:03:20.399 --> 00:03:24.399
there's tools for memory and for storage of files.

48
00:03:24.600 --> 00:03:27.600
There's spinning up sub-agents.

49
00:03:27.800 --> 00:03:30.800
There's a to-do item, just like you see in GitHub Copilot,

50
00:03:31.000 --> 00:03:35.000
where it makes a, now I need to do this, now I need to do this.

51
00:03:35.399 --> 00:03:39.399
And then there's a tool for, hey, approve this tool,

52
00:03:39.600 --> 00:03:43.600
and don't ask me again, system.

53
00:03:45.000 --> 00:03:47.000
And then there's a new NuGet package itself,

54
00:03:47.199 --> 00:03:50.000
which is Microsoft.Agents.AI.Harness,

55
00:03:50.199 --> 00:03:52.800
where we have something called a Harness agent,

56
00:03:53.000 --> 00:03:55.399
and some options for that.

57
00:03:55.600 --> 00:04:00.000
And I'm finding it a bit strange that Harness is both in the raw package,

58
00:04:00.199 --> 00:04:02.399
and has its own package.

59
00:04:02.600 --> 00:04:07.000
So I don't really understand why this code is inside here,

60
00:04:07.199 --> 00:04:08.600
and not in the Harness package,

61
00:04:08.800 --> 00:04:12.199
because that's where it belongs, in my opinion.

62
00:04:12.399 --> 00:04:18.399
But I don't have the insight into what this is about.

63
00:04:19.399 --> 00:04:22.399
But let's go in and see this.

64
00:04:24.000 --> 00:04:31.000
Because if we look into the research sample, which is the first one,

65
00:04:31.200 --> 00:04:37.200
what we can see is that we use just normal Azure OpenAI,

66
00:04:37.399 --> 00:04:40.799
and we use a model like always.

67
00:04:41.000 --> 00:04:44.200
And then we go in and set some token limits,

68
00:04:44.399 --> 00:04:47.399
because it can actually do quite a lot of tokens,

69
00:04:47.600 --> 00:04:51.000
and has long-running background work.

70
00:04:51.200 --> 00:04:54.600
And then a bunch of instructions on how to do plan mode,

71
00:04:54.799 --> 00:04:56.399
how to do execution mode.

72
00:04:56.600 --> 00:04:59.600
I'll let you see this on your own.

73
00:05:01.399 --> 00:05:03.399
And then we make an agent,

74
00:05:03.600 --> 00:05:08.600
and we'll just use OpenAI client against Azure.

75
00:05:08.799 --> 00:05:11.399
And we're going to use the responses client,

76
00:05:11.600 --> 00:05:17.000
and then we use the special one, AsIChatClient without stored output.

77
00:05:17.200 --> 00:05:25.000
This turns off storing the conversations up in Azure OpenAI,

78
00:05:25.200 --> 00:05:29.799
and instead just keeps them in memory.

79
00:05:30.000 --> 00:05:34.000
And then it has this new AsHarnessAgent.

80
00:05:34.200 --> 00:05:38.000
And that might look like a fancy extra thing,

81
00:05:38.200 --> 00:05:42.200
but it's actually not too much what's happening inside this one.

82
00:05:43.200 --> 00:05:50.600
The only thing it really does is that it sets up the default compaction.

83
00:05:50.799 --> 00:05:54.399
So it's essentially a chat reducer that's built in.

84
00:05:54.600 --> 00:05:58.799
It sets up an in-memory chat history provider.

85
00:05:59.000 --> 00:06:01.200
And beyond that, it doesn't do too much.

86
00:06:01.399 --> 00:06:05.799
It does a little about function call invocation and message injection.

87
00:06:06.000 --> 00:06:10.200
So a bit of setup, but nothing major.
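
NOTE
Editor's sketch of the "chat reducer" idea mentioned above. This is not the
framework's default compaction. Message and HistoryCompactor are hypothetical
names, and "keep system messages plus the last N others" is an assumed, very
simple strategy used only to illustrate what compaction means.
```csharp
// Hypothetical sketch: none of these types come from Agent Framework.
using System;
using System.Collections.Generic;
using System.Linq;
public sealed record Message(string Role, string Text);
public static class HistoryCompactor
{
    // Keep every system message, plus the most recent keepLast other messages.
    public static List<Message> Compact(IReadOnlyList<Message> history, int keepLast)
    {
        if (keepLast < 0) throw new ArgumentOutOfRangeException(nameof(keepLast));
        var system = history.Where(m => m.Role == "system");
        var others = history.Where(m => m.Role != "system").ToList();
        var recent = others.Skip(Math.Max(0, others.Count - keepLast));
        return system.Concat(recent).ToList();
    }
}
```
A real reducer would more likely summarize old turns or count tokens rather
than messages; the point is only that the history sent to the model shrinks.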

88
00:06:10.399 --> 00:06:13.399
You can see there's almost no code here in the agent on this.

89
00:06:14.799 --> 00:06:19.399
So a bit strange that it's even made, but it is.

90
00:06:19.600 --> 00:06:22.399
And that's what's part of the new package,

91
00:06:22.600 --> 00:06:24.799
together with the options,

92
00:06:25.000 --> 00:06:28.600
where you can more or less set the same thing as a normal agent.

93
00:06:28.799 --> 00:06:37.799
So I'm a bit puzzled why they did this, but that's okay.

94
00:06:39.000 --> 00:06:43.399
And then in this case, we add a to-do provider.

95
00:06:43.600 --> 00:06:46.799
So a to-do provider is just a set of tools

96
00:06:47.000 --> 00:06:51.000
for adding and removing and getting remaining to-dos

97
00:06:51.200 --> 00:06:53.799
and keeping track of those to-dos.
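
NOTE
Editor's sketch of what a set of to-do tools could look like if you built it
yourself, as suggested in the next cue. TodoItem and TodoStore are hypothetical
names, not the types in the Harness namespace; the methods mirror the
add, remove, and get-remaining operations described above.
```csharp
// Hypothetical sketch: TodoStore and TodoItem are illustrative, not Agent Framework types.
using System;
using System.Collections.Generic;
using System.Linq;
public sealed record TodoItem(int Id, string Title, bool Done);
public sealed class TodoStore
{
    private readonly List<TodoItem> _items = new();
    private int _nextId = 1;
    // Add a new open item and return it so the caller (or the model) knows its id.
    public TodoItem Add(string title)
    {
        var item = new TodoItem(_nextId++, title, Done: false);
        _items.Add(item);
        return item;
    }
    // Mark an item as done; returns false when the id is unknown.
    public bool Complete(int id)
    {
        var index = _items.FindIndex(t => t.Id == id);
        if (index < 0) return false;
        _items[index] = _items[index] with { Done = true };
        return true;
    }
    // Remove an item entirely; returns false when the id is unknown.
    public bool Remove(int id) => _items.RemoveAll(t => t.Id == id) > 0;
    // The items the agent still has to work on.
    public IReadOnlyList<TodoItem> GetRemaining() => _items.Where(t => !t.Done).ToList();
}
```
Each public method could then be exposed to the model as one tool, which is
roughly what "just a set of tools" amounts to.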

98
00:06:54.000 --> 00:06:57.000
So we could have quite easily made this ourselves

99
00:06:57.200 --> 00:07:00.200
with various tools and stuff,

100
00:07:00.399 --> 00:07:02.799
the modes for plan or execute,

101
00:07:03.000 --> 00:07:10.399
and some memory system that saves some memory down into the system.

102
00:07:11.399 --> 00:07:15.399
Then it gives access to a web search tool

103
00:07:15.600 --> 00:07:22.600
and a self-built search tool that just uses download UI.

104
00:07:23.600 --> 00:07:28.600
Then it takes this tool approval,

105
00:07:28.799 --> 00:07:32.799
so it can stop asking again and again.
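
NOTE
Editor's sketch of the "approve this tool and don't ask me again" behavior.
ApprovalDecision and ToolApprovalCache are hypothetical names, and the askUser
callback stands in for whatever UI prompt the sample uses; this only shows the
remembering part, under those assumptions.
```csharp
// Hypothetical sketch: not the Agent Framework tool-approval API.
using System;
using System.Collections.Generic;
public enum ApprovalDecision { Deny, AllowOnce, AlwaysAllow }
public sealed class ToolApprovalCache
{
    private readonly HashSet<string> _alwaysAllowed = new(StringComparer.Ordinal);
    private readonly Func<string, ApprovalDecision> _askUser;
    public ToolApprovalCache(Func<string, ApprovalDecision> askUser) => _askUser = askUser;
    // Returns true when the tool call may proceed. The user is only prompted
    // for tools that have not previously been answered with AlwaysAllow.
    public bool IsApproved(string toolName)
    {
        if (_alwaysAllowed.Contains(toolName)) return true;
        var decision = _askUser(toolName);
        if (decision == ApprovalDecision.AlwaysAllow) _alwaysAllowed.Add(toolName);
        return decision != ApprovalDecision.Deny;
    }
}
```
With this shape, AllowOnce still prompts on the next call, while AlwaysAllow
silences further prompts for that tool name for the lifetime of the cache.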

106
00:07:33.000 --> 00:07:37.000
And then it sends it into all this extra code.

107
00:07:37.200 --> 00:07:40.200
And again, I won't go into this.

108
00:07:40.399 --> 00:07:46.399
Again, there's 4,000 lines of code that essentially builds up a UI.

109
00:07:47.399 --> 00:07:52.399
And that UI, if we run it here, looks like this.

110
00:07:53.799 --> 00:07:59.799
So we can write slash exit, we can write slash to-dos,

111
00:08:02.600 --> 00:08:06.600
and see we have no to-dos, and so on and so forth.

112
00:08:07.000 --> 00:08:10.000
And what we can do is we can, for example, say...

113
00:08:11.000 --> 00:08:14.000
So that's what topics should we research.

114
00:08:14.399 --> 00:08:18.399
First 10 people in space.

115
00:08:21.399 --> 00:08:24.399
And if we do this, it goes into the same mode

116
00:08:24.600 --> 00:08:28.000
as Copilot CLI or Codex,

117
00:08:28.200 --> 00:08:33.200
and begins to do various sets of work.

118
00:08:34.400 --> 00:08:38.400
So it's now calling that it needs to go into the right agent mode.

119
00:08:39.000 --> 00:08:41.599
It created three to-do items now,

120
00:08:41.599 --> 00:08:43.599
clarify the counts of people,

121
00:08:43.799 --> 00:08:45.799
verify 10 people,

122
00:08:46.000 --> 00:08:48.000
prepare concise thing,

123
00:08:48.200 --> 00:08:52.200
made a plan for that, so it used its file memory tool.

124
00:08:53.200 --> 00:08:55.200
And now it has made that,

125
00:08:55.400 --> 00:08:59.400
and we are being asked a few things.

126
00:09:00.400 --> 00:09:02.400
Should we have it in chronological order,

127
00:09:02.599 --> 00:09:06.599
including suborbital flights, only orbital flights, or something?

128
00:09:06.799 --> 00:09:08.799
Say only orbital flights.

129
00:09:11.599 --> 00:09:16.400
So these are just instructions of asking for clarifying questions,

130
00:09:16.599 --> 00:09:21.599
and all the work around getting this UI to work.

131
00:09:22.200 --> 00:09:24.599
And it works in some places.

132
00:09:24.799 --> 00:09:28.599
Here you can see it just goes off the screen with the response,

133
00:09:28.799 --> 00:09:30.400
and not wrapping.

134
00:09:30.599 --> 00:09:34.599
But now asking, are we okay with this plan,

135
00:09:34.799 --> 00:09:37.799
and do we want to execute it?

136
00:09:38.000 --> 00:09:40.000
And let's execute it.

137
00:09:41.000 --> 00:09:44.000
So now it's into execution mode,

138
00:09:44.200 --> 00:09:47.200
reading the plan,

139
00:09:47.400 --> 00:09:49.400
getting the to-do items,

140
00:09:49.599 --> 00:09:52.599
doing a bunch of web search calls now.

141
00:09:54.799 --> 00:09:58.799
So most of the code is actually building this GUI.

142
00:09:59.000 --> 00:10:04.000
And I will get a little later into why I think they built this.

143
00:10:05.599 --> 00:10:09.599
But it's going through the different work,

144
00:10:09.599 --> 00:10:13.599
and I've seen it sometimes stall, and I need to write continue,

145
00:10:13.799 --> 00:10:17.799
just like some of the coding agents as well.

146
00:10:18.000 --> 00:10:22.000
And it tries to keep track of how many tokens it uses.

147
00:10:22.200 --> 00:10:25.200
I don't think they are correct, those limits,

148
00:10:25.400 --> 00:10:27.400
but that's another thing.

149
00:10:27.599 --> 00:10:29.599
So it comes back with its answer,

150
00:10:29.799 --> 00:10:33.799
and who was the first in space, second in space, and so on,

151
00:10:33.799 --> 00:10:39.799
based on various sources that it found on the fly.

152
00:10:40.799 --> 00:10:46.799
So a little like GitHub Copilot or Claude Code.

153
00:10:49.000 --> 00:10:53.000
The second demo is not as impressive,

154
00:10:53.200 --> 00:10:59.400
but let's go in and see it. So in that one,

155
00:11:00.000 --> 00:11:05.000
we're essentially just seeing that it adds a tool

156
00:11:05.200 --> 00:11:09.200
that is a web search tool.

157
00:11:09.799 --> 00:11:12.799
And in a parent agent,

158
00:11:13.000 --> 00:11:16.000
it gets that it can have sub-agents,

159
00:11:16.200 --> 00:11:22.200
which is just a way of spawning new agents on the fly, if need be.
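
NOTE
Editor's sketch of the "one sub-agent per item" pattern described above.
SubAgentFanOut is a hypothetical name, runSubAgent is a placeholder for
whatever actually runs a sub-agent, and running them concurrently with
Task.WhenAll is an assumption; the video does not show how the sample
schedules them.
```csharp
// Hypothetical sketch: not the Agent Framework sub-agent API.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
public static class SubAgentFanOut
{
    // Starts one sub-agent run per distinct work item and collects the answers by item.
    public static async Task<IReadOnlyDictionary<string, string>> RunAsync(
        IEnumerable<string> workItems, Func<string, Task<string>> runSubAgent)
    {
        var list = workItems.Distinct().ToList();
        // Task.WhenAll returns the results in the same order as the tasks were supplied.
        var results = await Task.WhenAll(list.Select(runSubAgent));
        var output = new Dictionary<string, string>();
        for (var i = 0; i < list.Count; i++) output[list[i]] = results[i];
        return output;
    }
}
```
A parent agent would call something like this with the stock tickers as work
items and then merge the per-ticker answers into its own response.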

160
00:11:24.200 --> 00:11:28.200
So if we start this sample,

161
00:11:28.599 --> 00:11:32.599
it needs to get some stock tickers,

162
00:11:32.799 --> 00:11:36.799
some Microsoft, and let's take Atlassian.

163
00:11:37.000 --> 00:11:41.000
It's easy to remember that it's called TEAM.

164
00:11:41.200 --> 00:11:46.200
So now it will spin up two sub-agents, one for each of them,

165
00:11:46.400 --> 00:11:52.400
and do the work, finding the price,

166
00:11:52.599 --> 00:11:56.599
apparently at the end of last year.

167
00:11:57.599 --> 00:12:01.599
I've seen this fail a few times, not getting the right data, but...

168
00:12:05.799 --> 00:12:11.799
Again, there's all this extra code that keeps track of all the state,

169
00:12:12.000 --> 00:12:15.000
how far is everything in the...

170
00:12:15.200 --> 00:12:18.200
So it's a big, big orchestration task,

171
00:12:18.400 --> 00:12:22.400
those 4,000 lines of sample code.

172
00:12:23.000 --> 00:12:26.000
And yes, sometimes it just comes back and it couldn't do it.

173
00:12:26.200 --> 00:12:30.200
We can say try again, and sometimes it works.

174
00:12:36.000 --> 00:12:39.000
So it spins up two new sub-agents instead,

175
00:12:39.200 --> 00:12:43.200
and tries to figure it out.

176
00:12:53.400 --> 00:12:57.400
So now it found it for one of them, but not the other.

177
00:12:57.599 --> 00:13:01.599
It depends on what's going on, and it's, of course,

178
00:13:01.799 --> 00:13:05.799
based on how good the instructions are, how good the tools are.

179
00:13:08.799 --> 00:13:13.799
The final sample is where there's some sales data

180
00:13:14.000 --> 00:13:19.000
of some various things that's been sold,

181
00:13:19.200 --> 00:13:22.200
and who's the salesperson.

182
00:13:22.400 --> 00:13:27.400
And similar to the other ones, it's just a data analysis

183
00:13:27.599 --> 00:13:30.599
that will now have access to various files,

184
00:13:30.799 --> 00:13:33.799
and being able to grab those data.

185
00:13:34.000 --> 00:13:44.000
Oh, that's not the right one. So let's say,

186
00:13:44.200 --> 00:13:48.200
what is the most sold item in quantity?

187
00:14:00.400 --> 00:14:05.400
This one I've seen throw a lot of reasoning back,

188
00:14:06.400 --> 00:14:10.400
and suddenly in the middle of...

189
00:14:10.599 --> 00:14:14.599
The sentence just stopped, but let's see if it goes better this time.

190
00:14:14.799 --> 00:14:16.799
It actually went OK.

191
00:14:17.000 --> 00:14:20.000
So it found that the notebook pack is the one that sold the most.

192
00:14:20.200 --> 00:14:23.200
It did it wrong.

193
00:14:23.400 --> 00:14:27.400
Let's tell it that.

194
00:14:28.400 --> 00:14:32.400
Yeah, OK. It's still wrong.

195
00:14:45.400 --> 00:14:48.400
And it just goes off.

196
00:14:48.599 --> 00:14:51.599
Again, this is not the important part.

197
00:14:51.799 --> 00:14:56.799
The important part is that there is this harness around an LLM,

198
00:14:56.799 --> 00:15:00.799
an agent that's not very smart here.

199
00:15:01.000 --> 00:15:05.000
That's another matter of opinion.

200
00:15:05.200 --> 00:15:10.200
But that's what we have in three samples.

201
00:15:12.599 --> 00:15:15.599
And my big question is why.

202
00:15:15.799 --> 00:15:22.000
Why in the world are we getting these demos? They're cool.

203
00:15:22.200 --> 00:15:25.200
It's cool to see a bigger demo,

204
00:15:25.200 --> 00:15:28.200
but it's not like I can take those 4,000 lines of code

205
00:15:28.400 --> 00:15:33.400
and just copy over and ship it to my customer or anything.

206
00:15:33.599 --> 00:15:38.599
Or if I do, I would be stupid and not use a tool like Codex instead,

207
00:15:38.799 --> 00:15:42.799
which is a dedicated tool to do it.

208
00:15:43.000 --> 00:15:49.000
I think they made this sample to show that it's possible to do.

209
00:15:49.200 --> 00:15:54.200
And I think it's made for some upcoming demo

210
00:15:54.599 --> 00:15:58.599
that they need to do, perhaps during Microsoft Build in June.

211
00:15:58.799 --> 00:16:02.799
I'm recording this in the middle of May.

212
00:16:03.000 --> 00:16:06.000
I think this is their:

213
00:16:06.200 --> 00:16:11.200
Look how cool and how advanced you can build something

214
00:16:11.400 --> 00:16:14.400
in Agent Framework,

215
00:16:14.599 --> 00:16:18.599
because just showing a chat, just showing tools and so on,

216
00:16:18.799 --> 00:16:21.799
is not really any news.

217
00:16:21.799 --> 00:16:27.799
But showing that it's actually a very, very cool feature tool

218
00:16:28.000 --> 00:16:32.000
that you can build around.

219
00:16:32.799 --> 00:16:37.799
And I actually had this idea on my to-do list of videos as well.

220
00:16:38.000 --> 00:16:42.000
I wanted to build a coding agent myself as well,

221
00:16:42.200 --> 00:16:45.200
and this is what essentially they have done.

222
00:16:45.400 --> 00:16:50.400
And also shown that it's not that much about AI

223
00:16:50.799 --> 00:16:53.799
and much more about building a good GUI

224
00:16:54.000 --> 00:16:57.000
and building the right tools for an LLM

225
00:16:57.200 --> 00:17:00.200
and keeping the orchestration right.

226
00:17:00.400 --> 00:17:06.400
And as we saw, it failed a bunch in good instructions and so on,

227
00:17:06.599 --> 00:17:10.599
so those demos will probably be iterated on

228
00:17:10.800 --> 00:17:12.800
until they need to do a demo on them.

229
00:17:13.000 --> 00:17:16.000
That's at least my thought process about this,

230
00:17:16.000 --> 00:17:23.000
because it's not really showing us how to use Agent Framework,

231
00:17:23.199 --> 00:17:26.199
nor how to use the various features.

232
00:17:26.400 --> 00:17:30.400
It's just they have built a bunch of tools just like we could.

233
00:17:30.599 --> 00:17:35.599
I had no doubt in my mind that it was possible

234
00:17:35.800 --> 00:17:39.599
to build something like GitHub Copilot using Agent Framework.

235
00:17:39.800 --> 00:17:41.800
Of course, it could be done,

236
00:17:42.000 --> 00:17:45.400
but that it's a bunch of extra code that needs to go into it.

237
00:17:45.400 --> 00:17:50.400
So again, in here,

238
00:17:50.599 --> 00:17:55.000
all the things that have nothing to do with the Agent Framework

239
00:17:55.199 --> 00:17:57.199
is 4,000 lines of code here,

240
00:17:57.400 --> 00:18:00.400
and then just a few lines of code down here

241
00:18:00.599 --> 00:18:04.599
for the various out-of-the-box tools and so on.

242
00:18:04.800 --> 00:18:09.800
So I don't really feel that it fits

243
00:18:10.000 --> 00:18:12.400
inside a Microsoft Agent Framework.

244
00:18:12.599 --> 00:18:15.199
I could see it if they just had an example

245
00:18:15.400 --> 00:18:19.000
and not had all of this and just kept it here,

246
00:18:19.199 --> 00:18:21.800
and had this kept over here.

247
00:18:22.000 --> 00:18:28.599
But why it's inside being maintained is a bit the thing,

248
00:18:28.800 --> 00:18:31.400
because if you really want to do file memory,

249
00:18:31.599 --> 00:18:33.599
you probably don't want to do file memory.

250
00:18:33.800 --> 00:18:40.800
You want to do memory in a database or a vector store or whatever.

251
00:18:41.000 --> 00:18:43.800
So why this is now inside?

252
00:18:44.000 --> 00:18:47.199
It's experimental, so they can take it away, of course,

253
00:18:47.400 --> 00:18:52.199
but it really puzzles me why we are seeing this.

254
00:18:52.400 --> 00:18:57.400
So I can't really see myself using this that much

255
00:18:57.599 --> 00:18:58.800
when there are dedicated tools.

256
00:18:59.000 --> 00:19:05.400
Had Codex, Claude Code, and all these things not existed,

257
00:19:05.599 --> 00:19:11.400
of course, it would make sense for us to build them, but they do.

258
00:19:11.599 --> 00:19:15.400
But again, they have them.

259
00:19:15.599 --> 00:19:17.000
We can try them out.

260
00:19:17.199 --> 00:19:18.800
If you need to build something like this,

261
00:19:19.000 --> 00:19:21.599
you definitely need to go in and study exactly

262
00:19:21.800 --> 00:19:24.800
how they did the small little details.

263
00:19:25.000 --> 00:19:29.800
But in normal AI that you're probably going to make

264
00:19:30.000 --> 00:19:33.400
with Agent Framework, this is not really something

265
00:19:33.599 --> 00:19:38.599
we can use for too much, in my opinion at least.

266
00:19:39.400 --> 00:19:44.000
But if you think differently, please let me know in the comments.

267
00:19:44.199 --> 00:19:46.800
But that's everything for me this time.

268
00:19:47.000 --> 00:19:49.000
See you in the next one.