WEBVTT

1
00:00:00.000 --> 00:00:05.000
Hi, and welcome to this ARN CSTR video on the Microsoft Agent Framework.

2
00:00:05.500 --> 00:00:11.000
In this video, we're going to look into the CodeAct and Hyperlight integration.

3
00:00:12.000 --> 00:00:16.500
So, you might ask, what is CodeAct? What is Hyperlight?

4
00:00:17.500 --> 00:00:22.500
Well, it also has to do with tool calling,

5
00:00:23.000 --> 00:00:25.000
and tool calling in a different way.

6
00:00:26.000 --> 00:00:31.500
Because when you have traditional tool calling in an LLM,

7
00:00:32.000 --> 00:00:36.000
we, of course, have our prompt and our tool definitions,

8
00:00:36.500 --> 00:00:40.500
and then we ask the LLM to do some kind of task in that prompt.

9
00:00:41.000 --> 00:00:45.000
And it might need to use three different tools in order to do that.

10
00:00:45.500 --> 00:00:51.000
And the way tool calling is working is, of course, that the LLM gets the request

11
00:00:51.000 --> 00:00:54.000
and replies with how to call the tool,

12
00:00:54.500 --> 00:00:58.000
then we call the tool, go back, call the next tool, go back, and so on.

13
00:00:58.000 --> 00:01:00.500
So, we're not only doing one request and response,

14
00:01:00.500 --> 00:01:05.000
but for every single tool call, we're doing multiple requests.

15
00:01:05.500 --> 00:01:10.000
And that means that we are wasting tokens and time

16
00:01:10.000 --> 00:01:13.000
because there is this back and forth all the time.

17
00:01:14.500 --> 00:01:20.000
CodeAct is yet another way of trying to do something similar,

18
00:01:20.500 --> 00:01:23.500
but instead of sending all these different tool calls

19
00:01:23.500 --> 00:01:28.500
and giving it all the definitions, which can be token costly,

20
00:01:29.000 --> 00:01:32.000
CodeAct wants to expose just one tool.

21
00:01:32.000 --> 00:01:35.500
One tool that can execute code.

22
00:01:36.500 --> 00:01:40.500
So, instead of all our tools in the definition,

23
00:01:41.000 --> 00:01:46.000
we just give an execute code tool.

24
00:01:47.000 --> 00:01:52.000
The LLM gets that and builds up a piece of code

25
00:01:52.500 --> 00:01:55.500
to do the same task as we do up here,

26
00:01:57.000 --> 00:01:59.000
but it only has this tool,

27
00:01:59.500 --> 00:02:05.000
and it knows that that tool has various underlying tools to do this.

28
00:02:05.500 --> 00:02:11.500
So, we still write normal tool code in C# and so on,

29
00:02:12.000 --> 00:02:16.000
but now it just happens to be a piece of code

30
00:02:16.000 --> 00:02:21.500
that executes those tools, instead of the LLM coming back and saying,

31
00:02:21.500 --> 00:02:24.000
I need to call this, I need to call this.

32
00:02:24.000 --> 00:02:28.000
So, it's more orchestrated: one call to the LLM,

33
00:02:28.000 --> 00:02:34.500
and we get back how to do this in three steps of calling these tools.

34
00:02:35.000 --> 00:02:39.500
The execution tool then executes the code,

35
00:02:40.000 --> 00:02:43.500
gives the output back to the LLM, and then we get the result.

36
00:02:43.500 --> 00:02:46.000
So, we save some tool calls,

37
00:02:46.000 --> 00:02:49.500
and we have a more orchestrated way of doing this.

38
00:02:51.000 --> 00:02:54.000
But, of course, since we are executing code now,

39
00:02:54.500 --> 00:03:00.500
it is a bit insecure, so we need to have that code run in a sandbox.

40
00:03:00.500 --> 00:03:05.500
So, we have seen code interpreter videos in the past

41
00:03:05.500 --> 00:03:14.500
where we let the LLM use Python code up in the cloud

42
00:03:14.500 --> 00:03:16.500
if you use the Responses API.

43
00:03:18.500 --> 00:03:22.500
But, of course, that is costly because someone needs to run that code.

44
00:03:23.000 --> 00:03:27.500
This code can run locally, but again, since it's insecure,

45
00:03:27.500 --> 00:03:30.500
we need some way to sandbox that code.

46
00:03:31.500 --> 00:03:34.000
And that is where the Hyperlight part comes in.

47
00:03:34.000 --> 00:03:37.500
So, Hyperlight is a set of NuGet packages

48
00:03:37.500 --> 00:03:43.500
that allows for code execution in a sandbox,

49
00:03:43.500 --> 00:03:46.500
not using C#, that's not part of the spec,

50
00:03:46.500 --> 00:03:49.000
but it can use JavaScript or Python.

51
00:03:50.000 --> 00:03:52.500
So, this is the setup.

52
00:03:52.500 --> 00:03:55.500
Let's see a few examples.

53
00:03:55.500 --> 00:03:59.500
So, in my sample repo, I have something called CodeAct here,

54
00:04:00.500 --> 00:04:02.500
and in that I put in an introduction.

55
00:04:02.500 --> 00:04:05.500
We might make more videos going forward.

56
00:04:06.500 --> 00:04:10.500
But let's just set a breakpoint, and let's run the code here.

57
00:04:15.500 --> 00:04:18.500
So, it's just getting started.

58
00:04:19.500 --> 00:04:22.500
And let me move this a bit.

59
00:04:22.500 --> 00:04:24.000
Getting started.

60
00:04:24.000 --> 00:04:28.500
And let me move the console in here as well so we can see.

61
00:04:31.000 --> 00:04:32.000
There we go.

62
00:04:32.000 --> 00:04:36.000
So, we are just newing up a new client, like always.

63
00:04:36.000 --> 00:04:40.500
In this case, we are just using GPT 5.4 Mini.

64
00:04:41.500 --> 00:04:45.500
And then let's try to use CodeAct with JavaScript,

65
00:04:45.500 --> 00:04:49.000
meaning Hyperlight JavaScript,

66
00:04:49.000 --> 00:04:54.000
because CodeAct is just the way of executing the code,

67
00:04:54.000 --> 00:04:58.000
and the JavaScript is the sandboxed environment.

68
00:04:59.000 --> 00:05:02.000
So, we're going to make an agent like normal,

69
00:05:02.000 --> 00:05:08.000
but what we're going to do is add one AI context provider that is new,

70
00:05:08.000 --> 00:05:12.000
and for this, you need to go in and add

71
00:05:12.000 --> 00:05:16.500
the new Microsoft.Agents.AI.Hyperlight NuGet package

72
00:05:16.500 --> 00:05:19.500
in order to get access to this.

73
00:05:22.000 --> 00:05:24.500
So, we're simply just saying,

74
00:05:24.500 --> 00:05:28.500
always use execute code for calculations,

75
00:05:28.500 --> 00:05:31.500
write the code in JavaScript, show the result,

76
00:05:31.500 --> 00:05:35.500
and in my case, I've also said, show the code you used to do it.

77
00:05:37.000 --> 00:05:40.000
And then we have our question here.

78
00:05:40.000 --> 00:05:41.500
Let's get rid of this.

79
00:05:42.000 --> 00:05:47.000
We want to calculate the 20th Fibonacci number,

80
00:05:47.000 --> 00:05:51.000
divide it by 42, and subtract 30,

81
00:05:51.000 --> 00:05:56.000
just so it has a little more to do than just Fibonacci.

82
00:05:56.000 --> 00:06:01.500
And in some cases, this can actually be done by an LLM without any tools,

83
00:06:01.500 --> 00:06:03.500
but that's not the point here.

84
00:06:04.500 --> 00:06:10.500
But we make our agent like normal, and we call the agent.

85
00:06:12.000 --> 00:06:15.000
And what we'll see once it's done

86
00:06:19.500 --> 00:06:24.500
is that it ran,

87
00:06:24.500 --> 00:06:28.500
and it did a function call where it said execute code,

88
00:06:28.500 --> 00:06:32.500
and then it sent in what code it wanted to execute,

89
00:06:33.500 --> 00:06:38.000
and it got a result back, and we can see the standard out here

90
00:06:38.000 --> 00:06:41.000
is the number that we need over here.

91
00:06:41.000 --> 00:06:44.000
It's packed into some JSON.

92
00:06:45.500 --> 00:06:48.500
So what we can see over here is that it figured out

93
00:06:48.500 --> 00:06:53.000
what the 20th Fibonacci number was,

94
00:06:53.000 --> 00:06:57.000
made the calculation, and it gave us back the right result.

95
00:06:58.000 --> 00:07:00.500
And this was not just generated by the LLM.

96
00:07:00.500 --> 00:07:05.000
It actually executed this piece of JavaScript in order to do it,

97
00:07:05.500 --> 00:07:10.000
where it made some JavaScript code

98
00:07:10.000 --> 00:07:17.000
and did the divide by 42 and minus 30, and then logged out what that was,

99
00:07:17.000 --> 00:07:20.500
and that was what we saw in the output that it then used.
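
NOTE
A minimal sketch of the kind of JavaScript the model might generate for this prompt. The script from the demo is not reproduced in this transcript, so the function name and structure are assumptions, and it takes F(1) = F(2) = 1 as the start of the sequence.
```javascript
// Iteratively compute the n-th Fibonacci number, with F(1) = 1 and F(2) = 1.
function fibonacci(n) {
  let previous = 0;
  let current = 1;
  for (let i = 1; i < n; i++) {
    [previous, current] = [current, previous + current];
  }
  return current;
}
// The 20th Fibonacci number, divided by 42, minus 30, as in the prompt.
const fib20 = fibonacci(20);
const result = fib20 / 42 - 30;
// The sandbox captures this log line as standard out for the LLM to read.
console.log(result);
```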

100
00:07:22.000 --> 00:07:27.000
So this is just like code interpreter up in the cloud,

101
00:07:27.000 --> 00:07:29.000
but you can do it locally now.

102
00:07:30.000 --> 00:07:33.000
Cool. But we talked about tools,

103
00:07:33.000 --> 00:07:36.500
so let's see a little more with tools here.

104
00:07:37.500 --> 00:07:43.500
So we do the same as the CodeAct here,

105
00:07:43.500 --> 00:07:48.000
but because we want to add tools, we are just making some options here.

106
00:07:48.500 --> 00:07:52.500
So we are making the options, and we want to use JavaScript,

107
00:07:52.500 --> 00:07:55.000
and we then give it a tool.

108
00:07:55.500 --> 00:07:59.500
And this tool is just like we have done so many times before.

109
00:07:59.500 --> 00:08:03.500
It's sunny and 90 degrees, so I can even set a breakpoint here.

110
00:08:05.000 --> 00:08:08.000
And of course, we would have multiple tools.

111
00:08:08.000 --> 00:08:13.500
Just wrapping one tool like this might not be the smart thing,

112
00:08:13.500 --> 00:08:18.500
but in this case, we are just doing it that way.

113
00:08:19.500 --> 00:08:22.500
Then we again make an agent,

114
00:08:22.500 --> 00:08:29.500
and we give it our Hyperlight CodeAct provider with our options this time,

115
00:08:29.500 --> 00:08:35.000
so it actually knows that it has a tool inside it as well.

116
00:08:37.000 --> 00:08:40.500
So if we do it like that, we can, for example,

117
00:08:40.500 --> 00:08:42.500
ask, what is the weather like in Paris?

118
00:08:43.500 --> 00:08:48.500
And in normal scenarios, we would just have called that tool.

119
00:08:48.500 --> 00:08:50.500
Now you can see it's calling the tool,

120
00:08:50.500 --> 00:08:54.500
but let me show you what called the tool in one second.

121
00:08:57.500 --> 00:08:58.500
And...

122
00:09:01.000 --> 00:09:07.000
Oh, I'm running too far here, but let's just see the script here.

123
00:09:07.000 --> 00:09:11.000
So what it did was it came back, of course, with the value,

124
00:09:11.000 --> 00:09:16.500
but as you can see, it was a piece of JavaScript that declared a constant, city,

125
00:09:16.500 --> 00:09:21.500
and then it made a call tool call, which is part of the CodeAct specification,

126
00:09:22.500 --> 00:09:25.000
where it called the getWeatherForCity,

127
00:09:25.000 --> 00:09:27.500
which was the one that we hit down here.

128
00:09:27.500 --> 00:09:32.500
So our JavaScript actually called back,

129
00:09:32.500 --> 00:09:36.000
told the C# side that it needed to call that tool,

130
00:09:36.000 --> 00:09:38.500
and got the results back for us.
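
NOTE
A hedged sketch of the shape of script described here. The call_tool function below is a hypothetical stand-in for the bridge the sandbox provides back to the C# host; its real name and signature may differ, and the stubbed tool only imitates the demo's fixed weather reply.
```javascript
// Hypothetical host-side tool table; in the demo the tool lives in C#.
const hostTools = {
  // Stub modeled on the demo tool, which always reports the same weather.
  getWeatherForCity: ({ city }) => "It's sunny and 90 degrees",
};
// Hypothetical bridge: look up the named tool on the host and invoke it.
function call_tool(name, args) {
  const tool = hostTools[name];
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool(args);
}
// The part the model would write: a constant, then one call through the bridge.
const city = "Paris";
const weather = call_tool("getWeatherForCity", { city });
console.log(weather);
```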

131
00:09:39.500 --> 00:09:43.000
And you can imagine if I had three tools,

132
00:09:43.000 --> 00:09:48.000
then within that one script it would have been: call this, get the result,

133
00:09:48.000 --> 00:09:51.500
do something with it, get the result, do something with it,

134
00:09:51.500 --> 00:09:56.000
and finally produce one output instead of having the multiple tool calls.
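
NOTE
To illustrate the three-tool case, a sketch under stated assumptions: getCityForUser and formatReport are invented for illustration (only getWeatherForCity appears in the demo), and call_tool is again a hypothetical stand-in for the sandbox bridge. The counter shows three host tool invocations happening inside a single generated script.
```javascript
// Counts how many times the script reaches back to the host.
let hostCalls = 0;
// Hypothetical host-side tools, stubbed with fixed data.
const hostTools = {
  getCityForUser: ({ user }) => (user === "alice" ? "Paris" : "Copenhagen"),
  getWeatherForCity: ({ city }) => ({ city, summary: "sunny", degrees: 90 }),
  formatReport: ({ city, summary, degrees }) => `${city}: ${summary}, ${degrees} degrees`,
};
// Hypothetical bridge from the sandboxed script to the host tools.
function call_tool(name, args) {
  hostCalls += 1;
  return hostTools[name](args);
}
// One script chains all three tools and emits a single output.
const city = call_tool("getCityForUser", { user: "alice" });
const weather = call_tool("getWeatherForCity", { city });
const report = call_tool("formatReport", weather);
console.log(report);
```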

135
00:09:58.500 --> 00:10:03.500
So that is the JavaScript part.

136
00:10:03.500 --> 00:10:06.500
Then it already ran this one,

137
00:10:06.500 --> 00:10:10.500
where you can see that if we want to not just do JavaScript,

138
00:10:10.500 --> 00:10:14.500
which is out of the box and default because it's very easy

139
00:10:14.500 --> 00:10:19.000
to make a sandbox for that without us needing to do anything special.

140
00:10:21.500 --> 00:10:25.500
If we want to use Python instead,

141
00:10:25.500 --> 00:10:28.500
we need to have some kind of Python sandbox,

142
00:10:28.500 --> 00:10:32.500
in this case an AOT, but it can also be a Wasm sandbox.

143
00:10:33.500 --> 00:10:37.500
And I didn't know how to make this, I'm not smart with Python,

144
00:10:37.500 --> 00:10:42.500
but I just had my Codex AI figure it out,

145
00:10:42.500 --> 00:10:44.500
and it gave me this file.

146
00:10:44.500 --> 00:10:48.500
In my case, there's this extra file, it's 40 megabytes,

147
00:10:48.500 --> 00:10:52.500
so it's kind of big, and I can't guarantee that this works

148
00:10:52.500 --> 00:10:55.500
on every machine you have, so if it doesn't work

149
00:10:55.500 --> 00:10:59.500
when you run this, go in and check out.

150
00:11:00.500 --> 00:11:04.500
Just ask an AI, make this file for me.

151
00:11:06.500 --> 00:11:10.000
But it's essentially Python packed up in a sandbox,

152
00:11:10.000 --> 00:11:12.000
where we can run code.

153
00:11:12.000 --> 00:11:15.000
And then we do exactly the same as before,

154
00:11:15.000 --> 00:11:18.000
the only difference is that we're now using Python,

155
00:11:18.000 --> 00:11:21.500
and telling it to use Python to write the code.

156
00:11:21.500 --> 00:11:26.000
But it also did exactly the same, and we can see that over here,

157
00:11:26.000 --> 00:11:31.000
where it shows the Python it used instead of some JavaScript it used.

158
00:11:33.500 --> 00:11:38.500
And then we have, if we don't want to have these AI context providers,

159
00:11:38.500 --> 00:11:44.000
we can actually just get the raw tool and the raw code.

160
00:11:45.000 --> 00:11:51.500
So we do that by just making the same sandbox here,

161
00:11:51.500 --> 00:11:57.000
but then we can say Hyperlight execute code function with those options.

162
00:11:57.000 --> 00:12:01.000
And what that produces is actually a tool,

163
00:12:01.000 --> 00:12:05.000
so a tool like normal, where we have a name for it,

164
00:12:05.000 --> 00:12:11.500
and a description, and inside it will be able to do

165
00:12:11.500 --> 00:12:15.500
the call to Hyperlight.

166
00:12:15.500 --> 00:12:19.500
And in order for it to understand what that is,

167
00:12:19.500 --> 00:12:23.500
you can also, from that tool, say, build instructions.

168
00:12:24.000 --> 00:12:27.000
So that tool can get back and say,

169
00:12:27.000 --> 00:12:38.500
and this is actually what happens inside the AI context provider.

170
00:12:38.500 --> 00:12:43.500
So let's quickly have a look at what instructions it generates on the fly.

171
00:12:43.500 --> 00:12:48.500
You can execute code in a secure sandbox

172
00:12:48.500 --> 00:12:50.500
by calling the execute code tool.

173
00:12:50.500 --> 00:12:53.500
Any tools listed in the tools description

174
00:12:53.500 --> 00:12:58.500
are only accessible within the sandbox via call tool,

175
00:12:58.500 --> 00:13:02.500
as we saw up here, the call tool here.

176
00:13:02.500 --> 00:13:04.500
They cannot be invoked directly,

177
00:13:04.500 --> 00:13:11.000
so it doesn't by mistake begin to call some tools that are not there.

178
00:13:11.000 --> 00:13:14.000
So this is the way we can do that,

179
00:13:14.000 --> 00:13:18.000
and again, the tool is up here,

180
00:13:18.000 --> 00:13:23.000
and I think it's here in the description.

181
00:13:25.500 --> 00:13:28.500
If we had given it any extra tools here,

182
00:13:28.500 --> 00:13:31.500
they would have been listed in this: that inside this tool,

183
00:13:31.500 --> 00:13:33.500
you can call this, this, and this.

184
00:13:35.500 --> 00:13:40.500
So that way, we can just put our instructions in manually,

185
00:13:40.500 --> 00:13:43.500
and we can put in the tools manually

186
00:13:43.500 --> 00:13:46.000
instead of using the AI context provider.

187
00:13:46.000 --> 00:13:51.000
Probably easier to do this and let it inject on the fly,

188
00:13:51.000 --> 00:13:54.000
but if you, for some reason, know up front

189
00:13:54.000 --> 00:13:57.000
that you need the tool, don't need the tool, and so on,

190
00:13:57.000 --> 00:14:01.000
it is possible to get them by tool calling.

191
00:14:03.000 --> 00:14:05.500
And then it will run like normal.

192
00:14:06.500 --> 00:14:10.500
So pretty nice feature

193
00:14:10.500 --> 00:14:15.500
that is able to be sandboxed

194
00:14:15.500 --> 00:14:18.000
and save you a lot of tool calling

195
00:14:18.000 --> 00:14:21.500
if you have a lot of chaining of tools.

196
00:14:23.000 --> 00:14:25.500
So that is actually everything.

197
00:14:25.500 --> 00:14:27.500
See you on the next one.