WEBVTT

00:00.160 --> 00:00.920
Hey there.

00:00.960 --> 00:01.720
Eden here.

00:01.760 --> 00:03.280
Remember this diagram?

00:03.560 --> 00:10.960
So in this video we're going to be integrating tool calling into our react agent architecture.

00:11.280 --> 00:17.560
So instead of leveraging React Prompt we're going to be leveraging tool calling capabilities of large

00:17.560 --> 00:18.520
language models.

00:18.520 --> 00:22.720
And we're going to see how it's going to make our agent much more reliable.

00:22.720 --> 00:25.360
And it's going to save us a lot of boilerplate code.

00:25.560 --> 00:29.960
And it's a much better evolution of this react agent.

00:30.160 --> 00:36.400
Now, even though we are going to be using function calling and not using react from, this is still

00:36.440 --> 00:43.040
referred to as react agent because the main algorithm here leverages reasoning and acting.

00:43.080 --> 00:48.400
However, this time the reasoning is shifted into function calling instead of react prompt.

00:48.440 --> 00:55.080
So my goal is that by the end of this video, you'll really understand why function calling is more

00:55.080 --> 00:59.840
useful and why it's now the go to when implementing AI agents.

01:00.400 --> 01:05.410
So for this video I opened cursor IDE with our project here.

01:05.410 --> 01:09.970
So here we have the react algorithm which we implemented in previous videos.

01:10.490 --> 01:20.330
And what I'm going to do is ask cursor to replace the react prompt and use and leverage the tool calling

01:20.330 --> 01:22.850
capabilities of modern llms.

01:22.890 --> 01:27.730
And in case you're wondering, I have a couple of reasons for coding this video.

01:28.170 --> 01:30.610
The first reason is that I am lazy.

01:30.890 --> 01:35.410
Just kidding, I am really lazy, but this is not the reason why I'm coding it.

01:35.530 --> 01:42.490
So the reason why I want to code it is that I want to show you the latest and greatest tech tools to

01:42.530 --> 01:48.450
show you the diff, because cursor is going to output us a very beautiful diff where we can see what

01:48.450 --> 01:49.850
exactly the changes were.

01:50.010 --> 01:56.530
And by doing so, we can really see how simpler tool calling is and a tool calling agent, comparing

01:56.530 --> 02:00.650
it to an original react agent with the react prompt.

02:00.690 --> 02:05.820
So this is the main reason why I do want to code it so we can see the diff.

02:05.820 --> 02:11.180
And of course I will be attaching the code in the repository by the end of this video, so you can find

02:11.180 --> 02:12.700
it in the course's resources.

02:12.860 --> 02:15.260
So if you want you can simply copy and paste it.

02:15.260 --> 02:21.300
Or if you have an AI coding editor like Cursor or Cloud Code, you can write a prompt like I'm writing

02:21.300 --> 02:23.940
or a similar prompt, and it should work as fine.

02:24.140 --> 02:27.620
All right, so let me go and write the prompt to cursor.

02:28.260 --> 02:35.340
I want you to use now dot bind tools instead of the react prompt.

02:37.740 --> 02:43.580
So let me now go and fast forward it because it is going to run for a while here.

02:45.580 --> 02:49.020
So cursor now is going to read all our files.

02:49.020 --> 02:53.580
It's going to do some rag by the way, and we're going to talk about it in the course in the rest of

02:53.580 --> 02:54.060
the course.

02:54.380 --> 02:57.260
And let's start to take a look at the diff here.

02:57.580 --> 03:00.980
So we can see that it removed the union typing.

03:00.980 --> 03:02.460
This is not that interesting.

03:02.700 --> 03:07.830
But what is interesting it removed the format log to STR here.

03:07.870 --> 03:13.870
Now the reason why I did that because with function calling we don't really get the reasoning.

03:13.870 --> 03:20.190
So we don't really get why the agent decided which tool to call, which is a trade off, but it's totally

03:20.190 --> 03:24.110
worth it and it's much more reliable using function calling.

03:24.550 --> 03:32.470
Next, it removed the react single input output parser, and I remind you this class is implementing

03:32.510 --> 03:34.670
a bunch of regular expressions.

03:35.630 --> 03:42.990
So with the function calling features of large language models, we don't need to parse the text generation

03:42.990 --> 03:48.390
of the LM to look with regular expression for the action.

03:48.550 --> 03:56.150
Because we get the tool to invoke, we get it in a special key in the response, which is called tool

03:56.150 --> 03:56.590
calls.

03:56.630 --> 04:01.990
And in that key the value is going to be a beautiful JSON, which is very easy to access.

04:02.190 --> 04:04.350
And we're going to see everything very, very soon.

04:04.350 --> 04:06.310
So please, please don't worry about it.

04:06.630 --> 04:14.630
Now we also can remove the agent action class and the agent class, which helped us to organize our

04:14.830 --> 04:20.910
code and help us to distinguish whether we need to continue to perform a tool call and to actually run

04:20.910 --> 04:23.550
the code, or we need to finish here.

04:23.550 --> 04:28.550
So this is in case we find this agent finish string again with regular expression here.

04:28.550 --> 04:36.150
So the reason why we do not need those um indicators, whether to execute a tool call or whether to

04:36.190 --> 04:42.470
finish and output the response of the LM is because we are going to be deriving this from the tool calls

04:42.470 --> 04:44.510
argument from the LM response.

04:44.830 --> 04:52.030
So if there are tool calls in that special key argument in the LM response, then we need to perform

04:52.030 --> 04:56.270
the tool call because it's going to have the information about the tool call and if it's empty.

04:56.310 --> 05:00.950
So this means that we can simply return the response of the LM to the user.

05:00.950 --> 05:06.030
And again once you see everything running and once you'll be seeing the traces, everything will be

05:06.030 --> 05:06.790
much clearer.

05:07.110 --> 05:07.790
Alrighty.

05:07.790 --> 05:14.200
So now we can also remove the prompt template and the prompt template used to hold the react prompt,

05:14.240 --> 05:16.760
leveraging the LM as the reasoning engine.

05:16.760 --> 05:19.400
And, you know, writing this prompt.

05:19.400 --> 05:25.400
So we don't need this anymore because what we're doing is shifting now the responsibility of selecting

05:25.400 --> 05:32.720
the correct tool to the LM provider, which is OpenAI, Google or anthropic for the LM vendor, which

05:32.720 --> 05:34.280
is implementing tool calling.

05:34.520 --> 05:34.960
All right.

05:34.960 --> 05:42.440
Here we're also removing the render text description, which we used in order to populate the react

05:42.440 --> 05:44.360
prompt with the description of the tools.

05:44.520 --> 05:46.000
So we don't need this anymore.

05:47.200 --> 05:49.560
So now let's see the other divs.

05:49.960 --> 05:50.360
All right.

05:50.360 --> 05:52.880
Let me go down a bit in the code here.

05:55.400 --> 05:56.120
Cool.

05:56.120 --> 06:01.360
So we can see it changed the Hello World to Hello World with linking tools.

06:01.680 --> 06:06.120
And right now you can see that we're removing the react prompt.

06:06.400 --> 06:12.050
And this prompt has a very special place in my heart because it's really what started it all.

06:12.210 --> 06:14.850
And this was the beginning of the agent era.

06:15.170 --> 06:21.170
And by the way, you know, one of the most underrated joys of being a developer is deleting code.

06:21.450 --> 06:27.410
So there's just something about hitting that backspace and cleaning things up that gives you a dopamine

06:27.410 --> 06:27.930
spike.

06:28.570 --> 06:30.130
So it's very satisfying.

06:30.330 --> 06:30.850
All right.

06:30.850 --> 06:37.210
So now you can see we deleted the entire react prompt the template and the prompt itself.

06:37.570 --> 06:46.170
And notice that from the LM we also deleted the stop arguments because we used to rely on that to stop

06:46.170 --> 06:49.050
the LM from hallucinating tool calls.

06:49.210 --> 06:53.930
And we really relied on the LM adhering to the react prompt for that.

06:54.050 --> 06:58.250
So it was also a bit problematic and a little bit flaky.

06:58.450 --> 07:02.730
So we are now removing it because with two calls we don't need this.

07:02.770 --> 07:06.250
The tool simply is going to be in the tool calls key.

07:06.730 --> 07:08.970
Let's go now to the rest of the file.

07:08.970 --> 07:16.220
And we can see we are here removing the intermediate steps which held all of the observations, which

07:16.220 --> 07:19.300
held all of the result of the tool executions.

07:19.300 --> 07:21.820
And that was our agent scratchpad.

07:22.140 --> 07:28.820
So we don't really need this now because we're going to be relying on something which is called a tool

07:28.820 --> 07:31.500
message, which encompasses exactly that.

07:31.540 --> 07:34.020
It holds the result of a tool execution.

07:34.260 --> 07:42.020
And LMS these days are more than capable and are actually fine tuned in order to be able to digest this

07:42.020 --> 07:43.020
very, very well.

07:43.020 --> 07:45.260
And I'll be showing you it very, very soon.

07:45.260 --> 07:52.220
When we run everything now, we also delete the agent variable, which was a runnable instance, which

07:52.220 --> 07:55.540
was the react prompt piped to the LM.

07:55.540 --> 08:03.980
And then we use the output parser in order to parse the LMS response and use that as a reasoning engine.

08:03.980 --> 08:05.580
So we don't need all of this.

08:05.620 --> 08:10.820
We can simply invoke our query, and the LM is going to respond with the tool calls.

08:10.820 --> 08:13.380
So this is now going to be deleted.

08:13.750 --> 08:14.270
All right.

08:14.270 --> 08:16.990
So now let's talk about the logic here.

08:17.430 --> 08:23.310
So first we want now to bind the LM with our tools here.

08:23.390 --> 08:28.030
So for that we're using now LM bind the built in method.

08:28.030 --> 08:30.270
And here we're going to provide the link chain tools.

08:30.270 --> 08:31.630
We want to give it here.

08:31.630 --> 08:37.710
So this is going to give the LM all of the descriptions and all of the interfaces of the tool we want

08:37.710 --> 08:39.270
to equip our LM with.

08:39.470 --> 08:47.390
And now the LM is going to be able to make the decision of making a tool call to those tools.

08:47.430 --> 08:53.790
Now I remind you, the LM is going to decide only when and with which arguments we need to invoke these

08:53.790 --> 08:54.150
tools.

08:54.150 --> 08:55.750
It doesn't really invoke it.

08:55.790 --> 08:57.630
We invoke it in our back end.

08:57.670 --> 08:59.670
So we need to do it in our application layer.

09:00.150 --> 09:00.910
Anyways.

09:00.950 --> 09:03.910
What the bind tools behind the scenes is going to do.

09:03.950 --> 09:10.550
It's going to take the tool description and the tool arguments and all the metadata on the tool and

09:10.550 --> 09:13.230
every request we're going to be making to the LM.

09:13.430 --> 09:16.800
Then link chain is going to append that information to the request.

09:16.800 --> 09:21.840
So then the LLM is going to be able to make the decision whether to call this tool or not.

09:22.040 --> 09:22.520
All right.

09:22.520 --> 09:28.520
So here in the messages variable we're going to put a list here with the human message with the input

09:28.560 --> 09:28.920
here.

09:29.160 --> 09:34.800
And by the way these messages variable here this list this is going to be our agent scratchpad.

09:34.840 --> 09:38.600
We're going to be appending to this list after every iteration of the agent.

09:38.880 --> 09:39.520
All right.

09:39.520 --> 09:43.640
So similar for before we're going to have a while loop.

09:43.800 --> 09:45.680
So it's going to continuously run.

09:45.880 --> 09:48.920
And we're going to be sending these messages list.

09:48.960 --> 09:54.240
Now for the first iteration it's going to be a list with one value which is going to be a human message.

09:54.240 --> 09:56.880
So this is what's going to start our agent work.

09:56.880 --> 10:01.800
So once we invoke now the LLM when we have tool calling.

10:01.800 --> 10:04.440
And let me focus now on this edit code here.

10:04.640 --> 10:06.760
So we're going to get back a result.

10:07.040 --> 10:12.680
And in the result we're going to have a very special field which is going to be tool calls.

10:12.880 --> 10:16.160
And if that field is going to be non-empty.

10:16.160 --> 10:19.290
Empty, so it's going to have an object in it.

10:19.330 --> 10:20.770
Then we have a tool call.

10:20.770 --> 10:23.010
So this is what we're seeing in this F here.

10:23.010 --> 10:26.690
And if we have a tool call then we need to execute a tool here.

10:26.690 --> 10:28.650
And notice this is a list.

10:28.650 --> 10:35.250
So this means there can be multiple tool calls and llms these days support parallel tool calling.

10:35.250 --> 10:37.490
So this is also possible.

10:37.490 --> 10:41.650
So this line over here simply extracts from the iMessage.

10:41.650 --> 10:42.810
If there is a tool call.

10:42.930 --> 10:46.330
And if not it's going to give us an empty list.

10:46.330 --> 10:51.930
So tool calls is going to be either a list with the tool calls needs to be executed or an empty list

10:51.930 --> 10:53.370
in case they shouldn't.

10:53.570 --> 11:00.370
And you can probably get the point that if there are no tool calls, then we simply need to finish and

11:00.410 --> 11:01.650
to break the while loop.

11:01.810 --> 11:04.090
However, if there are tool calls.

11:04.090 --> 11:11.330
So if this list has more than zero items in it, so at least one, then first we want to append everything

11:11.370 --> 11:12.330
to the history.

11:12.330 --> 11:18.620
So this is going to be to append the decision of the LLM to make the function calls to make the tool

11:18.660 --> 11:19.060
calls.

11:19.100 --> 11:22.380
So this is appending the reasoning to the scratchpad.

11:22.420 --> 11:24.220
So this is what we're doing over here.

11:24.220 --> 11:29.300
And by the way the concept of appending everything to the messages key.

11:29.460 --> 11:32.260
So we are going to hold a bunch of messages.

11:32.260 --> 11:35.500
And we're going to send those messages every time to the LM.

11:35.740 --> 11:41.100
So this is an interesting prompt engineering or context engineering strategy here.

11:41.100 --> 11:48.220
And it's actually proving itself over the years very, very useful because the LM is going to derive

11:48.260 --> 11:52.620
its reasoning logic and its trajectory through these messages as it is.

11:52.660 --> 11:55.060
So we don't need to have anything special here.

11:55.100 --> 12:01.740
And I remind you, when we get a message back from the LM, if there is a tool call, then we're going

12:01.740 --> 12:08.340
to see all the information about the tool call in the tool call argument in the AI response.

12:08.660 --> 12:12.220
So it's going to be a list containing all of the tool calls.

12:12.420 --> 12:18.500
Now each element of this list is a dictionary containing the information about the tool that needs to

12:18.500 --> 12:20.460
be invoked and executed.

12:20.460 --> 12:26.580
And what we do here is to simply iterate over that list, and in our case, is going to be only one

12:26.580 --> 12:27.380
element.

12:27.380 --> 12:32.460
And we're going to extract the tool name, we're going to extract the arguments for the tool, which

12:32.460 --> 12:34.300
is going to be a dictionary as well.

12:34.500 --> 12:36.220
And the tool called ID.

12:36.620 --> 12:43.860
Now this is way, way easier than what we did in the react algorithm, because now we don't need to

12:43.900 --> 12:51.020
rely on link chains ability to be able to parse well the LM output with regular expressions.

12:51.260 --> 12:53.740
So here we simply access it.

12:53.740 --> 12:58.140
And by the way in the link chain tool calling agent, this is exactly what it does.

12:58.140 --> 13:00.700
This is the output parsing of the LM response.

13:00.700 --> 13:05.860
This is simply accessing the relevant fields on the tool execution.

13:05.860 --> 13:12.780
So once we execute this part of the code, we have all of the information of which tool we want to use

13:12.780 --> 13:14.260
and we want to invoke.

13:15.020 --> 13:23.070
And now all we need to do is to use defined tool by name function that we implemented herself and we

13:23.110 --> 13:29.070
send a list of tools and we give it the tool name, and we get back the tool that needs to be executed.

13:29.230 --> 13:35.190
And we simply execute and invoke these tools with the arguments with which we extracted.

13:35.390 --> 13:37.990
And we have the observation.

13:38.030 --> 13:38.510
Right.

13:38.750 --> 13:44.830
And once we have the observation, we want to append to the message history a tool message containing

13:44.830 --> 13:51.710
the result of the tool execution, along with the tool call ID and the tool ID is super important because

13:51.710 --> 13:55.550
it helps us and the LM and this is the keyword.

13:55.590 --> 14:03.470
And to match the result of the tool execution back to the correct function call that the LM has made.

14:03.830 --> 14:10.630
Now, this is especially useful when the LM is going to be making parallel tool calling.

14:10.670 --> 14:12.430
At one API call.

14:12.710 --> 14:14.830
You can really see the importance of it.

14:14.870 --> 14:23.600
In this example, where we send to the LM one call with a complex query and it has Multipole in parallel

14:23.640 --> 14:24.400
tool calls.

14:24.560 --> 14:24.960
All right.

14:24.960 --> 14:25.880
Let's continue.

14:26.000 --> 14:34.040
So once we append the tool result as a tool message to the message list, we then continue the iteration

14:34.040 --> 14:36.680
in our while loop and start all over again.

14:36.880 --> 14:38.720
So now this is the input.

14:38.760 --> 14:46.160
The LM is going to receive the entire message list containing the user input, the agent trajectory,

14:46.400 --> 14:48.800
and the result of the tool executions.

14:48.960 --> 14:52.960
And now the let him knows that it needs to return a response.

14:53.160 --> 14:54.840
So the answer is going to give.

14:54.880 --> 14:59.560
It's not going to be an answer containing a tool call, but it's going to be the final answer.

14:59.680 --> 15:04.360
So now in the while loop we see that the number of tool calls is zero.

15:04.480 --> 15:07.560
So we do not execute this if clause here.

15:07.720 --> 15:15.320
And we continue to this part where we simply print the content of the message and break this entire

15:15.360 --> 15:16.160
while loop.