WEBVTT

00:00.080 --> 00:07.320
This slide introduces one of the most important capabilities in modern AI systems: multi-step reasoning

00:07.320 --> 00:08.200
with tools.

00:08.680 --> 00:12.480
Many real world problems cannot be solved with a single action.

00:12.880 --> 00:19.160
They require breaking a task into steps, executing multiple operations across different systems, and

00:19.160 --> 00:24.680
adapting decisions based on real-time feedback. As shown visually on page one,

00:24.840 --> 00:31.800
multi-step reasoning enables intelligent systems to move beyond isolated tool calls and into coordinated

00:31.800 --> 00:32.640
workflows.

00:33.200 --> 00:37.320
Instead of answering questions, the AI actively works toward a goal.

00:37.720 --> 00:41.560
This is what separates simple automation from true AI agents.

00:41.840 --> 00:46.640
In this section, we focus on how large language models orchestrate these workflows.

00:47.120 --> 00:53.080
The model plans the steps, chooses which tools to call, observes the results, and decides what to

00:53.120 --> 00:53.840
do next.

00:54.520 --> 00:58.520
The system infrastructure executes the tools and enforces safety.

00:58.800 --> 01:03.010
The key takeaway is that intelligence alone is not enough.

01:03.530 --> 01:08.330
Multi-step systems require structure, control and resilience.

01:08.970 --> 01:15.490
This slide sets the foundation for understanding how agents operate reliably in production environments.

01:15.970 --> 01:22.250
This slide defines multi-step reasoning with tools and explains why it is essential.

01:22.810 --> 01:29.410
As described on page two, many tasks demand sequential decision making, where each step builds on

01:29.410 --> 01:30.410
the previous one.

01:31.050 --> 01:34.210
A single tool call cannot handle these scenarios.

01:34.770 --> 01:38.090
The language model must orchestrate the entire process.

01:38.770 --> 01:45.490
It starts by planning the steps, selecting appropriate tools, executing actions, observing the results,

01:45.490 --> 01:48.170
and adapting the plan based on what it learns.

01:49.090 --> 01:52.330
This creates a dynamic, feedback-driven workflow.

01:52.930 --> 02:00.410
The diagram on this slide highlights the four core phases: plan steps, execute tools, observe results,

02:00.570 --> 02:02.210
and decide the next action.

02:02.810 --> 02:05.490
This loop continues until the task is complete.
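
NOTE
Editor's sketch of the plan, execute, observe, decide loop described above.
Everything named here is hypothetical: plan_next stands in for the language
model, TOOLS for real integrations, and max_steps is an assumed safety cap.
```python
from typing import Callable, Optional
# Hypothetical tools; real ones would call external systems.
TOOLS: dict[str, Callable[[dict], dict]] = {
    "fetch_user": lambda args: {"user_id": args["user_id"], "email": "a@example.com"},
    "notify": lambda args: {"sent_to": args["email"]},
}
#
def plan_next(goal: str, history: list) -> Optional[dict]:
    """Stand-in for the model: return the next tool call, or None when done."""
    if not history:
        return {"tool": "fetch_user", "args": {"user_id": 7}}
    if len(history) == 1:
        # The second step depends on what the first step returned.
        return {"tool": "notify", "args": {"email": history[0]["result"]["email"]}}
    return None
#
def run_agent(goal: str, max_steps: int = 5) -> list:
    history: list = []
    for _ in range(max_steps):  # hard cap so the loop cannot run forever
        action = plan_next(goal, history)  # plan / decide
        if action is None:
            break  # task complete
        result = TOOLS[action["tool"]](action["args"])  # execute
        history.append({"action": action, "result": result})  # observe
    return history
```
Calling run_agent("notify user 7") should produce two history entries, one per tool call.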

02:06.090 --> 02:08.690
The key insight is clearly stated on the slide.

02:09.050 --> 02:12.530
This capability is the foundation of AI agents.

02:13.170 --> 02:19.130
Agents are systems that can autonomously navigate complex tasks without constant human intervention.

02:19.610 --> 02:25.970
Multi-step reasoning is what allows AI to move from isolated actions to intelligent problem solving.

02:26.170 --> 02:29.570
This slide introduces the concept of tool chaining.

02:29.850 --> 02:35.290
Tool chaining means calling tools in sequence, where each step depends on the output of the previous

02:35.290 --> 02:35.690
one.

02:36.250 --> 02:41.970
As shown on page three, this creates a pipeline of operations that builds toward a final goal.

02:42.410 --> 02:49.530
The visual example demonstrates a common enterprise workflow: fetching user data, validating it, updating

02:49.530 --> 02:52.010
records, and notifying stakeholders.

02:52.410 --> 02:55.930
Each step relies on the successful completion of the prior step.
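
NOTE
Editor's sketch of the fetch, validate, update, notify chain from the slide.
The four functions and their return shapes are made up for illustration;
the point is only that each step consumes the previous step's output.
```python
def fetch_user(user_id: int) -> dict:
    # Hypothetical lookup; a real system would query a database or API.
    return {"id": user_id, "email": "USER@Example.com"}
#
def validate(user: dict) -> dict:
    if "@" not in user["email"]:
        raise ValueError("invalid email")  # a failure here stops the chain
    return {**user, "email": user["email"].lower()}
#
def update_record(user: dict) -> dict:
    return {**user, "updated": True}
#
def notify(user: dict) -> str:
    return f"notified {user['email']}"
#
def run_chain(user_id: int) -> str:
    data = user_id
    for step in (fetch_user, validate, update_record, notify):
        data = step(data)  # output of one step is the input of the next
    return data
```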

02:56.530 --> 03:03.060
A critical distinction is emphasized at the bottom of the slide: the LLM acts as the planner and controller,

03:03.100 --> 03:04.340
not the executor.

03:04.860 --> 03:07.100
The model decides what to do and when,

03:07.300 --> 03:10.540
but your system infrastructure handles the actual execution.

03:10.900 --> 03:14.420
This separation is essential for security and reliability.

03:15.060 --> 03:19.820
The model reasons and plans while deterministic systems execute actions.

03:20.500 --> 03:26.820
Tool chaining transforms isolated function calls into meaningful workflows and is a core capability

03:26.860 --> 03:28.700
for building autonomous agents.

03:29.060 --> 03:35.820
This slide explains the architectural loop that powers tool chaining, as illustrated on page four.

03:36.180 --> 03:40.260
The system follows a repeating pattern until the goal is achieved.

03:40.900 --> 03:49.180
The loop consists of user input, tool selection by the LLM, tool execution, result observation, and decision

03:49.220 --> 03:50.700
making for the next step.

03:51.380 --> 03:58.340
The diagram visually maps this process into the familiar think, act, observe, decide cycle.

03:58.980 --> 04:05.070
Each iteration updates the system state, allowing the model to make increasingly informed decisions

04:05.070 --> 04:06.710
as the workflow progresses.

04:07.430 --> 04:11.550
A key architectural requirement highlighted here is state management.

04:12.190 --> 04:18.870
The system must preserve context across steps so that outputs from one tool can inform future actions.

04:19.270 --> 04:23.470
Without this, multi-step reasoning collapses into isolated calls.

04:24.150 --> 04:29.270
This architecture creates a feedback mechanism where each action informs the next decision.

04:29.830 --> 04:36.190
It is this feedback loop that enables adaptive behavior, making the system robust in dynamic environments

04:36.310 --> 04:38.310
where conditions change over time.

04:38.870 --> 04:45.390
This slide highlights a critical design principle: the separation between planning and execution.

04:45.870 --> 04:51.910
As shown on page five, effective AI agent systems clearly distinguish these responsibilities.

04:52.470 --> 04:54.150
Planning is the model's role.

04:54.310 --> 04:58.310
It involves deciding what to do, breaking down complex tasks,

04:58.350 --> 04:59.430
choosing tools,

04:59.670 --> 05:03.310
determining action sequences, and adapting based on feedback.

05:03.870 --> 05:07.230
Execution, however, is the system's responsibility.

05:07.550 --> 05:13.630
This includes calling APIs, performing data operations, handling retries, and returning structured

05:13.630 --> 05:14.390
results.

05:14.710 --> 05:18.030
The best practice emphasized here is non-negotiable.

05:18.190 --> 05:21.230
Never let the LLM execute logic directly.

05:21.630 --> 05:28.190
Allowing models to perform execution introduces serious security, reliability, and control risks.

05:28.510 --> 05:34.950
By enforcing this separation, engineers maintain full control over system behavior while still benefiting

05:34.950 --> 05:37.030
from the model's reasoning capabilities.
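
NOTE
Editor's sketch of one way to enforce the planning/execution split: the model
only proposes a call as JSON text, and system code validates and runs it.
ALLOWED_TOOLS, get_balance, and the proposal shape are all assumptions.
```python
import json
# Hypothetical allowlist; only these tools may ever run.
ALLOWED_TOOLS = {
    "get_balance": lambda account_id: {"account_id": account_id, "balance": 100},
}
#
def execute_proposal(raw: str) -> dict:
    """Validate a model-proposed tool call, then execute it in system code."""
    proposal = json.loads(raw)
    name = proposal.get("tool")
    if name not in ALLOWED_TOOLS:
        return {"ok": False, "error": f"tool not allowed: {name}"}
    args = proposal.get("args", {})
    if not isinstance(args, dict):
        return {"ok": False, "error": "args must be an object"}
    try:
        return {"ok": True, "result": ALLOWED_TOOLS[name](**args)}
    except TypeError as exc:  # wrong or missing parameters
        return {"ok": False, "error": str(exc)}
```
The model never touches the tool directly; a rejected proposal comes back as a
structured result that can be fed into the next planning step.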

05:37.630 --> 05:43.870
This design principle is foundational for safe, maintainable, and scalable AI agents.

05:44.350 --> 05:50.150
This slide focuses on observability, a requirement for trustworthy tool-using systems.

05:50.510 --> 05:56.230
As stated clearly on page six, if you can't see what's happening inside your system, you can't trust

05:56.230 --> 05:56.550
it.

05:57.070 --> 06:03.560
Comprehensive logging transforms opaque AI workflows into transparent, auditable processes.

06:04.200 --> 06:11.240
The slide lists essential signals to log: tool names, input parameters, responses, errors, and execution

06:11.240 --> 06:11.840
timing.
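
NOTE
Editor's sketch of logging the signals listed on the slide (tool name, input
parameters, response, error, timing) with Python's standard logging module.
The wrapper name, logger name, and record fields are assumptions.
```python
import json
import logging
import time
#
logger = logging.getLogger("tool_calls")
#
def call_with_logging(name, fn, **params):
    """Run one tool call and emit a single structured log record for it."""
    start = time.perf_counter()
    record = {"tool": name, "params": params}
    try:
        record["response"] = fn(**params)
        return record["response"]
    except Exception as exc:
        record["error"] = repr(exc)
        raise  # log the failure, but do not swallow it
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000, 2)
        logger.info(json.dumps(record, default=str))
```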

06:12.320 --> 06:18.440
Observability enables three critical capabilities. First, debugging when things go wrong.

06:18.800 --> 06:21.960
Second, auditing for compliance and accountability.

06:22.320 --> 06:27.640
Third, performance tuning to identify latency bottlenecks and optimize workflows.

06:28.040 --> 06:32.320
Without observability, multi-step systems become impossible to debug.

06:32.680 --> 06:36.960
Failures feel random, trust erodes, and teams lose confidence.

06:37.360 --> 06:40.520
Logging is not optional; it is infrastructure.

06:41.080 --> 06:43.560
The rule on this slide is simple and powerful.

06:44.000 --> 06:46.680
Visibility is a prerequisite for trust.

06:46.920 --> 06:53.280
As workflows span multiple tools and services, tracing becomes essential.

06:53.800 --> 07:00.850
Page seven explains how tracing allows teams to follow a single request through its entire journey.

07:01.410 --> 07:04.810
The slide introduces four core tracing elements:

07:05.050 --> 07:11.210
request IDs, step numbers, correlation IDs, and complete chain visibility.

07:11.650 --> 07:16.450
Together, these enable end-to-end tracking across distributed systems.
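
NOTE
Editor's sketch of the tracing elements named above. The Trace class and the
"request_id:step" correlation ID format are assumptions for illustration,
not a scheme taken from the slides or from any tracing library.
```python
import uuid
from dataclasses import dataclass, field
#
@dataclass
class Trace:
    """Collects one span per tool call, all sharing a single request ID."""
    request_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    spans: list = field(default_factory=list)
    def record(self, tool: str, status: str) -> dict:
        step = len(self.spans) + 1  # step numbers start at 1
        span = {
            "request_id": self.request_id,
            "step": step,
            "correlation_id": f"{self.request_id}:{step}",  # assumed format
            "tool": tool,
            "status": status,
        }
        self.spans.append(span)
        return span
```
Filtering logs by request_id then yields the complete chain for one request,
in step order.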

07:16.970 --> 07:22.170
With proper tracing, engineers can understand why the agent did what it did.

07:22.810 --> 07:27.050
This transforms mysterious failures into actionable insights.

07:27.490 --> 07:31.570
Distributed tracing tools allow visualization of entire workflows,

07:31.890 --> 07:36.930
latency analysis at each step, and identification of cross-service bottlenecks.

07:37.450 --> 07:39.770
Tracing is not just for debugging.

07:40.090 --> 07:46.610
It enables continuous improvement by revealing patterns in agent decision making. Without tracing,

07:46.650 --> 07:49.690
multi-step agents operate as black boxes:

07:49.930 --> 07:54.610
powerful but untrustworthy. This slide presents a reality check:

07:54.970 --> 07:59.620
failures are inevitable in multi-step systems. As shown on page eight,

07:59.820 --> 08:02.260
complexity creates many failure points:

08:02.740 --> 08:04.140
invalid parameters,

08:04.300 --> 08:06.620
partial execution, timeouts,

08:06.660 --> 08:09.340
inconsistent state, and infinite loops.

08:09.660 --> 08:12.220
Each failure type represents a different risk.

08:12.740 --> 08:14.860
Invalid inputs can break tools.

08:15.380 --> 08:18.940
Partial execution can leave systems in an inconsistent state.

08:19.460 --> 08:21.740
Network failures introduce uncertainty.

08:22.300 --> 08:25.340
Infinite loops can consume resources indefinitely.

08:25.940 --> 08:27.660
The key message here is critical:

08:27.900 --> 08:30.980
do not build systems that assume perfect execution.

08:31.460 --> 08:34.900
Production systems must assume failures will happen regularly.

08:35.540 --> 08:40.940
Designing for failure from day one is what separates resilient systems from fragile ones.

08:41.420 --> 08:43.860
This slide prepares us for the next topic.

08:44.260 --> 08:50.420
The final slide focuses on recovery and resilience. As outlined on page nine,

08:50.500 --> 08:55.620
robust systems require predefined strategies for handling failures gracefully.

08:56.300 --> 09:00.620
Retry mechanisms should include limits and exponential backoff.
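
NOTE
Editor's sketch of a retry with an attempt limit and exponential backoff.
The function name, defaults, and the injectable sleep parameter are assumptions.
```python
import time
#
def retry(fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # limit reached: surface the failure instead of hiding it
            # The delay doubles after each failed attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```
Production code would usually retry only on errors known to be transient and
add random jitter to the delay; both are left out here for brevity.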

09:01.220 --> 09:06.860
Fallback options allow systems to switch tools or reduce functionality safely.

09:07.460 --> 09:13.780
Recovery actions such as rollbacks or compensating transactions restore consistent state.
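
NOTE
Editor's sketch of compensating transactions: each completed step registers
an undo action, and a failure runs those undo actions in reverse order.
The function name and the (action, compensate) pair shape are assumptions.
```python
def run_with_compensation(steps):
    """steps: list of (action, compensate) pairs of zero-argument callables."""
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)  # register only after success
    except Exception:
        for compensate in reversed(completed):
            compensate()  # undo completed steps, newest first
        raise  # fail safely, not silently
```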

09:14.380 --> 09:18.820
When automation fails, human escalation provides a final safety net.

09:19.460 --> 09:23.820
The guiding principle is clear: fail safely, not silently.

09:24.300 --> 09:30.260
Systems must preserve visibility, protect data integrity, and communicate what went wrong.

09:30.900 --> 09:33.700
The key takeaways summarize the entire section.

09:34.180 --> 09:38.900
Multi-step reasoning enables complex workflows and autonomous agents.

09:39.500 --> 09:45.540
Tool chaining demands clear planning, strong observability, and robust recovery mechanisms.

09:46.060 --> 09:49.860
Most importantly, production systems must assume failure.

09:50.420 --> 09:52.340
The final insight is powerful.

09:52.500 --> 09:59.180
Reliable agents are built on control, visibility, and resilience, not intelligence alone.