WEBVTT

1
00:00:00.240 --> 00:00:02.070
<v ->So let's go now to LangSmith</v>

2
00:00:02.070 --> 00:00:03.840
and let's go and review the trace here.

3
00:00:03.840 --> 00:00:06.300
So here we got a long trace.

4
00:00:06.300 --> 00:00:08.550
Now this entire trace, let me show you

5
00:00:08.550 --> 00:00:11.250
how it's going to be matching the architecture.

6
00:00:11.250 --> 00:00:14.790
So let me first go and minimize everything here.

7
00:00:14.790 --> 00:00:18.990
And yeah, so I'm going to minimize it to all the nodes

8
00:00:18.990 --> 00:00:20.040
that are running here

9
00:00:20.040 --> 00:00:24.480
so we can see the entire thing ran for almost 50 seconds

10
00:00:24.480 --> 00:00:26.520
and this is the cost.

11
00:00:26.520 --> 00:00:29.040
It consumed 35K tokens here.

12
00:00:29.040 --> 00:00:30.750
So let me go and compare it

13
00:00:30.750 --> 00:00:33.120
to the Reflexion architecture here.

14
00:00:33.120 --> 00:00:36.030
So we first start with the responder,

15
00:00:36.030 --> 00:00:38.280
which is going to make the first draft.

16
00:00:38.280 --> 00:00:41.580
So here it's going to have the response field,

17
00:00:41.580 --> 00:00:45.060
it's going to have a critique and it's going to have search.

18
00:00:45.060 --> 00:00:48.060
So this is going to be the draft for our answer

19
00:00:48.060 --> 00:00:50.430
and this is going to be happening here

20
00:00:50.430 --> 00:00:52.080
in the draft node here.

21
00:00:52.080 --> 00:00:53.400
And here we can see the answer

22
00:00:53.400 --> 00:00:55.950
we have here is the base draft.

23
00:00:55.950 --> 00:00:57.420
We have right over here.

24
00:00:57.420 --> 00:01:00.180
We can see we have here the reflection,

25
00:01:00.180 --> 00:01:03.450
which has the missing attribute with missing information

26
00:01:03.450 --> 00:01:07.320
and superfluous attribute with information we should remove

27
00:01:07.320 --> 00:01:10.140
and some search queries we should make here.

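NOTE
Editor's sketch, not shown in the video: a minimal, hedged picture of the draft structure described above, using stdlib dataclasses.
The class and field names (AnswerQuestion, Reflection, answer, missing, superfluous, search_queries) are assumptions based on the narration, not the course's actual code.
The answer and critique strings are placeholders; the three queries are the ones read out later in the trace.
```python
from dataclasses import dataclass, field
@dataclass
class Reflection:
    missing: str      # critique: information the draft lacks
    superfluous: str  # critique: information the draft should drop
@dataclass
class AnswerQuestion:
    answer: str                     # the draft answer itself
    reflection: Reflection          # self-critique of that draft
    search_queries: list[str] = field(default_factory=list)  # follow-up searches
# Placeholder values, for illustration only.
draft = AnswerQuestion(
    answer="<first draft text>",
    reflection=Reflection(missing="<what is missing>", superfluous="<what to remove>"),
    search_queries=[
        "AI-powered SOC startups venture fundings in 2025",
        "autonomous SOC market sizing and use cases",
        "comparative analysis of AI SOC platforms",
    ],
)
```
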
28
00:01:10.140 --> 00:01:14.760
So right now the number of tool calls is going to be one,

29
00:01:14.760 --> 00:01:17.580
up until now, 'cause this is going to be a tool call here

30
00:01:17.580 --> 00:01:20.880
and we can even see if we'll go and open it,

31
00:01:20.880 --> 00:01:25.140
let's see the sequence when we called OpenAI.

32
00:01:25.140 --> 00:01:29.040
And if we'll go here to the trace of the actual LLM call,

33
00:01:29.040 --> 00:01:32.490
we can see here it was one invocation of a tool call

34
00:01:32.490 --> 00:01:34.770
which invoked the answer question tool.

35
00:01:34.770 --> 00:01:38.160
So we used the identical object as a tool here.

36
00:01:38.160 --> 00:01:39.420
So this is really cool.

37
00:01:39.420 --> 00:01:42.540
Alright, so this was one tool call.

38
00:01:42.540 --> 00:01:45.150
Let's go and check out the architecture.

39
00:01:45.150 --> 00:01:47.580
After this, we should go and execute the tools.

40
00:01:47.580 --> 00:01:49.380
We should execute the search queries.

41
00:01:49.380 --> 00:01:53.670
So here we went to the execute tools node

42
00:01:53.670 --> 00:01:57.030
and this tool node here ran three search queries.

43
00:01:57.030 --> 00:02:00.000
The first one was to query AI-powered

44
00:02:00.000 --> 00:02:03.210
SOC startups venture fundings in 2025.

45
00:02:03.210 --> 00:02:07.200
The second one was autonomous SOC market sizing

46
00:02:07.200 --> 00:02:08.220
and use cases.

47
00:02:08.220 --> 00:02:11.580
And the third one was comparative analysis

48
00:02:11.580 --> 00:02:13.440
of AI SOC platforms here.

49
00:02:13.440 --> 00:02:15.780
Now let me show you something cool here.

50
00:02:15.780 --> 00:02:19.110
They all ran concurrently because ToolNode supports

51
00:02:19.110 --> 00:02:21.210
running multiple tools concurrently here.

52
00:02:21.210 --> 00:02:24.180
So if I check each tool here,

53
00:02:24.180 --> 00:02:25.860
here we have the start time.

54
00:02:25.860 --> 00:02:28.350
So for the different tools that ran,

55
00:02:28.350 --> 00:02:30.240
the start time stayed the same

56
00:02:30.240 --> 00:02:31.800
because they all ran concurrently.

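NOTE
Editor's sketch, not shown in the video: a hedged illustration of why concurrently launched tools report nearly the same start time.
It uses plain asyncio with a hypothetical fake_search coroutine that only sleeps; it does not claim to reproduce how the tool node is implemented internally.
```python
import asyncio
import time
async def fake_search(query: str, delay: float = 0.2) -> dict:
    # Hypothetical stand-in for a search tool: sleeps instead of calling a real API.
    started = time.monotonic()
    await asyncio.sleep(delay)
    return {"query": query, "started": started, "finished": time.monotonic()}
async def run_all(queries: list[str]) -> list[dict]:
    # gather() schedules all coroutines together and returns results in input order.
    return await asyncio.gather(*(fake_search(q) for q in queries))
queries = [
    "AI-powered SOC startups venture fundings in 2025",
    "autonomous SOC market sizing and use cases",
    "comparative analysis of AI SOC platforms",
]
results = asyncio.run(run_all(queries))
# Gap between the earliest and latest start; small when the calls overlap.
start_spread = max(r["started"] for r in results) - min(r["started"] for r in results)
print(f"start spread: {start_spread:.4f}s")
```
Run sequentially, the same three calls would start roughly one delay apart instead.
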
57
00:02:31.800 --> 00:02:35.760
So after we ran this node, we have now an AI message

58
00:02:35.760 --> 00:02:37.950
with a bunch of tool call results.

59
00:02:37.950 --> 00:02:40.230
So now we go to the reviser node.

60
00:02:40.230 --> 00:02:42.000
Let's go back to the architecture.

61
00:02:42.000 --> 00:02:45.600
After we run the execute tools, we go to the reviser,

62
00:02:45.600 --> 00:02:48.630
which has now in its history all of the tool executions,

63
00:02:48.630 --> 00:02:50.100
all of the search results.

64
00:02:50.100 --> 00:02:53.580
It has the first answer, it has also a critique

65
00:02:53.580 --> 00:02:56.730
and now it's going to revise the answer

66
00:02:56.730 --> 00:02:58.260
based on the search results.

67
00:02:58.260 --> 00:03:00.090
And it's going to give another critique

68
00:03:00.090 --> 00:03:02.040
and it's going to add a citation here.

69
00:03:02.040 --> 00:03:04.257
So if I'm going to go back to LangSmith,

70
00:03:04.257 --> 00:03:07.350
the reviser is going to be running now.

71
00:03:07.350 --> 00:03:11.280
We can see here it has all of the history of the tool calls.

72
00:03:11.280 --> 00:03:12.990
Let me go through that.

73
00:03:12.990 --> 00:03:17.100
And we get here now as a response, a revised answer.

74
00:03:17.100 --> 00:03:18.510
So this is also a tool call.

75
00:03:18.510 --> 00:03:20.460
If I'm going to dive deeper here,

76
00:03:20.460 --> 00:03:23.160
we can see this is eventually an LLM call.

77
00:03:23.160 --> 00:03:25.170
So this is the input, and the first thing is

78
00:03:25.170 --> 00:03:27.690
the schema of the revised answer here

79
00:03:27.690 --> 00:03:29.580
with all of the details that we want.

80
00:03:29.580 --> 00:03:31.230
And here, after the reviser chain,

81
00:03:31.230 --> 00:03:32.790
we have now the conditional loop.

82
00:03:32.790 --> 00:03:35.820
If we counted more than two tool calls,

83
00:03:35.820 --> 00:03:37.590
then we want to go and end.

84
00:03:37.590 --> 00:03:40.650
And if not, we want to have another iteration

85
00:03:40.650 --> 00:03:44.610
and we want to go and start executing the tools over here

86
00:03:44.610 --> 00:03:47.700
because here we're going to have different search queries.

87
00:03:47.700 --> 00:03:51.330
So let's go back now to LangSmith, here the event loop

88
00:03:51.330 --> 00:03:53.520
and the event loop returned

89
00:03:53.520 --> 00:03:55.710
that we should go to execute tools.

90
00:03:55.710 --> 00:03:58.800
So we go now to execute tools here.

91
00:03:58.800 --> 00:04:02.430
So notice here, now we have also the reference field

92
00:04:02.430 --> 00:04:04.560
also in our tool call.

93
00:04:04.560 --> 00:04:06.450
And here we have different search queries.

94
00:04:06.450 --> 00:04:09.210
Here we have AI SOC ROI case studies,

95
00:04:09.210 --> 00:04:12.030
autonomous SOC market size on 2025,

96
00:04:12.030 --> 00:04:14.460
and industry adoption in AI SOC.

97
00:04:14.460 --> 00:04:17.670
So those search queries are actually different

98
00:04:17.670 --> 00:04:19.650
from the first search queries we had.

99
00:04:19.650 --> 00:04:22.080
And then we go, we execute the tools.

100
00:04:22.080 --> 00:04:25.980
So after the reviser, we go and execute tools again.

101
00:04:25.980 --> 00:04:28.620
So let's go back to the diagram over here.

102
00:04:28.620 --> 00:04:30.240
We go, we execute the tools,

103
00:04:30.240 --> 00:04:32.580
we then make another revision here

104
00:04:32.580 --> 00:04:33.840
and then we want to finish.

105
00:04:33.840 --> 00:04:36.150
Because by the time we finish the reviser here,

106
00:04:36.150 --> 00:04:38.670
we're going to have four tool calls.

107
00:04:38.670 --> 00:04:41.850
So in the trace here we go and execute the tools.

108
00:04:41.850 --> 00:04:46.260
We run three search queries again.

109
00:04:46.260 --> 00:04:50.880
1, 2, 3 search queries all running concurrently.

110
00:04:50.880 --> 00:04:53.610
Then we go and revise the answer.

111
00:04:53.610 --> 00:04:55.500
And now we go to the event loop.

112
00:04:55.500 --> 00:04:58.680
So after the second revision, let's go and open it.

113
00:04:58.680 --> 00:05:00.330
We go here to the event loop.

114
00:05:00.330 --> 00:05:02.040
And here you would actually expect that

115
00:05:02.040 --> 00:05:04.020
we're going to be finishing the graph execution

116
00:05:04.020 --> 00:05:06.600
because the number of tool calls is greater than two here.

117
00:05:06.600 --> 00:05:09.330
However, let's go and count the number of tool calls.

118
00:05:09.330 --> 00:05:11.400
This is one tool call from the beginning.

119
00:05:11.400 --> 00:05:15.660
This is a second tool call and this is not a tool call.

120
00:05:15.660 --> 00:05:18.360
And up until here we have two tool calls.

121
00:05:18.360 --> 00:05:21.750
And the reason why we go to execute tools here,

122
00:05:21.750 --> 00:05:24.690
is that when we execute this event loop here,

123
00:05:24.690 --> 00:05:27.990
the reviser node actually hasn't finished yet.

124
00:05:27.990 --> 00:05:30.360
So it hasn't really updated the state here.

125
00:05:30.360 --> 00:05:33.030
So if I'm going to go back to the code here,

126
00:05:33.030 --> 00:05:37.830
the reviser node here, this is going to be updated

127
00:05:37.830 --> 00:05:41.070
only after we are going to finish this event loop here.

128
00:05:41.070 --> 00:05:44.580
So when we execute for the second time the event loop here,

129
00:05:44.580 --> 00:05:49.290
the tool count is still less than the max_iteration.

130
00:05:49.290 --> 00:05:51.000
So that's why we don't end.

131
00:05:51.000 --> 00:05:55.650
So that's why after it, we go and do another execute tools.

132
00:05:55.650 --> 00:05:58.560
And by this time after we execute the tools,

133
00:05:58.560 --> 00:06:01.050
even though this is not a tool call over here,

134
00:06:01.050 --> 00:06:03.390
I remind you this is a tool node here,

135
00:06:03.390 --> 00:06:06.180
but we have already updated the state

136
00:06:06.180 --> 00:06:09.540
after executing this node here, the reviser node.

137
00:06:09.540 --> 00:06:12.720
So when we go after these execute tools,

138
00:06:12.720 --> 00:06:15.540
so when we go to this reviser node here,

139
00:06:15.540 --> 00:06:17.820
we go back to the event loop.

140
00:06:17.820 --> 00:06:20.280
So now the number is greater than two.

141
00:06:20.280 --> 00:06:23.370
And yeah, now we go to the end here

142
00:06:23.370 --> 00:06:24.900
and now we finish the graph here.

143
00:06:24.900 --> 00:06:29.900
So actually according to the max iterations equals two,

144
00:06:30.210 --> 00:06:33.000
it's actually going to be three iterations.

145
00:06:33.000 --> 00:06:34.740
It's not going to be two iterations,

146
00:06:34.740 --> 00:06:36.360
it's going to be three iterations.

147
00:06:36.360 --> 00:06:39.390
And this was my mistake, so sorry about it.

148
00:06:39.390 --> 00:06:43.980
So eventually we made here three iterations of revisions.

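NOTE
Editor's sketch, not shown in the video: a hedged toy model of the off-by-one described above, in plain Python with no graph library.
count_revisions and its lag parameter are hypothetical; lag=1 models the narrated explanation that the routing check sees a tool-call count one behind the latest revision.
Whether the real graph behaves this way for that reason is not verified here.
```python
def count_revisions(max_iterations: int, lag: int) -> int:
    """Return how many revisions run before the routing check ends the loop."""
    tool_calls = 1  # the first draft is itself one tool call
    revisions = 0
    while True:
        revisions += 1   # execute tools, then revise
        tool_calls += 1  # each revision adds one tool call
        visible = tool_calls - lag  # the count the routing check actually sees
        if visible > max_iterations:  # strict comparison, as narrated
            return revisions
print(count_revisions(2, lag=0))  # check sees the up-to-date count
print(count_revisions(2, lag=1))  # check sees a count that is one behind
```
With a limit of two, the lagging check yields three revisions, matching what the trace showed.
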
149
00:06:43.980 --> 00:06:48.060
Cool. So this was the Reflexion architecture.

150
00:06:48.060 --> 00:06:53.040
So we saw that actually limiting the number of iterations

151
00:06:53.040 --> 00:06:55.140
was actually not that simple.

152
00:06:55.140 --> 00:06:56.880
In the next section,

153
00:06:56.880 --> 00:07:00.480
we are going to be using an LLM as a judge.

154
00:07:00.480 --> 00:07:03.467
So we're not going to rely on these max iterations here.

155
00:07:03.467 --> 00:07:07.440
We are going to rely on an LLM, which is going to decide

156
00:07:07.440 --> 00:07:10.500
whether to make another iteration or not.

157
00:07:10.500 --> 00:07:13.350
And this is going to be in the agentic RAG section,

158
00:07:13.350 --> 00:07:17.760
where you can find the docs in LangGraph agentic RAG,

159
00:07:17.760 --> 00:07:19.740
which is going to be this architecture.

160
00:07:19.740 --> 00:07:21.750
And this is what we're going to be covering

161
00:07:21.750 --> 00:07:22.750
in the next section.