WEBVTT

1
00:00:00.180 --> 00:00:02.550
<v ->Hey there, Eden here, and in this video</v>

2
00:00:02.550 --> 00:00:06.420
we are going to be building our front end

3
00:00:06.420 --> 00:00:09.270
to our RAG retrieval pipeline.

4
00:00:09.270 --> 00:00:13.440
So we are going to be building a very simple user interface

5
00:00:13.440 --> 00:00:18.030
so we can visually test and QA what we've done so far.

6
00:00:18.030 --> 00:00:20.970
And for that we're going to be using Streamlit.

7
00:00:20.970 --> 00:00:25.200
So Streamlit is a very popular open-source package

8
00:00:25.200 --> 00:00:28.380
which enable us to create very intuitive

9
00:00:28.380 --> 00:00:31.770
and simple user interfaces in Python.

10
00:00:31.770 --> 00:00:33.180
And yes, I said Python.

11
00:00:33.180 --> 00:00:36.180
We're not going to be writing any JavaScript code today

12
00:00:36.180 --> 00:00:38.670
because with Streamlit we can use Python.

13
00:00:38.670 --> 00:00:43.088
And we can go for the playground, for example.

14
00:00:43.088 --> 00:00:45.300
And we're going to be doing something very similar

15
00:00:45.300 --> 00:00:48.510
to this chat application you're seeing right here.

16
00:00:48.510 --> 00:00:51.300
So we're going to be writing an interface

17
00:00:51.300 --> 00:00:53.070
in Python to do that.

18
00:00:53.070 --> 00:00:55.920
So we're going to be writing, in Python, an application

19
00:00:55.920 --> 00:00:59.040
that is going to yield something very similar to this.

20
00:00:59.040 --> 00:01:00.780
You can check out here in the playground

21
00:01:00.780 --> 00:01:02.490
some other examples.

22
00:01:02.490 --> 00:01:06.240
And by the way, Streamlit originally started as being a tool

23
00:01:06.240 --> 00:01:11.240
for data scientists to visualize their data.

24
00:01:11.550 --> 00:01:13.530
And you can see here, for example,

25
00:01:13.530 --> 00:01:16.590
here we're displaying some charts over some data,

26
00:01:16.590 --> 00:01:19.260
and it's actually very, very intuitive to use.

27
00:01:19.260 --> 00:01:21.840
All this interface, only a couple of lines

28
00:01:21.840 --> 00:01:23.970
of Python code, which is very convenient.

29
00:01:23.970 --> 00:01:27.300
And just a quick disclaimer, this is not for production,

30
00:01:27.300 --> 00:01:30.090
so I wouldn't recommend you using Streamlit

31
00:01:30.090 --> 00:01:33.990
if you want to expose a chat application.

32
00:01:33.990 --> 00:01:37.800
For that I'm going to make another video about generative UI

33
00:01:37.800 --> 00:01:41.700
and show you how you can do it with TypeScript and Next.js.

34
00:01:41.700 --> 00:01:43.530
Cool. So let's go now.

35
00:01:43.530 --> 00:01:46.680
And I created a new main.py file

36
00:01:46.680 --> 00:01:49.590
and here I'm going to be writing the implementation.

37
00:01:49.590 --> 00:01:51.990
So let's start with some imports,

38
00:01:51.990 --> 00:01:55.800
and let's import from streamlit st.

39
00:01:55.800 --> 00:01:58.800
And this is going to be our streamlit application object.

40
00:01:58.800 --> 00:02:01.830
We are going to be working through this entire tutorial.

41
00:02:01.830 --> 00:02:06.180
We want to also import the run_llm function we wrote

42
00:02:06.180 --> 00:02:08.070
from the backend.core.

43
00:02:08.070 --> 00:02:09.660
All right, and those are all the imports

44
00:02:09.660 --> 00:02:11.550
we're going to be needing for this video.

45
00:02:11.550 --> 00:02:13.500
All right, let's go implement a function

46
00:02:13.500 --> 00:02:16.770
which is called _format_sources, which is going

47
00:02:16.770 --> 00:02:20.070
to receive context_docs as input.

48
00:02:20.070 --> 00:02:21.690
And this is going to be a list.

49
00:02:21.690 --> 00:02:25.320
And here it's going to be some LangChain documents,

50
00:02:25.320 --> 00:02:28.260
and it's going to return us a list of URLs.

51
00:02:28.260 --> 00:02:29.550
Why do you want to do it?

52
00:02:29.550 --> 00:02:33.090
Because we want to render nicely for each response

53
00:02:33.090 --> 00:02:37.140
we get from the LLM after the retrieval, we want to cite

54
00:02:37.140 --> 00:02:39.060
where we grounded the answer from.

55
00:02:39.060 --> 00:02:41.370
So we want to display it nicely as a list.

56
00:02:41.370 --> 00:02:43.800
So this is why we need this helper function.

57
00:02:43.800 --> 00:02:47.760
So this function is going to be returning a list of strings,

58
00:02:47.760 --> 00:02:50.730
and we simply need to go and to iterate

59
00:02:50.730 --> 00:02:52.830
over the context_docs.

60
00:02:52.830 --> 00:02:54.960
So this is what we're doing over here.

61
00:02:54.960 --> 00:02:56.970
We're iterating over the context_docs,

62
00:02:56.970 --> 00:02:58.830
which are LangChain documents.

63
00:02:58.830 --> 00:03:02.790
Now if those context_docs, crossing fingers,

64
00:03:02.790 --> 00:03:06.360
are going to have the metadata field, because we added

65
00:03:06.360 --> 00:03:08.790
to the metadata when we index the documents,

66
00:03:08.790 --> 00:03:11.100
if they're going to have a metadata field,

67
00:03:11.100 --> 00:03:13.410
then what we're going to be doing, we are going

68
00:03:13.410 --> 00:03:17.100
to be extracting the URL from the metadata.

69
00:03:17.100 --> 00:03:21.000
So this is why we first check that there is a metadata,

70
00:03:21.000 --> 00:03:24.240
and then if it is, we simply go now

71
00:03:24.240 --> 00:03:26.370
and have a meta variable.

72
00:03:26.370 --> 00:03:29.550
So this line over here helps us to filter out

73
00:03:29.550 --> 00:03:32.970
all the documents that has this metadata key,

74
00:03:32.970 --> 00:03:36.810
and we want to save this metadata into this meta variable.

75
00:03:36.810 --> 00:03:40.560
So now we want to access from this meta variable,

76
00:03:40.560 --> 00:03:41.880
which is the metadata.

77
00:03:41.880 --> 00:03:44.220
We want to extract the sources.

78
00:03:44.220 --> 00:03:45.990
If we don't have any sources,

79
00:03:45.990 --> 00:03:47.670
we want to write here "Unknown."

80
00:03:47.670 --> 00:03:50.190
And eventually you want to cast everything into a string

81
00:03:50.190 --> 00:03:51.840
just in case we don't have a string,

82
00:03:51.840 --> 00:03:53.730
so this is some defensive programming.

83
00:03:53.730 --> 00:03:56.250
So this is the helper function which is going to help us

84
00:03:56.250 --> 00:03:59.820
render nicely the citations of the answer of the LLM.

85
00:03:59.820 --> 00:04:03.120
So let's go now and let's write some Streamlit code.

86
00:04:03.120 --> 00:04:04.560
It's going to be very, very short

87
00:04:04.560 --> 00:04:06.960
and we're going to see how we can get up and running

88
00:04:06.960 --> 00:04:10.260
a very quick interface to prototype our application.

89
00:04:10.260 --> 00:04:12.810
So I want to start by using the streamlit object,

90
00:04:12.810 --> 00:04:16.470
and it has this set_page_config, which is going to configure

91
00:04:16.470 --> 00:04:18.810
the basic settings of the applications,

92
00:04:18.810 --> 00:04:22.440
like the page title and the layout type.

93
00:04:22.440 --> 00:04:24.150
So I'm going to write that,

94
00:04:24.150 --> 00:04:25.320
and let me now in the terminal

95
00:04:25.320 --> 00:04:30.320
write "pipenv run streamlit run main.py" file.

96
00:04:30.780 --> 00:04:32.880
So it's going to run a streamlit application,

97
00:04:32.880 --> 00:04:35.520
which is going to be using the main file

98
00:04:35.520 --> 00:04:37.740
as our source to the application.

99
00:04:37.740 --> 00:04:40.080
And we want to run everything with pipenv,

100
00:04:40.080 --> 00:04:42.570
with the virtual environment we created with pipenv.

101
00:04:42.570 --> 00:04:45.210
And if you're going to be using uv, it's very similar.

102
00:04:45.210 --> 00:04:48.180
To do it simply you need to go to the virtual environment

103
00:04:48.180 --> 00:04:50.910
and run "streamlit run main.py."

104
00:04:50.910 --> 00:04:55.020
So we can see now we popped out a small browser

105
00:04:55.020 --> 00:04:57.780
and we have now an application running.

106
00:04:57.780 --> 00:04:59.010
We don't really see anything

107
00:04:59.010 --> 00:05:00.720
because this application is empty.

108
00:05:00.720 --> 00:05:03.390
So let's go and add something to this application.

109
00:05:03.390 --> 00:05:06.555
And if I'm going to write Command-L here in my browser,

110
00:05:06.555 --> 00:05:09.247
you can see here this is the title of the page,

111
00:05:09.247 --> 00:05:12.210
"LangChain Documentation Helper," which came from here.

112
00:05:12.210 --> 00:05:14.910
And let's go now and add a title to the page.

113
00:05:14.910 --> 00:05:18.210
We're going to be doing that with the method of title,

114
00:05:18.210 --> 00:05:20.640
which is going to receive the body of the text

115
00:05:20.640 --> 00:05:21.600
we want to display.

116
00:05:21.600 --> 00:05:22.597
And here we want to display

117
00:05:22.597 --> 00:05:24.330
"LangChain Documentation Helper."

118
00:05:24.330 --> 00:05:27.240
So if I'm going to go back to the application here,

119
00:05:27.240 --> 00:05:28.560
let me refresh it,

120
00:05:28.560 --> 00:05:31.590
we can see here we have "LangChain Documentation Helper."

121
00:05:31.590 --> 00:05:34.890
And notice this is running now in debug mode by default

122
00:05:34.890 --> 00:05:38.100
so I don't need to stop this application and run it again.

123
00:05:38.100 --> 00:05:40.410
Let's go now and create a sidebar in Streamlit.

124
00:05:40.410 --> 00:05:41.727
And for that we're going to be using

125
00:05:41.727 --> 00:05:44.700
the Streamlit sidebar context manager.

126
00:05:44.700 --> 00:05:48.150
Now everything indented here is going to be a part

127
00:05:48.150 --> 00:05:49.290
of the sidebar.

128
00:05:49.290 --> 00:05:52.080
So let me write subheader session.

129
00:05:52.080 --> 00:05:54.240
And this is going to display the text

130
00:05:54.240 --> 00:05:57.000
in a subheader formatting.

131
00:05:57.000 --> 00:06:01.050
And let's go to the application, let's look how it looks.

132
00:06:01.050 --> 00:06:03.060
So here we can see we have a sidebar

133
00:06:03.060 --> 00:06:04.680
which we can toggle on and off.

134
00:06:04.680 --> 00:06:08.130
And we have here this subheader of session here.

135
00:06:08.130 --> 00:06:11.910
So first thing I want to do is to create a button

136
00:06:11.910 --> 00:06:14.340
which is going to reset our chat,

137
00:06:14.340 --> 00:06:18.570
which is going to allow us to easily debug our application.

138
00:06:18.570 --> 00:06:21.480
So I'm going to use this Streamlit button,

139
00:06:21.480 --> 00:06:24.240
and I'm going to call this button "Clear chat."

140
00:06:24.240 --> 00:06:27.030
And I'm going to use use_container_width=True.

141
00:06:27.030 --> 00:06:29.670
This means that the width of this button is going

142
00:06:29.670 --> 00:06:32.820
to be the width of its container, which is the sidebar.

143
00:06:32.820 --> 00:06:35.340
And here you can see all of the different settings

144
00:06:35.340 --> 00:06:36.810
you can use this function,

145
00:06:36.810 --> 00:06:40.110
but its return value is going to be a boolean

146
00:06:40.110 --> 00:06:43.830
if the button was clicked on the last run of the app or not.

147
00:06:43.830 --> 00:06:46.710
So if somebody is going to click this button,

148
00:06:46.710 --> 00:06:48.390
we want to clear this chat.

149
00:06:48.390 --> 00:06:51.180
So this chat is going to have messages.

150
00:06:51.180 --> 00:06:54.180
And I'm going to give you a quick hint.

151
00:06:54.180 --> 00:06:57.420
We are going to be saving all of the messages

152
00:06:57.420 --> 00:07:00.990
in the front end application, in the Streamlit application,

153
00:07:00.990 --> 00:07:03.960
and we're going to be setting it in a very special place

154
00:07:03.960 --> 00:07:05.700
inside the Streamlit application,

155
00:07:05.700 --> 00:07:07.380
which is called session state.

156
00:07:07.380 --> 00:07:10.260
So when you fire up a Streamlit application,

157
00:07:10.260 --> 00:07:12.150
you basically have a dictionary

158
00:07:12.150 --> 00:07:15.810
where you can store intermediate results and you can store

159
00:07:15.810 --> 00:07:18.750
the data from a previous user interactions.

160
00:07:18.750 --> 00:07:23.460
So we are going to be saving all of the messages we got back

161
00:07:23.460 --> 00:07:26.970
and forth from the user, to the LLM, back to the LLM,

162
00:07:26.970 --> 00:07:29.610
we're going to be saving that in the Streamlit application.

163
00:07:29.610 --> 00:07:32.790
So what we're going to do if somebody is going to click

164
00:07:32.790 --> 00:07:37.650
the button to clear the chat, we want to go and delete

165
00:07:37.650 --> 00:07:40.680
everything from that session, right?

166
00:07:40.680 --> 00:07:44.400
And for that we need to go and access the session_state.

167
00:07:44.400 --> 00:07:46.740
This is the object how it's called in Streamlit.

168
00:07:46.740 --> 00:07:48.600
And this is simply a dictionary.

169
00:07:48.600 --> 00:07:52.890
So we can use the .pop method of that dictionary,

170
00:07:52.890 --> 00:07:56.640
and we simply want to pop the messages key, right?

171
00:07:56.640 --> 00:07:58.360
So if we're going to be doing that,

172
00:07:58.360 --> 00:08:00.540
we're going to be clearing the chat.

173
00:08:00.540 --> 00:08:04.050
And then after we do that, we want to rerun the application.

174
00:08:04.050 --> 00:08:06.510
We want to refresh it to start from a clean slate,

175
00:08:06.510 --> 00:08:08.040
so it's going to render again.

176
00:08:08.040 --> 00:08:10.530
So let's go to the code now.

177
00:08:10.530 --> 00:08:13.290
And here you can see we have now a session here.

178
00:08:13.290 --> 00:08:15.840
Once we click it, it's actually going to clear

179
00:08:15.840 --> 00:08:16.673
the session state.

180
00:08:16.673 --> 00:08:18.330
We don't really have anything right now,

181
00:08:18.330 --> 00:08:20.580
so it's not going to do anything, but later

182
00:08:20.580 --> 00:08:22.560
it's going to remove all of the interaction

183
00:08:22.560 --> 00:08:24.480
and all of the messages between us,

184
00:08:24.480 --> 00:08:26.850
between the user and the LLM.

185
00:08:26.850 --> 00:08:28.560
All right, so we are done with the sidebar.

186
00:08:28.560 --> 00:08:30.810
Now let's go and show some user messages.

187
00:08:30.810 --> 00:08:33.120
All right, so now we want to write some code

188
00:08:33.120 --> 00:08:37.290
which is going to display the messages from the user

189
00:08:37.290 --> 00:08:41.700
and from the LLM, we want to display all the chat messages.

190
00:08:41.700 --> 00:08:45.360
And for that we want to iterate over the messages

191
00:08:45.360 --> 00:08:46.830
and to simply print them.

192
00:08:46.830 --> 00:08:50.040
However, when we were going to be starting the application,

193
00:08:50.040 --> 00:08:51.780
we still don't have any messages.

194
00:08:51.780 --> 00:08:55.410
So let's start by writing a placeholder message.

195
00:08:55.410 --> 00:08:58.320
So if there aren't messages in the session state,

196
00:08:58.320 --> 00:09:00.690
so this means we fired up the application,

197
00:09:00.690 --> 00:09:04.410
nothing happened yet, then we want to add

198
00:09:04.410 --> 00:09:07.200
to the messages key here in the session_state,

199
00:09:07.200 --> 00:09:10.080
we want to add here an example message.

200
00:09:10.080 --> 00:09:12.270
So let's go and add an example message

201
00:09:12.270 --> 00:09:15.420
with the role of "assistant" and the content

202
00:09:15.420 --> 00:09:18.030
of "Ask me anything about LangChain docs.

203
00:09:18.030 --> 00:09:20.970
I'll retrieve relevant context and cite sources."

204
00:09:20.970 --> 00:09:25.140
And let's go and write sources equals to an empty list here.

205
00:09:25.140 --> 00:09:28.830
So this is an artificial message we are going to be storing

206
00:09:28.830 --> 00:09:31.650
in the session_state in the messages key.

207
00:09:31.650 --> 00:09:34.560
I remind you the message_state is simply dictionary,

208
00:09:34.560 --> 00:09:36.900
which is going to have the key of "messages."

209
00:09:36.900 --> 00:09:39.720
And right now we are populating it with a list

210
00:09:39.720 --> 00:09:41.730
that contains one message.

211
00:09:41.730 --> 00:09:43.830
And now we want to go

212
00:09:43.830 --> 00:09:47.370
and we want to iterate over all the messages.

213
00:09:47.370 --> 00:09:51.090
So we are going to be iterating over the session_state

214
00:09:51.090 --> 00:09:51.923
in the messages.

215
00:09:51.923 --> 00:09:53.880
So this is going to be a list of messages,

216
00:09:53.880 --> 00:09:56.970
whether a user message or an AI message.

217
00:09:56.970 --> 00:10:00.720
And here we are going to be iterating on those messages.

218
00:10:00.720 --> 00:10:03.990
So for each message we want now to create a container

219
00:10:03.990 --> 00:10:05.940
which is going to be holding the message.

220
00:10:05.940 --> 00:10:09.540
So let's use the with streamlit.chat_message.

221
00:10:09.540 --> 00:10:12.000
And this is here is going to insert

222
00:10:12.000 --> 00:10:13.650
a chat message container.

223
00:10:13.650 --> 00:10:16.710
And in its parameters we can give it a name.

224
00:10:16.710 --> 00:10:21.240
So it can be either user, assistant, ai, human, or string.

225
00:10:21.240 --> 00:10:23.880
And this is going to give that message a role,

226
00:10:23.880 --> 00:10:26.100
and it's going to have a different theme

227
00:10:26.100 --> 00:10:28.770
for the message accordingly, with an avatar,

228
00:10:28.770 --> 00:10:30.900
and we're going to be seeing it now when we run it.

229
00:10:30.900 --> 00:10:33.030
So for every message I want to go

230
00:10:33.030 --> 00:10:34.560
and I want to display its content.

231
00:10:34.560 --> 00:10:37.620
So everything under this indentation over here,

232
00:10:37.620 --> 00:10:40.020
under chat_message, is going to be displayed

233
00:10:40.020 --> 00:10:41.370
according to the current role.

234
00:10:41.370 --> 00:10:43.890
So here I want to display the content of the message.

235
00:10:43.890 --> 00:10:46.650
I remind you in the example message here we have the role

236
00:10:46.650 --> 00:10:49.890
of "assistant" and the content here, this is the content.

237
00:10:49.890 --> 00:10:51.620
To display the sources, we want to display

238
00:10:51.620 --> 00:10:52.980
it nicely in a dropdown.

239
00:10:52.980 --> 00:10:56.010
So we're going to be using an expander object,

240
00:10:56.010 --> 00:10:58.890
which is going to insert a multielement container

241
00:10:58.890 --> 00:11:00.990
that can be expanded/collapsed.

242
00:11:00.990 --> 00:11:03.690
So we're going to give it the title of sources

243
00:11:03.690 --> 00:11:06.240
and here we're simply going to show the sources.

244
00:11:06.240 --> 00:11:08.580
So we're going to be iterating through all the links

245
00:11:08.580 --> 00:11:10.470
of the sources we're going to be getting.

246
00:11:10.470 --> 00:11:13.560
And we are going to use the Markdown, which is going

247
00:11:13.560 --> 00:11:16.170
to be formatting the strings as Markdown here.

248
00:11:16.170 --> 00:11:18.547
We can see now we have here this first message,

249
00:11:18.547 --> 00:11:20.640
"Ask me anything about LangChain docs.

250
00:11:20.640 --> 00:11:22.920
I'll retrieve the context and cite sources."

251
00:11:22.920 --> 00:11:25.560
And notice here, here we have the avatar of a robot.

252
00:11:25.560 --> 00:11:28.440
And let me go back here to write, for example, "user,"

253
00:11:28.440 --> 00:11:30.870
And this is going to be changing to user here.

254
00:11:30.870 --> 00:11:33.480
So this changed once I changed the role here.

255
00:11:33.480 --> 00:11:35.730
Now here in the sources, for example,

256
00:11:35.730 --> 00:11:39.690
let me put here "www.langchain.com,"

257
00:11:39.690 --> 00:11:41.790
and here we can see the expander

258
00:11:41.790 --> 00:11:44.340
and now it has an element of LangChain.

259
00:11:44.340 --> 00:11:48.297
If we're going to have more, we can also add here

260
00:11:48.297 --> 00:11:51.630
"www.anthropic.com."

261
00:11:51.630 --> 00:11:52.710
Here we can see too.

262
00:11:52.710 --> 00:11:55.890
So everything here is being formatted as Markdown.

263
00:11:55.890 --> 00:12:00.360
And this dash over here is going to be displayed

264
00:12:00.360 --> 00:12:03.540
in Markdown as a list item, for those of you who know.

265
00:12:03.540 --> 00:12:06.510
So let's go and reset this.

266
00:12:06.510 --> 00:12:08.310
So now it's going to be empty.

267
00:12:08.310 --> 00:12:10.740
Cool. So this is the application so far.

268
00:12:10.740 --> 00:12:12.870
So now we want to create the text area

269
00:12:12.870 --> 00:12:15.390
where the user can input the message.

270
00:12:15.390 --> 00:12:18.090
So let's go back, and for that we're going to be using

271
00:12:18.090 --> 00:12:19.920
streamlit chat_input.

272
00:12:19.920 --> 00:12:21.690
And this is going to be a container

273
00:12:21.690 --> 00:12:25.020
which is going to have the chat that the user input.

274
00:12:25.020 --> 00:12:27.577
So here we're going to give it some placeholder text,

275
00:12:27.577 --> 00:12:29.370
"Ask a question about LangChain."

276
00:12:29.370 --> 00:12:33.240
And once the user submits the query under the text area,

277
00:12:33.240 --> 00:12:35.010
it's going to be saved here in prompt.

278
00:12:35.010 --> 00:12:36.600
So let me go and show you that.

279
00:12:36.600 --> 00:12:38.730
Let's go to the application, let's refresh it.

280
00:12:38.730 --> 00:12:40.470
Here you can see the new text area.

281
00:12:40.470 --> 00:12:41.850
Let me write, "Hello."

282
00:12:41.850 --> 00:12:44.100
And this "Hello" is going to be saved now

283
00:12:44.100 --> 00:12:47.143
to this prompt here, and when a user is going to write

284
00:12:47.143 --> 00:12:50.670
a message, "Hello," we want to first display it.

285
00:12:50.670 --> 00:12:52.080
So let's write if prompt,

286
00:12:52.080 --> 00:12:54.630
and this is going to be executed only if the user

287
00:12:54.630 --> 00:12:57.900
pressed Enter or submitted the the input.

288
00:12:57.900 --> 00:13:00.840
So here if the user have inserted a prompt,

289
00:13:00.840 --> 00:13:03.240
so first thing first, we appended it

290
00:13:03.240 --> 00:13:04.137
to the session state now.

291
00:13:04.137 --> 00:13:05.820
Now so we have the history.

292
00:13:05.820 --> 00:13:06.930
And now we want to print it,

293
00:13:06.930 --> 00:13:08.520
we want to display it to the user.

294
00:13:08.520 --> 00:13:12.150
So we are going to be writing with streamlit chat message

295
00:13:12.150 --> 00:13:15.660
like before, but this time it's going to be a user message

296
00:13:15.660 --> 00:13:17.880
because we know for sure at this point of time

297
00:13:17.880 --> 00:13:19.980
that this is going to be a user message

298
00:13:19.980 --> 00:13:22.620
and we simply want to display it in Markdown,

299
00:13:22.620 --> 00:13:23.970
display the user prompt.

300
00:13:23.970 --> 00:13:27.060
So let me go and save and show you how it looks.

301
00:13:27.060 --> 00:13:28.560
Let's go refresh it.

302
00:13:28.560 --> 00:13:30.420
Let me write here, "Hello."

303
00:13:30.420 --> 00:13:32.250
And this is simply printing it.

304
00:13:32.250 --> 00:13:35.340
If we're going to be changing it to AI for example,

305
00:13:35.340 --> 00:13:37.620
and we're going to be writing "Hello,"

306
00:13:37.620 --> 00:13:40.200
I'm going to see it's going to be the avatar of the AI.

307
00:13:40.200 --> 00:13:42.090
So let me change it back to user.

308
00:13:42.090 --> 00:13:45.270
So now we want to create now an assistant message.

309
00:13:45.270 --> 00:13:47.580
All right, so now under this chat message,

310
00:13:47.580 --> 00:13:50.490
everything indented over here is going to be displayed

311
00:13:50.490 --> 00:13:53.220
in the container of the assistant chat message.

312
00:13:53.220 --> 00:13:55.320
So this is going to be the response of the LLM.

313
00:13:55.320 --> 00:13:57.900
So we want to run now our agent,

314
00:13:57.900 --> 00:14:00.150
and we want to execute it with the user input.

315
00:14:00.150 --> 00:14:03.600
So let me now add a try/except clause here.

316
00:14:03.600 --> 00:14:05.670
And this is because we're going to be running

317
00:14:05.670 --> 00:14:09.510
our RAG agent here and the RAG agent may fail.

318
00:14:09.510 --> 00:14:12.060
So if it fails, we still want to display something

319
00:14:12.060 --> 00:14:14.460
to the user and we don't want

320
00:14:14.460 --> 00:14:16.650
this entire application to crash.

321
00:14:16.650 --> 00:14:18.810
So here I'm going to be using the streamlit.error

322
00:14:18.810 --> 00:14:22.170
and the streamlit.exception to do some error handling.

323
00:14:22.170 --> 00:14:25.980
Let's now go and use a spinner object now.

324
00:14:25.980 --> 00:14:30.120
So this is going to be displaying under the chat message

325
00:14:30.120 --> 00:14:33.930
here a spinner, which is going to show the user

326
00:14:33.930 --> 00:14:35.610
something is happening.

327
00:14:35.610 --> 00:14:37.890
And we want to write the text, "Retrieving docs

328
00:14:37.890 --> 00:14:39.360
and generating answer."

329
00:14:39.360 --> 00:14:44.360
And here we want to run the run_llm function

330
00:14:44.370 --> 00:14:45.900
with the user prompt,

331
00:14:45.900 --> 00:14:49.230
and we are going to be getting a response back, right?

332
00:14:49.230 --> 00:14:51.810
So let me now show you how it's going to be looking.

333
00:14:51.810 --> 00:14:53.610
So let's go and write, "Hello."

334
00:14:53.610 --> 00:14:56.220
So here we can see the spinner, "Retrieving docs,"

335
00:14:56.220 --> 00:14:57.360
and we can see it ended

336
00:14:57.360 --> 00:14:59.700
because we got a response from the LLM.

337
00:14:59.700 --> 00:15:03.270
So let's go and extract this response.

338
00:15:03.270 --> 00:15:06.900
So I'm going to get the answer key from the response.

339
00:15:06.900 --> 00:15:09.210
If there is by some reason no answer,

340
00:15:09.210 --> 00:15:12.360
I'm simply going to be writing, "No answer returned."

341
00:15:12.360 --> 00:15:14.280
So here we're going to have the answer,

342
00:15:14.280 --> 00:15:16.590
and now we want to get the sources.

343
00:15:16.590 --> 00:15:20.190
So now we want to have a list of URLs we can display nicely.

344
00:15:20.190 --> 00:15:22.980
So I'm going to be using the _format_sources,

345
00:15:22.980 --> 00:15:25.410
and I'm going to give it the context key,

346
00:15:25.410 --> 00:15:28.560
which is going to hold all of the documents

347
00:15:28.560 --> 00:15:32.070
which grounded the answer, if we retrieved anything.

348
00:15:32.070 --> 00:15:34.500
All right, so now let's go first and let's go

349
00:15:34.500 --> 00:15:35.820
and show the answer.

350
00:15:35.820 --> 00:15:37.530
So let me refresh it.

351
00:15:37.530 --> 00:15:39.147
Let me write, "Hello."

352
00:15:40.710 --> 00:15:45.600
So we got here the answer and our function went through.

353
00:15:45.600 --> 00:15:48.300
And you notice here we didn't get any documents

354
00:15:48.300 --> 00:15:50.760
because we didn't ask anything about LangChain.

355
00:15:50.760 --> 00:15:54.180
And let's go now and print the sources like before.

356
00:15:54.180 --> 00:15:56.790
So if we have sources we want to show them.

357
00:15:56.790 --> 00:15:59.310
So we're going to be using the expander object,

358
00:15:59.310 --> 00:16:01.830
we're going to be giving you the title of "Sources"

359
00:16:01.830 --> 00:16:04.230
and we want to iterate over the sources

360
00:16:04.230 --> 00:16:06.930
and print them in Markdown as a list here.

361
00:16:06.930 --> 00:16:09.300
So this is going to be printing the sources.

362
00:16:09.300 --> 00:16:10.440
Let me now refresh it.

363
00:16:10.440 --> 00:16:13.197
Let me write, "Hello."

364
00:16:15.030 --> 00:16:15.863
We can see we are not printing it

365
00:16:15.863 --> 00:16:18.390
and this is because we didn't retrieve anything.

366
00:16:18.390 --> 00:16:22.917
Let's write now, "What are deep agents?"

367
00:16:24.300 --> 00:16:26.340
We can see it's taking a bit longer.

368
00:16:26.340 --> 00:16:30.360
And here we can see we have an answer about deep agents,

369
00:16:30.360 --> 00:16:32.850
and we can see we have all the sources here,

370
00:16:32.850 --> 00:16:36.210
and we can see we even have a duplicate source.

371
00:16:36.210 --> 00:16:38.280
This is something which happens sometimes

372
00:16:38.280 --> 00:16:40.380
and it's not a problem to fix it.

373
00:16:40.380 --> 00:16:42.750
We can simply remove the duplicates

374
00:16:42.750 --> 00:16:45.270
with some Python set objects.

375
00:16:45.270 --> 00:16:46.590
Maybe we'll do it later.

376
00:16:46.590 --> 00:16:48.870
And now, if I'm going to be sending a message,

377
00:16:48.870 --> 00:16:50.730
notice what's going to be happening.

378
00:16:50.730 --> 00:16:54.540
We can see our previous response disappeared,

379
00:16:54.540 --> 00:16:56.970
and this is because we forgot to save it

380
00:16:56.970 --> 00:16:58.110
in the session state.

381
00:16:58.110 --> 00:16:59.520
So let's go and do that.

382
00:16:59.520 --> 00:17:02.520
Let's add to the session_state the message,

383
00:17:02.520 --> 00:17:05.670
and now we have now the role of assistant

384
00:17:05.670 --> 00:17:08.280
because it's going to be the result of the agent.

385
00:17:08.280 --> 00:17:10.830
The content is going to be the answer of the agent,

386
00:17:10.830 --> 00:17:13.890
and source is going to come from the sources variable

387
00:17:13.890 --> 00:17:17.280
after we formatted them nicely as a list.

388
00:17:17.280 --> 00:17:19.207
All right, so let me write here,

389
00:17:19.207 --> 00:17:24.207
"Hello, what can you do for me?"

390
00:17:24.420 --> 00:17:28.117
And once we get a message, let me write

391
00:17:28.117 --> 00:17:31.110
"Deep agents, explain."

392
00:17:31.110 --> 00:17:35.280
And now we're going to see this text here,

393
00:17:35.280 --> 00:17:38.370
which it output us we're going to be saving.

394
00:17:38.370 --> 00:17:43.350
Boom, we get here an answer, and we get here the sources,

395
00:17:43.350 --> 00:17:47.040
and look up, we can see we still have the history,

396
00:17:47.040 --> 00:17:49.470
and this is because we added the response

397
00:17:49.470 --> 00:17:54.470
of the agent to the Streamlit session state right over here.

398
00:17:54.870 --> 00:17:56.520
And if we'll clear the chat,

399
00:17:56.520 --> 00:17:58.500
we can see we removed everything.

400
00:17:58.500 --> 00:18:01.080
Hello. So we start from a clean slate.

401
00:18:01.080 --> 00:18:04.200
So this is why we wanted this clear chat here.

402
00:18:04.200 --> 00:18:05.400
It's very convenient.

403
00:18:05.400 --> 00:18:07.200
All right, so you can find everything

404
00:18:07.200 --> 00:18:10.470
in the branch 3-frontend-finish.

405
00:18:10.470 --> 00:18:13.200
Let me write git add main.

406
00:18:13.200 --> 00:18:17.907
Let's write git commit -m "added frontend."

407
00:18:18.990 --> 00:18:20.430
Let me git push.

408
00:18:20.430 --> 00:18:24.150
And if you want to access it, you can simply go to the repo,

409
00:18:24.150 --> 00:18:26.430
go to the branches here,

410
00:18:26.430 --> 00:18:28.980
and here you have 3-frontend-finish.

411
00:18:28.980 --> 00:18:32.550
And you can see you have your main.py file.

412
00:18:32.550 --> 00:18:35.160
And here is everything we implemented.

413
00:18:35.160 --> 00:18:37.440
All right, so I hope you enjoyed this video

414
00:18:37.440 --> 00:18:42.440
and we built a very simple interface to run our RAG agent.

415
00:18:42.600 --> 00:18:44.850
Now of course this is not production grade.

416
00:18:44.850 --> 00:18:47.520
In fact, we didn't really reflect to the user

417
00:18:47.520 --> 00:18:49.320
what's happening in the agent.

418
00:18:49.320 --> 00:18:52.080
And when building AI agents, it's really important

419
00:18:52.080 --> 00:18:55.980
to communicate to the user what's happening with the agent,

420
00:18:55.980 --> 00:18:58.230
how they can trust the output of the agent,

421
00:18:58.230 --> 00:19:01.380
because we want to communicate what's the state,

422
00:19:01.380 --> 00:19:04.260
what's being executed, which tool is running right now,

423
00:19:04.260 --> 00:19:05.940
what is the result of the tool.

424
00:19:05.940 --> 00:19:08.250
So this is referred to as generative UI,

425
00:19:08.250 --> 00:19:10.980
and we're going to be covering it as well in this course.

426
00:19:10.980 --> 00:19:12.990
And right now we didn't really handle it.

427
00:19:12.990 --> 00:19:16.410
And in the next video I'm going to show you a better way,

428
00:19:16.410 --> 00:19:19.050
a bit more complex, but it's going to reflect

429
00:19:19.050 --> 00:19:20.760
what's happening in our agent,

430
00:19:20.760 --> 00:19:23.943
and this is going to be involving some generative UI.