WEBVTT

00:00.840 --> 00:02.160
Instructor: So let's have an overview

00:02.160 --> 00:06.150
of my take on the LLM applications landscape.

00:06.150 --> 00:09.960
I like to classify it into four main categories.

00:09.960 --> 00:12.930
The first one is companies or people

00:12.930 --> 00:17.040
that are writing LLM applications, which all boil down

00:17.040 --> 00:19.740
to a simple LLM call.

00:19.740 --> 00:24.210
Those applications slash features simply send inputs

00:24.210 --> 00:29.100
to the LLM, get a response back, maybe manipulate it a bit,

00:29.100 --> 00:31.620
but eventually display it to the user.

00:31.620 --> 00:32.880
And that's it.
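
NOTE A minimal sketch of the single-call pattern described above, in Python. The fake_llm function is a hypothetical stand-in for a real model API, and build_prompt and generate_story are illustrative names, not from any library.
```python
from typing import Callable, List
# Hypothetical stand-in for a real LLM client; swap in an actual API call.
def fake_llm(prompt: str) -> str:
    return f"Once upon a time... (story for prompt: {prompt})"
def build_prompt(subject: str, topics: List[str]) -> str:
    # Turn the user's inputs into a single prompt string.
    return f"Write a short children's story about {subject}. Include: {', '.join(topics)}."
def generate_story(subject: str, topics: List[str], llm: Callable[[str], str] = fake_llm) -> str:
    # One LLM call, light post-processing, then return the text for display.
    response = llm(build_prompt(subject, topics))
    return response.strip()
```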

00:32.880 --> 00:35.250
Very simple, not sophisticated,

00:35.250 --> 00:37.860
but a lot of the time it can bring a lot of value

00:37.860 --> 00:40.290
to customers and to users.

00:40.290 --> 00:44.070
For example, an application I liked is this application

00:44.070 --> 00:46.500
that creates children's stories.

00:46.500 --> 00:49.860
When we give it the subjects and the topics,

00:49.860 --> 00:51.840
it will send those to the LLM

00:51.840 --> 00:55.050
and create a children's story with cartoons

00:55.050 --> 00:58.950
and with pictures, and it's very cool in my opinion.

00:58.950 --> 01:02.550
But the overall implementation is pretty simple.

01:02.550 --> 01:06.450
More advanced LLM applications will incorporate some kind

01:06.450 --> 01:07.890
of vector store

01:07.890 --> 01:11.250
and use the retrieval-augmented generation pattern

01:11.250 --> 01:15.000
with semantic search in those vector stores in order

01:15.000 --> 01:17.460
to get the relevant chunks, the relevant data

01:17.460 --> 01:21.570
to answer super specific, domain-related questions.

01:21.570 --> 01:25.620
For example, an application that I like is called Quivr,

01:25.620 --> 01:28.470
and it's also called the second brain,

01:28.470 --> 01:31.110
where you simply dump all of your information,

01:31.110 --> 01:34.860
whether it's PDFs, whether it's databases,

01:34.860 --> 01:37.740
whether it's videos or chat history,

01:37.740 --> 01:41.640
and it simply indexes the information in a vector store

01:41.640 --> 01:45.240
and uses the RAG pattern

01:45.240 --> 01:49.380
with semantic search in order to QA over our data.

01:49.380 --> 01:51.690
So the idea behind this application is

01:51.690 --> 01:55.590
that you simply dump it all and then you can chat with it.
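
NOTE A minimal sketch of retrieval with semantic search as described above, in Python. As an assumption for illustration, a bag-of-words count vector stands in for a real embedding model, and ToyVectorStore and build_rag_prompt are hypothetical names, not a real library's API.
```python
import math
from collections import Counter
from typing import List
# Toy stand-in for an embedding model: a bag-of-words count vector (illustration only).
def embed(text: str) -> Counter:
    return Counter(text.lower().split())
def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
class ToyVectorStore:
    def __init__(self) -> None:
        self.items: list = []
    def add(self, chunk: str) -> None:
        # Index a chunk by storing it next to its vector.
        self.items.append((embed(chunk), chunk))
    def search(self, query: str, k: int = 1) -> List[str]:
        # Return the k chunks most similar to the query.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]
def build_rag_prompt(store: ToyVectorStore, question: str) -> str:
    # Put the retrieved chunk into the prompt that would be sent to the LLM.
    context = "\n".join(store.search(question, k=1))
    return f"Answer using only this context:\n{context}\nQuestion: {question}"
```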

01:55.590 --> 01:59.430
Now, if we want to take our application up a notch as far

01:59.430 --> 02:02.910
as LLM complexity, we can incorporate agents

02:02.910 --> 02:07.590
and leverage the LLM reasoning engine in order

02:07.590 --> 02:11.190
to run non-deterministic code

02:11.190 --> 02:15.330
and basically have an agent that will decide which tools

02:15.330 --> 02:17.883
to use when it's the most appropriate.

02:19.170 --> 02:21.420
An interesting use case I saw is

02:21.420 --> 02:24.600
from a cybersecurity company called Torq,

02:24.600 --> 02:29.250
where they created an agent called Socrates which

02:29.250 --> 02:34.250
resolves and remediates alerts with non-deterministic steps.

02:34.770 --> 02:37.920
So it reads the alert information

02:37.920 --> 02:40.410
and then decides how to remediate it

02:40.410 --> 02:43.891
by utilizing the security tooling already connected

02:43.891 --> 02:46.080
to the Torq Hyperautomation platform.

02:46.080 --> 02:49.050
Excellent example of how to utilize agents

02:49.050 --> 02:52.860
for real world issues like cybersecurity.
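
NOTE A minimal sketch of the tool-selection idea above, in Python. The tools and the keyword-based chooser are hypothetical stand-ins for an LLM's reasoning step; this is not Torq's implementation.
```python
from typing import Callable, Dict
# Hypothetical tools the agent may call; real agents would wrap APIs or scripts.
def lookup_ip(alert: str) -> str:
    return "ip reputation checked"
def isolate_host(alert: str) -> str:
    return "host isolated"
TOOLS: Dict[str, Callable[[str], str]] = {"lookup_ip": lookup_ip, "isolate_host": isolate_host}
def fake_llm_choose_tool(alert: str) -> str:
    # Stand-in for the LLM: a real agent would ask the model which tool name to use.
    return "isolate_host" if "malware" in alert.lower() else "lookup_ip"
def run_agent(alert: str, choose: Callable[[str], str] = fake_llm_choose_tool) -> str:
    # The chosen tool is not fixed in code; it depends on the chooser's decision.
    tool_name = choose(alert)
    if tool_name not in TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    return TOOLS[tool_name](alert)
```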

02:52.860 --> 02:57.000
Now, the last pattern I wanna talk about is combining agents

02:57.000 --> 03:00.030
and vector stores with semantic search.

03:00.030 --> 03:04.800
So projects like AutoGPT, like GPT Engineer,

03:04.800 --> 03:07.410
they incorporate vector stores

03:07.410 --> 03:11.160
to implement something called long-term memory

03:11.160 --> 03:13.890
and are using semantic search in order

03:13.890 --> 03:16.920
to achieve very advanced capabilities.

03:16.920 --> 03:21.120
Those capabilities can be mimicking human behavior,

03:21.120 --> 03:24.060
agents talking and interacting with each other

03:24.060 --> 03:26.520
and solving complex tasks.

03:26.520 --> 03:30.300
Now, all those projects, AutoGPT, GPT Engineer,

03:30.300 --> 03:33.960
BabyAGI, they're at the very, very beginning

03:33.960 --> 03:37.620
and they're pioneering what's called autonomous agents.

03:37.620 --> 03:40.740
So to summarize this session, my main point was

03:40.740 --> 03:44.730
that we can classify every LLM application today into one

03:44.730 --> 03:46.770
of those categories.

03:46.770 --> 03:49.320
My goal in this course was to teach you how

03:49.320 --> 03:51.600
to implement those patterns yourself.

03:51.600 --> 03:53.370
So we learned about agents

03:53.370 --> 03:55.350
and we learned about vector stores

03:55.350 --> 03:57.300
and how to interact with the LLMs

03:57.300 --> 03:59.520
and the theory behind it so you can go

03:59.520 --> 04:01.443
and build your own applications.