WEBVTT

00:01.080 --> 00:08.320
Hello everyone, and welcome to the course AI Guardrails, where you will learn different techniques

00:08.720 --> 00:12.200
to guardrail a generative AI application.

00:13.200 --> 00:22.480
We will use models, frameworks, and platforms to implement guardrails on a GenAI application.

00:24.320 --> 00:29.720
So now let's go ahead and understand how a basic GenAI application works.

00:32.400 --> 00:36.640
So here is the 10,000-foot view of a GenAI application.

00:37.760 --> 00:43.600
There are users and they provide input to the foundation model.

00:44.480 --> 00:51.520
The foundation model, in return, generates outputs and sends them back to the users.

00:53.320 --> 01:01.160
This is a very simple and straightforward way to understand how the interaction between a user and a

01:01.160 --> 01:02.680
foundation model works.

01:03.680 --> 01:16.130
However, there are challenges. We need to detect and mitigate malicious user inputs and malicious foundation

01:16.130 --> 01:21.010
model outputs using input and output guardrails.

01:23.090 --> 01:31.290
Understanding the flow from a user interaction to model output is crucial for identifying where security

01:31.290 --> 01:36.770
vulnerabilities can emerge and where guardrails must be implemented.

01:38.610 --> 01:45.130
That brings us to the next topic: user input guardrails.

01:47.090 --> 01:54.970
User input guardrails intercept and analyze user prompts before they reach foundation models to prevent

01:55.010 --> 01:58.530
malicious inputs and inappropriate content.

02:00.330 --> 02:06.050
This includes toxic language and hateful or violent content.

02:07.210 --> 02:10.070
Now let's go back to our original flow.

02:11.150 --> 02:16.190
That is, the user sends a request to the foundation model.

02:17.710 --> 02:27.630
So now user input guardrails intercept the user request to detect prompt injection.

02:30.230 --> 02:34.830
They intercept the request and check for potential issues.

02:37.150 --> 02:47.270
Content moderation is another check, where user inputs are intercepted and screened for inappropriate content

02:47.630 --> 02:52.670
to make sure that the foundation models get clean requests.

02:53.870 --> 03:01.670
We will explore both of these detections in detail during the course.

03:03.070 --> 03:07.830
Now let's talk about foundation model response guardrails.

03:10.830 --> 03:13.400
So foundation model response guardrails

03:14.320 --> 03:23.920
analyze generated responses to ensure accuracy, relevance, and safety before they are delivered

03:23.920 --> 03:24.760
to the users.

03:26.320 --> 03:34.520
Now let's go back to our original diagram, where the user sends a request to the foundation model.

03:36.040 --> 03:41.880
When a foundation model generates a response, we must check it for hallucination,

03:43.560 --> 03:46.640
answer relevancy, and content moderation.

03:47.960 --> 03:57.280
These steps help ensure that the output is accurate, appropriate, and does not have content that

03:57.280 --> 03:58.040
is

04:00.240 --> 04:00.960
toxic.

04:02.320 --> 04:09.960
We will take a deep dive into each one of these in our course, and learn about the techniques to implement

04:10.520 --> 04:14.760
user input guardrails and foundation model response guardrails.

04:15.440 --> 04:16.080
Thank you.

04:16.920 --> 04:17.600
I'll see you in the.