WEBVTT

00:00.880 --> 00:02.520
Hello everyone and welcome.

00:02.760 --> 00:06.600
In today's video we'll learn how to apply guardrails to images sent along with a prompt.

00:06.840 --> 00:09.960
So for that, I'll first go ahead and create a guardrail.

00:10.160 --> 00:13.120
I'll give it the name first link image guardrail.

00:13.360 --> 00:19.880
I'll skip the description since it's optional, and the message for blocked prompts will be:

00:19.880 --> 00:25.760
"Sorry, the model cannot answer this question with image." Then I'll apply the same blocked message

00:25.760 --> 00:27.240
for responses as well.

00:27.800 --> 00:29.960
I'll go ahead and click Next.

00:30.440 --> 00:34.960
Then we'll configure the content filters here for the harmful categories.

00:35.320 --> 00:42.120
We saw this in our previous video, where we had hate, insults, sexual, violence, and misconduct for text.

00:42.120 --> 00:46.480
Amazon Bedrock has now introduced image support for the same categories as well.

00:46.720 --> 00:53.920
So if an image contains harmful content in any of these categories, the guardrail

00:53.920 --> 00:55.560
will block that as well.

00:56.240 --> 00:57.920
The strength is set all the way to high.

00:58.120 --> 01:01.040
And then I'll also go ahead and enable the prompt attack filter.

01:01.440 --> 01:02.040
Click Next.

01:02.040 --> 01:06.360
I'm going to skip all the other guardrail options

01:06.360 --> 01:12.950
that cover different categories and features, because our focus is going to be mostly

01:12.950 --> 01:13.910
on images.

01:14.190 --> 01:17.790
So I'll go ahead and skip the profanity filter.

01:17.790 --> 01:23.670
I'll skip the custom words and phrases, the PII, and the regex patterns.

01:24.110 --> 01:26.510
I'll also skip the contextual grounding check.

01:26.670 --> 01:31.990
So all we have enabled is the harmful categories filter, for both text and images.

01:33.430 --> 01:35.950
So now I'll go ahead and create the guardrail.

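NOTE
The console configuration above can be sketched with boto3 roughly as follows.
This is a minimal sketch under assumptions, not the presenter's code: the helper
names, the hyphenated guardrail name, and the region are hypothetical, and image
support for every listed category simply follows what the walkthrough describes.
```python
from typing import Any
# Harmful categories named in the walkthrough, all set to HIGH strength.
HARMFUL_CATEGORIES = ["HATE", "INSULTS", "SEXUAL", "VIOLENCE", "MISCONDUCT"]
# Blocked message quoted in the video, reused for prompts and responses.
BLOCKED_MESSAGE = "Sorry, the model cannot answer this question with image."
def build_guardrail_request(name: str) -> dict[str, Any]:
    """Build keyword arguments for the Bedrock CreateGuardrail operation."""
    filters: list[dict[str, Any]] = [
        {
            "type": category,
            "inputStrength": "HIGH",
            "outputStrength": "HIGH",
            # Evaluate both text and images for this category.
            "inputModalities": ["TEXT", "IMAGE"],
            "outputModalities": ["TEXT", "IMAGE"],
        }
        for category in HARMFUL_CATEGORIES
    ]
    # Prompt attack applies to input only, so its output strength is NONE.
    filters.append(
        {"type": "PROMPT_ATTACK", "inputStrength": "HIGH", "outputStrength": "NONE"}
    )
    return {
        "name": name,
        "blockedInputMessaging": BLOCKED_MESSAGE,
        "blockedOutputsMessaging": BLOCKED_MESSAGE,
        "contentPolicyConfig": {"filtersConfig": filters},
    }
def create_guardrail(name: str, region: str = "us-east-1") -> str:
    """Create the guardrail and return its ID (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3
    client = boto3.client("bedrock", region_name=region)
    response = client.create_guardrail(**build_guardrail_request(name))
    return response["guardrailId"]
```
Calling create_guardrail("first-link-image-guardrail") would create the working
draft; the name is hyphenated here because guardrail names cannot contain spaces.
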
01:35.950 --> 01:42.070
In the test playground here, I have to select a model. If the prompt passes the guardrail,

01:42.550 --> 01:45.350
the model is then used to generate the response.

01:45.710 --> 01:52.630
It's important to note that to send an image, you have to select the option that says Use

01:52.750 --> 01:54.270
ApplyGuardrail API.

01:54.630 --> 02:00.630
And for the input here you have prompts, uploaded text, images and content to be sent to the model.

02:00.870 --> 02:02.750
Now I'll go ahead and upload the image.

02:03.390 --> 02:05.510
So here I have selected this image.

02:05.630 --> 02:08.830
This is a text image that says example text.

02:08.870 --> 02:10.390
I'll just click Confirm.

02:10.710 --> 02:14.950
So now if you go ahead and check the image, it's a text image.

02:15.150 --> 02:17.590
And now I'm going to pass the content here.

02:17.750 --> 02:21.270
"How can I find the taxes?" I'll run this. Okay.

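NOTE
Sending a prompt together with an image, as done in the playground here, can be
sketched with the ApplyGuardrail API in boto3. This is a minimal sketch under
assumptions: the helper names, the DRAFT version default, the region, and the
file path argument are hypothetical and do not come from the video.
```python
from typing import Any
def build_apply_guardrail_request(
    guardrail_id: str,
    prompt: str,
    image_bytes: bytes,
    image_format: str = "png",
    version: str = "DRAFT",
) -> dict[str, Any]:
    """Build keyword arguments for ApplyGuardrail with text plus one image."""
    if image_format not in ("png", "jpeg"):
        raise ValueError("image_format must be 'png' or 'jpeg'")
    return {
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": version,
        # INPUT means the content is checked as a user prompt, not a response.
        "source": "INPUT",
        "content": [
            {"text": {"text": prompt}},
            {"image": {"format": image_format, "source": {"bytes": image_bytes}}},
        ],
    }
def check_prompt_with_image(
    guardrail_id: str, prompt: str, image_path: str, region: str = "us-east-1"
) -> dict[str, Any]:
    """Read a local PNG and evaluate it with the prompt (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3
    with open(image_path, "rb") as handle:
        image_bytes = handle.read()
    client = boto3.client("bedrock-runtime", region_name=region)
    request = build_apply_guardrail_request(guardrail_id, prompt, image_bytes)
    return client.apply_guardrail(**request)
```
The response carries an action field, which is NONE when the input passes and
GUARDRAIL_INTERVENED when the guardrail blocks it.
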
02:21.740 --> 02:25.300
So there was no problem with this input and the image.

02:25.300 --> 02:27.820
Let's go ahead and check the trace here.

02:27.860 --> 02:30.820
No issues, no action was taken.

02:30.980 --> 02:32.260
This was all good.

02:32.540 --> 02:36.660
Now let's go back and use the very same image with a different text.

02:36.780 --> 02:40.020
If you notice here, you can upload it from your local machine,

02:40.260 --> 02:45.980
or you can use an S3 bucket to upload the image. I'm uploading from my local machine.

02:45.980 --> 02:47.420
So I'll go ahead and confirm.

02:47.740 --> 02:53.380
This time I'll ask the question, "What are the loopholes for doing this?" Right.

02:53.700 --> 02:56.820
And then we have the file here, which is the text image.

02:57.060 --> 02:58.340
Go ahead and run this.

02:59.220 --> 02:59.900
All right.

03:00.420 --> 03:02.980
So now there was a guardrail action taken.

03:02.980 --> 03:04.060
The guardrail intervened.

03:04.060 --> 03:08.460
And it said, "Sorry, the model cannot answer this question with image."

03:08.700 --> 03:10.100
Let's look at the trace here.

03:10.100 --> 03:16.500
And if you notice, it's categorized as a content filter, and the test result is blocked.

03:16.500 --> 03:20.940
The details here show that misconduct was detected, and the strength is high.

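NOTE
The trace details read out here (content filter, blocked, misconduct, high) map
onto fields of an ApplyGuardrail response. The sketch below is an assumption-laden
illustration: summarize_guardrail_response is a hypothetical helper, and the
sample dictionary is hand-written to mirror the trace, not captured output.
```python
from typing import Any
def summarize_guardrail_response(response: dict[str, Any]) -> dict[str, Any]:
    """Pull the action, blocked message, and content filter hits from a response."""
    detections: list[dict[str, Any]] = []
    for assessment in response.get("assessments", []):
        # Content filter results live under contentPolicy.filters.
        for hit in assessment.get("contentPolicy", {}).get("filters", []):
            detections.append(
                {
                    "type": hit.get("type"),
                    "strength": hit.get("filterStrength"),
                    "action": hit.get("action"),
                }
            )
    return {
        "intervened": response.get("action") == "GUARDRAIL_INTERVENED",
        "messages": [out.get("text", "") for out in response.get("outputs", [])],
        "detections": detections,
    }
# Hand-written sample shaped like the blocked request in the video.
sample_response = {
    "action": "GUARDRAIL_INTERVENED",
    "outputs": [{"text": "Sorry, the model cannot answer this question with image."}],
    "assessments": [
        {
            "contentPolicy": {
                "filters": [
                    {"type": "MISCONDUCT", "filterStrength": "HIGH", "action": "BLOCKED"}
                ]
            }
        }
    ],
}
summary = summarize_guardrail_response(sample_response)
```
For the first request, where no action was taken, the same helper would report
intervened as False with an empty detections list.
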
03:22.020 --> 03:28.700
So Amazon Bedrock now offers the ability to detect harmful intent when an image is sent together with text content.

03:29.020 --> 03:29.580
All right.

03:29.820 --> 03:30.940
Thank you so much.

03:30.940 --> 03:32.860
I'll see you in the next video.