WEBVTT

00:00.800 --> 00:02.560
Hello everyone and welcome.

00:02.880 --> 00:05.400
In this video we will learn about Garrick.

00:05.840 --> 00:09.360
Garrick is an vulnerability scanner tool.

00:09.840 --> 00:15.520
Before we dive deeper into Garrick, let's first understand what is vulnerability.

00:16.080 --> 00:19.400
How list about the vulnerability is here.

00:19.920 --> 00:22.160
First one is LM security.

00:22.480 --> 00:29.160
It is the investigation of the failure modes of LM in use, the conditions that lead to them and their

00:29.160 --> 00:30.280
mitigations.

00:30.320 --> 00:32.280
Unexpected behavior.

00:32.640 --> 00:37.840
Large language models can fail to operate as expected in number of ways.

00:38.400 --> 00:39.920
Sample hallucination.

00:40.280 --> 00:45.520
This means it's an insecure behavior or unwanted behavior that leads to insecurity.

00:46.280 --> 00:52.200
Software vulnerability LM runs on softwares like PyTorch, Cuda.

00:52.560 --> 00:56.800
These softwares and the underlying operating system can be insecure.

00:57.480 --> 00:59.200
LM interactions.

00:59.240 --> 01:03.840
Human interactions with LM can result in unwanted output.

01:04.360 --> 01:11.150
Examples of these interactions and malicious inputs are prompt injections and jailbreak attempts.

01:11.750 --> 01:15.710
These are some of the LLM vulnerability that we have listed here.

01:15.990 --> 01:21.310
However, there can be a lot more that we would uncover during this section.

01:21.910 --> 01:24.590
Let's move on to the next topic, which is Garrick.

01:24.870 --> 01:25.790
What is Garrick?

01:25.790 --> 01:29.830
Garrick finds vulnerability in the L.l.m based tech.

01:30.270 --> 01:36.870
Garrick works with foundation models and any tech using them to determine where the security holes can

01:36.870 --> 01:37.270
be.

01:37.270 --> 01:45.110
With each solution, Garrick identifies LLM failures with of different plugins, probes, and many challenging

01:45.110 --> 01:45.990
probes.

01:46.190 --> 01:50.030
Garrick tries to explore many different LLM failure modes.

01:50.030 --> 01:52.270
It also has a very good reporting to.

01:52.310 --> 01:59.270
Once Garrick finds something, the exact prompt, the goal, and the response is reported.

01:59.830 --> 02:02.870
Let's move on to the next topic of Garrick, which is Garrick.

02:02.870 --> 02:03.670
Features.

02:04.150 --> 02:12.660
It supports and provides security feature Garrick Specifically focuses on risks that are inherent to

02:12.700 --> 02:20.860
LLM deployment, such as prompt injections, jailbreaks, guardrail bypass, text replace, and many

02:20.860 --> 02:21.340
more.

02:21.860 --> 02:24.540
You can automate the scanning using Garrick.

02:25.020 --> 02:28.940
Garrick has a range of probes that does not need supervision.

02:29.300 --> 02:34.860
It will run each of these probes over the model and manage it with appropriate detectors.

02:35.300 --> 02:37.100
It supports different law.

02:38.220 --> 02:40.580
Garrick supports a ton of llms.

02:40.620 --> 02:47.900
That includes OpenAI hugging face, cohere platforms as well as custom Python integrations.

02:48.380 --> 02:50.060
Let's move on to the next topic.

02:50.060 --> 02:51.940
That is, what are components of Garrick.

02:52.260 --> 02:56.740
Very first one that I have listed here is Vulnerability Probe.

02:57.100 --> 02:58.860
Garrick has a range of probes.

02:59.100 --> 03:01.940
It will run each of these probes over the model.

03:02.300 --> 03:09.460
Each probe defines number of ways of testing a generator, which is typically a large language model

03:09.460 --> 03:13.260
on a specific vulnerability or a failure mode.

03:13.740 --> 03:16.080
Another component is generators.

03:16.480 --> 03:23.080
Generators wrap a set of ways of interfacing with a dialog system or a large language model.

03:23.440 --> 03:25.400
Then there are detectors.

03:25.800 --> 03:30.640
The detectors and Garrick automatically detects language model failures.

03:31.080 --> 03:32.760
Some look for key words.

03:32.840 --> 03:33.600
Other uses.

03:33.600 --> 03:36.800
Machine learning classifiers to judge outputs.

03:37.320 --> 03:45.920
And then there is harness that manages the entire workflow harness connects the generator and the probes

03:45.920 --> 03:50.920
and conduct the actual vulnerability scanned through the detectors.

03:51.760 --> 03:59.240
Right now that we learned about Garrick in different vulnerabilities that exist, let's do hands on

03:59.240 --> 04:06.520
activity with Garrick, install Garrick, and find LMDh vulnerabilities on some of the very popular

04:06.520 --> 04:12.240
models like GPT 3.5 on open AI platform.

04:12.600 --> 04:13.360
Thank you.

04:13.800 --> 04:15.920
I'll see you in the next video.