WEBVTT

00:01.080 --> 00:02.160
Hello everyone!

00:02.480 --> 00:11.840
In this video we will create custom PII entity recognizers for the Presidio analyzer.

00:12.480 --> 00:18.560
Presidio has a set of predefined entity recognizers that we used in our previous video.

00:19.400 --> 00:28.920
However, you can also create custom recognizers without changing the analyzer's base code, right?

00:29.520 --> 00:38.960
So to do that, let's first install the basic Python packages.

00:45.960 --> 00:54.480
It does take a few minutes to download and install the Presidio packages, so I'll pause the video.

00:56.240 --> 00:57.520
So now it's done.

00:57.520 --> 01:03.320
Let's go ahead and import the classes from the package.

01:09.750 --> 01:19.630
The classes that we need are AnalyzerEngine and PatternRecognizer.

01:20.630 --> 01:21.070
Right.

01:24.790 --> 01:30.150
Now, I'll go ahead and use the text that we want to analyze.

01:37.750 --> 01:38.190
Okay.

01:39.430 --> 01:42.510
His name is Mr. Jones, and his phone number

01:42.510 --> 01:44.430
is this, right?

01:44.750 --> 01:52.150
So now let's initialize the analyzer with AnalyzerEngine.

01:57.510 --> 02:04.270
Then let's create a PatternRecognizer for recognizing the title.

02:06.310 --> 02:09.190
And name the variable title_recognizer.

02:17.580 --> 02:20.300
The entity name for this is going to be TITLE.

02:22.780 --> 02:27.020
And the deny list would be

02:31.660 --> 02:32.340
Mr.

02:35.620 --> 02:36.700
and Mrs.

02:40.140 --> 02:51.700
Right, now let's go ahead and add this particular recognizer to the registry.

03:05.020 --> 03:08.700
And now let's go ahead and analyze the text.

03:18.490 --> 03:23.770
The text that we want to analyze is the text to recognize.

03:24.690 --> 03:28.410
The entity that we want to recognize is TITLE.

03:31.970 --> 03:34.410
And the language is going to be English.

03:44.210 --> 03:47.090
Right, now let's go ahead and execute this.

03:52.610 --> 03:52.930
All right.

03:52.930 --> 03:59.170
So now here are the results from the analyzer.

03:59.210 --> 04:09.850
It did identify the title as an entity that starts at character position 12 and ends at 15.

04:10.370 --> 04:20.170
And the score is 1, which is a very high confidence level, because it exactly matches an entry in

04:20.170 --> 04:21.050
the deny list.

04:22.730 --> 04:23.050
All right.

04:23.050 --> 04:23.890
Thank you so much.
