1
00:00:00,220 --> 00:00:02,880
AI applications can be quite autonomous with

2
00:00:02,880 --> 00:00:04,420
their responses.

3
00:00:04,420 --> 00:00:06,400
And while we can define an output structure

4
00:00:06,400 --> 00:00:09,300
and pass several instructions in our prompts,

5
00:00:09,300 --> 00:00:11,960
some guardrails are best implemented at the

6
00:00:11,960 --> 00:00:13,220
root level.

7
00:00:13,220 --> 00:00:15,820
In this video, we'll be implementing safety

8
00:00:15,820 --> 00:00:18,440
settings that will ensure that our application

9
00:00:18,440 --> 00:00:21,060
does not generate any harmful responses.

10
00:00:21,060 --> 00:00:23,440
We don't want our application generating content

11
00:00:23,440 --> 00:00:26,120
that becomes offensive to its users,

12
00:00:26,120 --> 00:00:29,360
especially since it's a family-friendly application.

13
00:00:29,360 --> 00:00:32,240
Luckily for us, Gemini provides the tools

14
00:00:32,240 --> 00:00:33,540
to do that.

15
00:00:33,540 --> 00:00:35,360
To implement this, we're going to be defining

16
00:00:35,360 --> 00:00:37,460
some general safety settings that we'll be

17
00:00:37,460 --> 00:00:40,480
applying to all our requests.

18
00:00:40,480 --> 00:00:43,140
We'll be using Gemini's standard harm categories

19
00:00:43,140 --> 00:00:46,640
and setting a threshold for each of them.

20
00:00:46,640 --> 00:00:48,860
This will ensure that Gemini blocks any malicious

21
00:00:48,860 --> 00:00:51,960
query trying to trigger harmful responses and

22
00:00:51,960 --> 00:00:54,760
also prevents the model from recklessly generating

23
00:00:54,760 --> 00:00:56,040
such content.

24
00:00:56,040 --> 00:01:00,740
So let us begin by writing our safety settings.

25
00:01:00,740 --> 00:01:04,739
We are going to come down under the model

26
00:01:04,739 --> 00:01:08,160
definitions and here we'll begin writing

27
00:01:08,160 --> 00:01:09,720
our safety settings.

28
00:01:09,720 --> 00:01:11,640
We need to have them in a list so I'm going

29
00:01:11,640 --> 00:01:18,480
to create a variable called safetySettingsList.

30
00:01:18,480 --> 00:01:22,040
Set that to a list.

31
00:01:22,040 --> 00:01:23,940
Now the first category we're going to be dealing

32
00:01:23,940 --> 00:01:25,840
with is hate speech.

33
00:01:25,840 --> 00:01:31,280
So we're going to say types.SafetySetting,

34
00:01:31,280 --> 00:01:38,200
which takes category and threshold parameters.

35
00:01:38,200 --> 00:01:46,640
So we set category first, to types.HarmCategory.

36
00:01:46,640 --> 00:01:49,580
And like I said, we want to deal with hate

37
00:01:49,580 --> 00:01:50,260
speech.

38
00:01:50,260 --> 00:01:54,380
So HARM_CATEGORY_HATE_SPEECH.

39
00:01:54,380 --> 00:01:55,720
Good.

40
00:01:55,720 --> 00:01:57,180
Now the threshold we'll be setting for that

41
00:01:57,180 --> 00:02:01,100
is that we want to block all hate speech.

42
00:02:01,100 --> 00:02:06,960
So the threshold will be types.HarmBlockThreshold

43
00:02:06,960 --> 00:02:15,960
and we want to block low and above.

44
00:02:15,960 --> 00:02:17,120
Good.

45
00:02:17,120 --> 00:02:19,740
The next one we're going to be doing is for

46
00:02:19,740 --> 00:02:20,500
harassment.

47
00:02:20,500 --> 00:02:25,140
So I'm just going to copy this
 
48
00:02:25,140 --> 00:02:29,060
here, and here we're going to select the category

49
00:02:29,060 --> 00:02:31,620
for harassment.

50
00:02:31,620 --> 00:02:32,940
See?

51
00:02:32,940 --> 00:02:34,100
Harassment.

52
00:02:34,100 --> 00:02:38,040
And we're also going to be blocking everything.

53
00:02:38,040 --> 00:02:39,440
Lastly, we're going to deal

54
00:02:39,440 --> 00:02:43,160
with sexually explicit content.

55
00:02:43,160 --> 00:02:47,820
So we'll say harm category, sexually explicit,

56
00:02:47,820 --> 00:02:48,140
and

57
00:02:48,140 --> 00:02:50,700
we're also going to be blocking everything.

58
00:02:50,700 --> 00:02:53,000
So we're keeping the entire

59
00:02:53,000 --> 00:02:54,740
application PG.

60
00:02:54,740 --> 00:02:57,320
Now that we have our safety settings, let

61
00:02:57,320 --> 00:02:58,460
us add them to our

62
00:02:58,460 --> 00:02:59,460
requests.

63
00:02:59,460 --> 00:03:02,320
This can be done by adding a safety settings

64
00:03:02,320 --> 00:03:03,640
parameter to our

65
00:03:03,640 --> 00:03:04,960
generation config.

66
00:03:04,960 --> 00:03:06,900
So let us do that.

67
00:03:06,900 --> 00:03:13,080
Let's scroll down to our first endpoint.

68
00:03:13,080 --> 00:03:18,980
And down here, we can set the safety settings

69
00:03:18,980 --> 00:03:25,800
parameter, and that can be set to our safety

70
00:03:25,800 --> 00:03:29,320
– let me just copy that, let's go back up

71
00:03:29,320 --> 00:03:36,340
– safety settings list.

72
00:03:36,340 --> 00:03:36,840
Good.

73
00:03:36,840 --> 00:03:38,280
Good.

74
00:03:38,280 --> 00:03:40,500
Finally, we need to check our response to

75
00:03:40,500 --> 00:03:43,580
see if there was a block so that we can handle

76
00:03:43,580 --> 00:03:44,720
it.

77
00:03:44,720 --> 00:03:47,040
In any case where any of this type of language

78
00:03:47,040 --> 00:03:49,500
or generation is noticed, Gemini is going

79
00:03:49,500 --> 00:03:50,400
to automatically block it.

80
00:03:50,400 --> 00:03:52,720
So we need to know when our response is blocked

81
00:03:52,720 --> 00:03:55,040
so that we can handle it appropriately on the

82
00:03:55,040 --> 00:03:56,200
back end.

83
00:03:56,200 --> 00:03:58,780
In this case, we will simply add an if-else

84
00:03:58,780 --> 00:04:01,900
statement to our return value and print out

85
00:04:01,900 --> 00:04:04,260
something to the console.

86
00:04:04,260 --> 00:04:11,300
So up here, I'm just going to say if response.candidates,

87
00:04:11,300 --> 00:04:19,700
that is if there's a response, and if response.candidates,

88
00:04:19,700 --> 00:04:22,940
the very first one, since for now it only
 
89
00:04:22,940 --> 00:04:28,120
returns one, if the finish reason is

90
00:04:28,120 --> 00:04:32,760
equal to types.FinishReason,

91
00:04:32,760 --> 00:04:36,640
I need a double equal there because it's comparison.

92
00:04:36,640 --> 00:04:42,780
So if it's double equal to types.FinishReason

93
00:04:42,780 --> 00:04:49,840
and if that FinishReason is SAFETY, then

94
00:04:49,840 --> 00:04:52,720
I know that harmful content has just been

95
00:04:52,720 --> 00:04:55,620
blocked and for that we're just going to print

96
00:04:55,620 --> 00:05:04,880
and say "Response blocked due to safety filter."

97
00:05:04,880 --> 00:05:07,140
So our safety filters have helped us block

98
00:05:07,140 --> 00:05:10,680
a particular response.

99
00:05:10,680 --> 00:05:14,840
Otherwise, we just have business as usual,

100
00:05:14,840 --> 00:05:16,820
which would be these guys.

101
00:05:16,820 --> 00:05:19,480
I'm just going to indent them in,

102
00:05:19,480 --> 00:05:23,220
and we'll just return our normal results.

103
00:05:23,220 --> 00:05:24,780
Now, with this in place, our application is

104
00:05:24,780 --> 00:05:25,660
safeguarded

105
00:05:25,660 --> 00:05:27,900
from generating harmful responses.

106
00:05:27,900 --> 00:05:29,660
Now, this might be a bit difficult to test

107
00:05:29,660 --> 00:05:33,620
without writing something harmful in the request,

108
00:05:33,620 --> 00:05:34,940
but rest assured that your application

109
00:05:34,940 --> 00:05:35,820
is now protected

110
00:05:35,820 --> 00:05:38,120
from returning offensive responses,

111
00:05:38,120 --> 00:05:40,420
according to the Gemini API's definition

112
00:05:40,420 --> 00:05:44,280
of what counts as an offensive response.

113
00:05:44,280 --> 00:05:45,700
Now, as your own work,

114
00:05:45,700 --> 00:05:48,720
try implementing this for our image search

115
00:05:48,720 --> 00:05:50,040
endpoint.

116
00:05:50,040 --> 00:05:52,320
Our getWeather endpoint might not need this

117
00:05:52,320 --> 00:05:54,240
because it uses a function.

118
00:05:54,240 --> 00:05:55,900
And the function basically defines what is

119
00:05:55,900 --> 00:05:56,400
returned.

120
00:05:56,400 --> 00:05:59,640
So there's a very low, almost zero, probability

121
00:05:59,640 --> 00:06:01,840
of anything offensive coming from there.

122
00:06:01,840 --> 00:06:04,840
But yeah, just take a shot at the suggest

123
00:06:04,840 --> 00:06:06,740
by image endpoint

124
00:06:06,740 --> 00:06:10,480
and also ensure that the app is safeguarded

125
00:06:10,480 --> 00:06:12,000
on that front.
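
The if-else check described in this video could be sketched roughly as follows. This is only a minimal illustration, not the code written on screen: the helper names `blocked_by_safety` and `handle_response` are hypothetical, and the stand-in objects at the bottom replace a real Gemini response so the sketch runs without the SDK or a network call. In the actual application, the comparison would be against `types.FinishReason.SAFETY`, as described above.

```python
from types import SimpleNamespace


def blocked_by_safety(response) -> bool:
    """Return True when the first candidate stopped for a safety reason."""
    candidates = getattr(response, "candidates", None)
    if not candidates:
        # No candidates at all: nothing to inspect, so not flagged here.
        return False
    reason = candidates[0].finish_reason
    # Accept either an enum member (compare its name) or a plain string.
    return getattr(reason, "name", reason) == "SAFETY"


def handle_response(response) -> dict:
    """Hypothetical endpoint helper: log a block, otherwise return the text."""
    if blocked_by_safety(response):
        print("Response blocked due to safety filter.")
        return {"blocked": True, "text": None}
    # Business as usual: pass the generated text through.
    return {"blocked": False, "text": response.text}


# Stand-in responses (assumptions for illustration, not real SDK objects).
blocked = SimpleNamespace(
    candidates=[SimpleNamespace(finish_reason="SAFETY")], text=None
)
normal = SimpleNamespace(
    candidates=[SimpleNamespace(finish_reason="STOP")], text="Hello!"
)

blocked_result = handle_response(blocked)
normal_result = handle_response(normal)
```

Keeping the check in one small function means the same guard can be reused when you try this on the image endpoint.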