WEBVTT

1
00:00:01.210 --> 00:00:03.950
Let's see the same example as we just

2
00:00:03.950 --> 00:00:07.870
saw with searching using RAG, but in a

3
00:00:07.870 --> 00:00:09.050
slightly different manner.

4
00:00:10.070 --> 00:00:12.070
So let's just run it first, and then

5
00:00:12.070 --> 00:00:14.370
we will talk about it.

6
00:00:15.810 --> 00:00:18.230
So in this case, I'm going to say

7
00:00:18.230 --> 00:00:18.950
hello to it.

8
00:00:20.870 --> 00:00:22.830
It just comes back and says hello.

9
00:00:22.990 --> 00:00:25.010
It doesn't do a search, and that's because

10
00:00:25.010 --> 00:00:27.010
I put the search as a tool now,

11
00:00:27.210 --> 00:00:36.250
so only when we search for something, it

12
00:00:36.250 --> 00:00:38.050
will go and do the search on the

13
00:00:38.050 --> 00:00:40.350
fly and give us the answer back.

14
00:00:41.290 --> 00:00:44.610
So the way this is done is we

15
00:00:44.610 --> 00:00:47.530
have exactly the same as we had before.

16
00:00:47.730 --> 00:00:50.390
That has not changed, but we have defined

17
00:00:50.390 --> 00:00:52.230
a search tool down here, and we have

18
00:00:52.230 --> 00:00:56.030
moved all the search we put in here

19
00:00:56.030 --> 00:00:58.730
before out into the search tool.

20
00:00:58.730 --> 00:01:00.290
So now I have a search tool with

21
00:01:00.290 --> 00:01:04.570
an input, and it just finds the most similar

22
00:01:04.570 --> 00:01:07.310
knowledge, which we now return from the search

23
00:01:07.310 --> 00:01:09.570
tool instead of putting it into the messages.

24
00:01:12.240 --> 00:01:14.900
And then we have given this the tool

25
00:01:14.900 --> 00:01:17.820
of being able to search the knowledge, and

26
00:01:17.820 --> 00:01:20.540
we have instructed it to use that tool.

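NOTE
Editor's sketch of the search tool described in the cues above, not the code shown on screen.
The knowledge entries, the word-overlap similarity (a stand-in for real embeddings), the names
search_knowledge and SEARCH_TOOL, and the schema shape are all assumptions for illustration.
```python
# Hypothetical knowledge base; the course's real data is not in the transcript.
KNOWLEDGE = [
    "The office Wi-Fi network is called ExampleNet.",
    "The kitchen is cleaned every Friday.",
]
def similarity(a, b):
    # Stand-in for embedding similarity: Jaccard overlap of lowercase words.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0
def search_knowledge(query):
    # The tool returns the most similar entry instead of adding it to the messages.
    return max(KNOWLEDGE, key=lambda text: similarity(query, text))
# Tool description handed to the model; the exact shape depends on the SDK in use.
SEARCH_TOOL = {
    "name": "search_knowledge",
    "description": "Search the internal knowledge base.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}
```
The model only sees SEARCH_TOOL; the search itself runs when the model asks for it.
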
27
00:01:23.680 --> 00:01:26.320
This has benefits and drawbacks.

28
00:01:29.020 --> 00:01:31.720
So one of the benefits is if we

29
00:01:31.720 --> 00:01:35.500
ask nothing about the knowledge base, we don't

30
00:01:35.500 --> 00:01:37.580
spend extra tokens answering back.

31
00:01:37.700 --> 00:01:40.720
So pleasantries like hello here don't cost

32
00:01:40.720 --> 00:01:41.300
us extra.

33
00:01:42.800 --> 00:01:46.140
But if you have a very weak model,

34
00:01:46.580 --> 00:01:50.700
or if you ask in a slightly awkward

35
00:01:50.700 --> 00:01:55.040
way, like for example, what is the

36
00:01:55.040 --> 00:01:55.300
Wi-Fi?

37
00:01:59.360 --> 00:02:02.640
You can now see that it doesn't really

38
00:02:02.640 --> 00:02:05.360
understand that it needs to call the tool.

39
00:02:05.920 --> 00:02:08.000
So the risk you're running here is that

40
00:02:08.000 --> 00:02:11.400
you end up not calling the tool.

41
00:02:12.240 --> 00:02:14.680
And we can begin to prompt our way

42
00:02:14.680 --> 00:02:17.300
out of that, and we can say always

43
00:02:18.610 --> 00:02:22.150
use the search tool.

44
00:02:22.150 --> 00:02:29.850
But if we do that, what is the

45
00:02:29.850 --> 00:02:30.850
Wi-Fi?

46
00:02:32.950 --> 00:02:35.050
It now knows that it needs to do

47
00:02:35.050 --> 00:02:35.370
that.

48
00:02:38.130 --> 00:02:44.610
But if we say hello, it also searches

49
00:02:44.610 --> 00:02:47.930
now because it's now been told that it

50
00:02:47.930 --> 00:02:49.190
always needs to use the tool.

51
00:02:51.290 --> 00:02:53.110
So the best way here is to begin

52
00:02:53.110 --> 00:02:55.890
to move up to models that are

53
00:02:55.890 --> 00:02:58.750
intelligent enough to follow these rules.

54
00:02:59.930 --> 00:03:01.890
So for example, GPT-4.1

55
00:03:01.890 --> 00:03:08.070
is better at calling the tools and understanding

56
00:03:08.070 --> 00:03:10.270
when to call them and when to not.

57
00:03:10.270 --> 00:03:13.730
So what is the Wi-Fi?

58
00:03:15.650 --> 00:03:17.350
Where the other one just asked what

59
00:03:17.350 --> 00:03:20.210
Wi-Fi is, or just gave an explanation of

60
00:03:20.210 --> 00:03:22.170
what Wi-Fi is in general.

61
00:03:23.290 --> 00:03:26.450
This one is intelligent enough to know, hey,

62
00:03:26.910 --> 00:03:29.130
they're talking about what is the Wi-Fi

63
00:03:29.130 --> 00:03:30.970
at the office or what is the

64
00:03:30.970 --> 00:03:33.050
Wi-Fi in general, I probably need to search.

65
00:03:34.850 --> 00:03:36.690
So it's all a balancing act.

66
00:03:37.390 --> 00:03:39.370
If you put the search up front like we

67
00:03:39.370 --> 00:03:42.510
did in the previous one, you are assured

68
00:03:42.510 --> 00:03:45.550
that it will always search no matter what.

69
00:03:46.330 --> 00:03:48.970
And in real life scenarios, I do that

70
00:03:48.970 --> 00:03:52.470
in many cases where I know it's not

71
00:03:52.470 --> 00:03:54.450
too expensive to do the search.

72
00:03:55.290 --> 00:03:57.770
But if I know that it's expensive to

73
00:03:57.770 --> 00:03:59.510
do the search, I tend to put it

74
00:03:59.510 --> 00:04:00.510
out in the tool.

75
00:04:01.830 --> 00:04:05.630
Again, also if the real life scenario is

76
00:04:05.630 --> 00:04:12.970
that sometimes this chatbot can actually answer all

77
00:04:12.970 --> 00:04:17.149
kinds of other questions, not just about

78
00:04:17.149 --> 00:04:20.089
the internal knowledge base, but also some support

79
00:04:20.089 --> 00:04:23.410
cases or customer information and so on, then

80
00:04:23.410 --> 00:04:25.350
it's better to have it as a tool and

81
00:04:25.350 --> 00:04:28.350
let the AI choose, oh, this is an

82
00:04:28.350 --> 00:04:30.650
internal question about knowledge.

83
00:04:31.110 --> 00:04:33.190
Let me run it.

84
00:04:33.750 --> 00:04:37.270
Oh, it's not about that, then let's not

85
00:04:37.270 --> 00:04:38.050
call the tool.

86
00:04:39.030 --> 00:04:40.690
But again, we need to go up in

87
00:04:40.690 --> 00:04:43.130
models in order for it to be a

88
00:04:43.130 --> 00:04:43.930
bit more reliable.

89
00:04:46.090 --> 00:04:49.870
So that is actually everything there is to

90
00:04:49.870 --> 00:04:51.410
making this into a tool.

91
00:04:51.410 --> 00:04:55.710
So nothing special in terms of tool calling,

92
00:04:55.990 --> 00:04:58.290
just like we saw in the section about

93
00:04:58.290 --> 00:04:58.610
that.

94
00:05:00.080 --> 00:05:02.470
The only difference is that now we're just

95
00:05:02.470 --> 00:05:06.510
using RAG inside the tool to get the

96
00:05:06.510 --> 00:05:06.810
data.

97
00:05:07.410 --> 00:05:10.110
So that's everything.
