WEBVTT

1
00:00:00.000 --> 00:00:02.520
Welcome back!

2
00:00:02.520 --> 00:00:08.240
In this video, we'll take it up a gear and look at how we can combine models together.

3
00:00:08.240 --> 00:00:08.240


4
00:00:08.240 --> 00:00:08.240


5
00:00:08.240 --> 00:00:13.320
So far, when we've been using functionality from the OpenAI API, we've

6
00:00:13.320 --> 00:00:17.960
completed a task by passing an input to the model and receiving an output.

7
00:00:17.960 --> 00:00:17.960


8
00:00:17.960 --> 00:00:18.760


9
00:00:18.760 --> 00:00:25.800
This got us pretty far, from simple question answering to multi-turn conversations and audio translation.

10
00:00:25.800 --> 00:00:25.800


11
00:00:25.800 --> 00:00:25.840


12
00:00:25.840 --> 00:00:29.200
But what if we could do more with the model's output.

13
00:00:29.200 --> 00:00:29.200


14
00:00:29.200 --> 00:00:29.240


15
00:00:29.240 --> 00:00:31.280
Enter model chaining.

16
00:00:31.280 --> 00:00:38.520
Chaining is when models are combined by feeding the output from one model directly into another model as an input.

17
00:00:38.520 --> 00:00:38.520


18
00:00:38.520 --> 00:00:39.600


19
00:00:39.600 --> 00:00:45.160
We can chain multiple calls to the same model together or use different models.

20
00:00:45.160 --> 00:00:45.200


21
00:00:45.200 --> 00:00:45.200


22
00:00:45.200 --> 00:00:49.280
If we chain two text models together, we can ask the model to perform a

23
00:00:49.280 --> 00:00:55.120
task in one call to the API and send it back with an additional instruction.

24
00:00:55.120 --> 00:01:01.360
For example, let's say we draft an email template for a customer, and we want to validate that the template

25
00:01:01.360 --> 00:01:04.360
follows some guidelines, we can send it

26
00:01:04.360 --> 00:01:08.160
back to the model with an instruction to check whether the guidelines are followed.

27
00:01:08.160 --> 00:01:09.080


28
00:01:09.080 --> 00:01:09.080


29
00:01:09.080 --> 00:01:14.160
We can also combine two different types of models, say the Whisper model into a text

30
00:01:14.160 --> 00:01:19.840
model, to perform tasks like summarizing discussion points and next steps from a meeting recording.

31
00:01:19.840 --> 00:01:19.840


32
00:01:19.840 --> 00:01:20.920


33
00:01:20.920 --> 00:01:24.440
Let's have a go at chaining Whisper with a chat model.

34
00:01:24.440 --> 00:01:24.440


35
00:01:24.440 --> 00:01:24.440


36
00:01:24.440 --> 00:01:27.320
Let's use Whisper to extract the attendees from a meeting

37
00:01:27.320 --> 00:01:32.040
recording, where we know it starts with introductions from each of the attendees.

38
00:01:32.040 --> 00:01:32.040


39
00:01:32.040 --> 00:01:32.040


40
00:01:32.040 --> 00:01:37.400
We start by opening the audio file and assigning it audio_file.

41
00:01:37.400 --> 00:01:37.400


42
00:01:37.400 --> 00:01:37.400


43
00:01:37.400 --> 00:01:43.320
Next, we send the audio to the Whisper model and request a transcript with the transcribe method.

44
00:01:43.320 --> 00:01:50.280
To extract the transcript from the response, we extract the value from the text key.

45
00:01:50.280 --> 00:01:50.280


46
00:01:50.280 --> 00:01:50.280


47
00:01:50.280 --> 00:01:55.120
Now that we have the meeting transcript, we can use it to create a prompt for the chat model.

48
00:01:55.120 --> 00:01:55.120


49
00:01:55.120 --> 00:01:55.120


50
00:01:55.120 --> 00:02:00.000
The prompt starts with an instruction to extract the attendee names

51
00:02:00.000 --> 00:02:04.440
from the start of the transcript, then we append the transcript to the end.

52
00:02:04.440 --> 00:02:05.560


53
00:02:05.560 --> 00:02:05.560


54
00:02:05.560 --> 00:02:08.880
We're now ready to send the prompt to the chat model!

55
00:02:08.880 --> 00:02:14.000
We create a request to the ChatCompletion endpoint using the create method.

56
00:02:14.000 --> 00:02:19.960
Inside, we specify the model to use and the messages to send, which is just the prompt in this case.

57
00:02:19.960 --> 00:02:19.960


58
00:02:19.960 --> 00:02:19.960


59
00:02:19.960 --> 00:02:26.800
Finally, we extract the response from the chat model by digging into the nested JSON response.

60
00:02:26.800 --> 00:02:26.800


61
00:02:26.800 --> 00:02:26.800


62
00:02:26.800 --> 00:02:28.400
And there we have it!

63
00:02:28.400 --> 00:02:28.400


64
00:02:28.400 --> 00:02:28.400


65
00:02:28.400 --> 00:02:32.280
As with all of these models, there's no guarantee that the

66
00:02:32.280 --> 00:02:40.720
models will be 100% accurate - Whisper could make a mistake in the transcript or the chat model may summarize incorrectly.

67
00:02:40.720 --> 00:02:44.920
It's important that any application of these models is well-tested,

68
00:02:44.920 --> 00:02:49.400
iterating on the prompts if necessary, to understand its performance thresholds.

69
00:02:49.400 --> 00:02:55.480
Additionally, usage should be restricted to only non-sensitive data, as we may

70
00:02:55.480 --> 00:03:00.640
risk breaching data governance laws by unjustly exposing employee or customer data.

71
00:03:00.640 --> 00:03:00.640


72
00:03:00.640 --> 00:03:00.640


73
00:03:00.640 --> 00:03:04.680
Over to the final exercises!