WEBVTT

00:01.170 --> 00:02.220
Instructor: Okay, let's learn about

00:02.220 --> 00:04.350
progressive summarization.

00:04.350 --> 00:08.400
So, the idea behind this is that we want to be able

00:08.400 --> 00:09.480
to take large documents

00:09.480 --> 00:11.370
and make them smaller, summarize them,

00:11.370 --> 00:15.810
but in a way that can handle an arbitrary size of document.

00:15.810 --> 00:18.480
So we summarize a bunch of different documents

00:18.480 --> 00:20.670
and then summarize the summaries as well.

00:20.670 --> 00:23.910
So first, let's install OpenAI.

00:23.910 --> 00:27.090
We're just installing the package here.

00:27.090 --> 00:29.430
And then we want to get our API key,

00:29.430 --> 00:32.850
so this is just to get the OpenAI client

00:32.850 --> 00:34.950
and then we'll do a secret key,

00:34.950 --> 00:37.380
and then we'll just, once we get that secret key,

00:37.380 --> 00:40.080
we'll pass it in to the OpenAI client.

00:40.080 --> 00:42.720
And then we're gonna put in the model as well.

00:42.720 --> 00:46.800
Lemme just grab a secret key,

00:46.800 --> 00:50.557
and I'm just gonna run this, and you see it pops up here,

00:50.557 --> 00:52.950
"Enter your OpenAI secret key."

00:52.950 --> 00:53.850
Just gonna paste it in there,

00:53.850 --> 00:55.893
and now it's saved locally.

00:57.360 --> 00:59.880
And in order to get to that, by the way,

00:59.880 --> 01:00.930
you can just go here.

01:03.030 --> 01:07.650
All right, now, create a function to call OpenAI.

01:07.650 --> 01:10.050
So the way this works is typically

01:10.050 --> 01:13.110
you wanna have something like submit_prompt

01:13.110 --> 01:15.420
or get completion, whatever it is.

01:15.420 --> 01:17.733
And then you're gonna get a response,

01:18.900 --> 01:23.073
and we're gonna use the responses API, it's the newest API.

01:24.300 --> 01:27.240
And so, instead of the completions API we used to do,

01:27.240 --> 01:28.950
so we need to pass in the model.

01:28.950 --> 01:32.010
And then it's like input instead of messages,

01:32.010 --> 01:33.480
although it's still backwards compatible,

01:33.480 --> 01:35.370
you could use input.

01:35.370 --> 01:38.040
And we're just pasting in here the role of user

01:38.040 --> 01:38.970
and then the content,

01:38.970 --> 01:41.850
which is the actual prompt that we wanna put in here.

01:41.850 --> 01:44.000
And then I'm gonna say...

01:45.390 --> 01:46.223
Oops, sorry,

01:47.430 --> 01:49.410
text equals format text.

01:49.410 --> 01:52.770
So that's like the format we want back.

01:52.770 --> 01:55.500
Temperature, max_output_tokens,

01:55.500 --> 01:57.630
and then store false, of course.

01:57.630 --> 02:00.183
So that's gonna be what we actually,

02:01.639 --> 02:03.570
that's going to determine whether OpenAI stores it

02:03.570 --> 02:06.720
on their site or whether we need to pull that back out,

02:06.720 --> 02:08.580
like to get the next response.

02:08.580 --> 02:09.960
But we're not gonna do that.

02:09.960 --> 02:12.240
That's the type of function we need.

02:12.240 --> 02:15.210
And then we're just gonna print out submit_prompt,

02:15.210 --> 02:16.980
say, "Hey GPT are you ready?"

02:16.980 --> 02:20.010
Let's see if it comes back or something.

02:20.010 --> 02:23.250
We're not getting anything here, so that's because

02:23.250 --> 02:25.740
we didn't get the response.

02:25.740 --> 02:28.830
So we're just gonna get the response output

02:28.830 --> 02:30.330
and the content and then the text.

02:30.330 --> 02:33.243
So that's how you get it back from the responses API.

02:33.243 --> 02:35.287
Here we go. It's working.

02:35.287 --> 02:36.270
"Absolutely! I'm ready.

02:36.270 --> 02:37.707
How can I assist you today?"

02:38.580 --> 02:42.600
All right, we wanna create a prompt for summarization.

02:42.600 --> 02:44.160
I'm just gonna use this one here from this

02:44.160 --> 02:46.950
Contently piece, right?

02:46.950 --> 02:49.560
So the prompt is "Summarize this for an eighth grade student

02:49.560 --> 02:51.510
as a tweet of 280 characters."

02:51.510 --> 02:53.550
And then I've added this section here.

02:53.550 --> 02:55.380
No hashtags, no emojis, no links.

02:55.380 --> 02:57.330
Because sometimes when you say as a tweet,

02:57.330 --> 02:59.520
it puts like hashtags and stuff in there,

02:59.520 --> 03:00.540
which is really annoying.

03:00.540 --> 03:03.660
And then you'd put your text here. So that's the prompt.

03:03.660 --> 03:06.360
And then the text gets fed in.

03:06.360 --> 03:10.440
So what we're using here is the format response.

03:10.440 --> 03:14.310
And this is basically looking for anything

03:14.310 --> 03:16.980
in here in curly brackets and then filling it in.

03:16.980 --> 03:19.560
So we've said that the variable is text,

03:19.560 --> 03:21.990
and so it's filling in that text variable here.

03:21.990 --> 03:24.780
So whatever prompt, whatever text we put into that prompt,

03:24.780 --> 03:27.003
is gonna go into the template.

03:28.140 --> 03:30.863
All right, we've got a bunch of different articles here.

03:32.130 --> 03:36.390
These are the different articles that I wanted to summarize.

03:36.390 --> 03:38.250
And there's a bunch of different text.

03:38.250 --> 03:39.850
So I'm just gonna paste this in.

03:43.110 --> 03:45.753
So I just went through and manually got this text,

03:47.370 --> 03:48.930
but you could also write something

03:48.930 --> 03:50.670
that searches the blog posts

03:50.670 --> 03:52.140
and gets the text for you.

03:52.140 --> 03:54.570
Got these different publishers as well.

03:54.570 --> 03:56.550
And what I wanna understand is the topic

03:56.550 --> 03:59.610
of marketing mix modeling or medium mix modeling,

03:59.610 --> 04:01.980
and that these are a bunch of different posts about that

04:01.980 --> 04:03.810
that are ranking on Google.

04:03.810 --> 04:06.570
So I want to summarize all these different posts

04:06.570 --> 04:08.520
and then summarize the summaries as well

04:08.520 --> 04:12.060
so I can get like an overall understanding of the topic.

04:12.060 --> 04:14.910
All right, so let's get going with it.

04:14.910 --> 04:19.290
So let's just say text equals the first text

04:19.290 --> 04:21.033
from that list.

04:22.380 --> 04:26.973
So our formatted_prompt is gonna be summary_prompt,

04:27.900 --> 04:29.310
and then we're gonna format

04:29.310 --> 04:32.733
and we're gonna say the text equals text.

04:33.630 --> 04:35.430
So that's gonna give us the formatted prompt

04:35.430 --> 04:39.510
with the text from that first article.

04:39.510 --> 04:44.510
And then we're gonna just call our function, submit_prompt,

04:46.440 --> 04:49.173
and we're gonna stick in that formatted prompt.

04:52.380 --> 04:53.213
Cool.

04:58.650 --> 05:00.720
All right. And let's just print the response.

05:00.720 --> 05:02.310
Here we go, "Media Mix Modeling

05:02.310 --> 05:03.630
MMR, helps brands figure out

05:03.630 --> 05:05.160
which ads really boost sales.

05:05.160 --> 05:06.750
Old MMM worked well for TV and radio

05:06.750 --> 05:09.480
but struggles with online ads

05:09.480 --> 05:12.300
because platforms adjust spending," okay, cool.

05:12.300 --> 05:15.750
So that's interesting. It's got a good summary there.

05:15.750 --> 05:17.087
But let's hear from other voices, right?

05:17.087 --> 05:20.070
That's just one of the articles I recast.

05:20.070 --> 05:22.680
I wanna hear from some of the other people in the industry.

05:22.680 --> 05:25.080
So what we're gonna do is we're gonna iterate through

05:25.080 --> 05:27.700
everything, I'm just gonna

05:31.530 --> 05:32.943
say all_responses,

05:35.830 --> 05:39.917
and then I'm gonna say the formatted_prompt is text.

05:41.400 --> 05:42.233
Oh, sorry,

05:42.233 --> 05:46.950
for text in texts, right?

05:46.950 --> 05:48.090
So then we're gonna have,

05:48.090 --> 05:51.720
the formatted prompt is gonna take the text for that texts.

05:51.720 --> 05:54.090
So it's gonna get the first result and then second result

05:54.090 --> 05:55.170
and then third result,

05:55.170 --> 05:57.330
it's gonna get the response each time.

05:57.330 --> 06:00.333
Then it's gonna append all the responses together.

06:01.170 --> 06:02.890
We can also print the response

06:04.560 --> 06:08.913
and then maybe print like a line between them so we can see.

06:09.840 --> 06:11.673
All right, let's just run this.

06:13.050 --> 06:15.180
And you could also run this in parallel as well.

06:15.180 --> 06:17.370
That would be a lot faster, but we just wanna see these

06:17.370 --> 06:19.670
one by one and we're not doing a lot of these.

06:22.770 --> 06:27.770
Okay, so look, we're getting the summaries each time,

06:27.990 --> 06:29.190
Which is really helpful.

06:30.900 --> 06:33.420
This one calls it econometrics, which is interesting.

06:33.420 --> 06:34.530
And then, yeah, it looks like

06:34.530 --> 06:36.210
a lot of these summaries are converging.

06:36.210 --> 06:38.400
It's a really good kind of understanding

06:38.400 --> 06:40.800
of what we've got across all these different things,

06:40.800 --> 06:43.500
but maybe we don't wanna read every individual summary.

06:43.500 --> 06:45.030
Maybe that's too much.

06:45.030 --> 06:47.280
So we can actually go and summarize

06:47.280 --> 06:48.750
the summaries as well here.

06:48.750 --> 06:53.750
If we say for response in all_responses,

06:55.470 --> 06:59.643
should say all_responses is, there we go.

07:01.620 --> 07:03.903
So have all the responses together,

07:05.430 --> 07:10.430
just separate it by two new lines.

07:11.370 --> 07:12.783
Oh, that didn't work.

07:13.830 --> 07:18.360
Okay, so we joined all our responses together here.

07:18.360 --> 07:20.460
And then we want to get a summary of that.

07:21.561 --> 07:24.573
So I'm just gonna submit_prompt.

07:24.573 --> 07:28.293
Here we go, that's a concise summary of everything.

07:30.600 --> 07:32.850
All right. See another example of this.

07:32.850 --> 07:36.660
This is quite often used when you have a larger page

07:36.660 --> 07:38.400
and you wanna summarize a whole article.

07:38.400 --> 07:40.500
So if I paste in here,

07:40.500 --> 07:43.500
this is all the Wikipedia text on marketing mix modeling,

07:43.500 --> 07:45.090
and it's separated by commas.

07:45.090 --> 07:47.117
So we have a bunch of different chunks here.

07:47.117 --> 07:49.350
That's what they're called in the industry.

07:49.350 --> 07:51.900
So it's quite common that if you can't fit the whole webpage

07:51.900 --> 07:56.010
or you can't fit a whole report in the context window

07:56.010 --> 07:59.160
of your LLM, then you can put all of these different chunks

07:59.160 --> 08:03.270
together and summarize them with the same type of approach.

08:03.270 --> 08:06.420
So lemme just try this.

08:06.420 --> 08:07.260
We're gonna say

08:07.260 --> 08:09.100
we're gonna get all the Wikipedia

08:14.160 --> 08:14.993
responses

08:17.367 --> 08:20.517
and we're gonna say for text in wikipedia_texts,

08:26.415 --> 08:28.165
the formatted_prompt,

08:31.350 --> 08:35.670
and then we're gonna summarize everything together.

08:35.670 --> 08:36.503
So,

08:41.473 --> 08:42.720
and then get that together.

08:42.720 --> 08:46.553
And then we're gonna get the formatted_prompt,

08:47.608 --> 08:50.775
and then we're gonna get the response,

08:55.470 --> 08:57.120
and then we're gonna print that response.

08:57.120 --> 09:00.120
So what we're doing here, again, this is what we just did,

09:00.120 --> 09:01.230
except with Wikipedia,

09:01.230 --> 09:03.570
we're getting the summaries for everything.

09:03.570 --> 09:05.770
So it's getting the response for the summary

09:06.720 --> 09:09.780
for each individual piece of the page.

09:09.780 --> 09:12.150
And then it's gonna summarize everything together.

09:12.150 --> 09:13.743
The final end of the page.

09:20.250 --> 09:22.350
So you can see it's printing out each one.

09:25.980 --> 09:27.963
You see it talks about Coke and Pepsi,

09:28.920 --> 09:30.423
had to spend money on ads.

09:32.760 --> 09:36.240
And this is, yeah, this is the final result.

09:36.240 --> 09:39.300
And the reason this is useful is just that

09:39.300 --> 09:43.410
typically you can't always summarize the whole article

09:43.410 --> 09:45.420
or like all the articles in one,

09:45.420 --> 09:46.770
because you can't put all of them

09:46.770 --> 09:48.270
in the context window at once.

09:48.270 --> 09:50.160
But if you use progressive summarization

09:50.160 --> 09:52.050
where you summarize different chunks

09:52.050 --> 09:53.280
or summarize different articles,

09:53.280 --> 09:56.640
and then you summarize the summaries, then you have a way

09:56.640 --> 09:59.163
to arbitrarily get to the final result.