WEBVTT

1
00:00:01.160 --> 00:00:04.260
In previous lectures, I have hinted at something

2
00:00:04.260 --> 00:00:07.560
called Structured Output and it's finally time to

3
00:00:07.560 --> 00:00:08.580
learn about it.

4
00:00:09.560 --> 00:00:12.800
The best way to learn about Structured Output

5
00:00:12.800 --> 00:00:15.620
is to see the problem that we will

6
00:00:15.620 --> 00:00:18.540
face and then see how we fix it

7
00:00:18.540 --> 00:00:19.500
with Structured Output.

8
00:00:20.300 --> 00:00:22.480
So in this case, I have a normal

9
00:00:22.480 --> 00:00:25.820
client and a normal agent and this agent

10
00:00:25.820 --> 00:00:27.860
has been given the instructions that it's a movie

11
00:00:27.860 --> 00:00:28.240
expert.

12
00:00:29.120 --> 00:00:31.240
So I want to give it the question,

13
00:00:31.600 --> 00:00:34.860
list the top three best movies according to

14
00:00:34.860 --> 00:00:37.500
IMDb, which is a website for movies, if

15
00:00:37.500 --> 00:00:38.200
you don't know it.

16
00:00:39.460 --> 00:00:42.440
And if we do this normally, we can

17
00:00:42.440 --> 00:00:44.120
certainly get a message back.

18
00:00:50.080 --> 00:00:52.700
And what we got back here in this

19
00:00:52.700 --> 00:00:55.020
case, let's give it a little more room,

20
00:00:55.660 --> 00:00:58.200
is that we got back the Shawshank Redemption.

21
00:00:58.740 --> 00:01:02.060
We got "as of October 23", and we

22
00:01:02.060 --> 00:01:06.120
had some notes about it fluctuating, because

23
00:01:06.120 --> 00:01:07.680
of course it could have changed since

24
00:01:07.680 --> 00:01:09.200
then.

25
00:01:10.500 --> 00:01:12.920
And it chose to give us back a

26
00:01:12.920 --> 00:01:16.140
1-2-3 list in Markdown where it put

27
00:01:16.140 --> 00:01:19.920
the year in here and it gave the

28
00:01:19.920 --> 00:01:21.320
IMDb rating as well.

29
00:01:22.820 --> 00:01:25.340
So this might be good enough for a

30
00:01:25.340 --> 00:01:30.020
chatbot because this is nice, but if

31
00:01:30.020 --> 00:01:31.620
we wanted to make a website where we

32
00:01:31.620 --> 00:01:35.060
listed those three movies and put the rating

33
00:01:35.060 --> 00:01:37.200
in a specific field, the year in a

34
00:01:37.200 --> 00:01:39.540
specific field and so on, we would need

35
00:01:39.540 --> 00:01:41.040
to sit and parse this text.

36
00:01:42.300 --> 00:01:46.120
And we could do that, but then something

37
00:01:46.120 --> 00:01:47.880
like this will happen if we ask again.

38
00:01:52.860 --> 00:01:55.480
We get it in a slightly different manner.

39
00:01:55.900 --> 00:01:59.980
This time it's still using Markdown, still using

40
00:01:59.980 --> 00:02:02.380
1-2-3, but instead of the rating,

41
00:02:03.000 --> 00:02:05.000
it thought it was best to give back

42
00:02:05.000 --> 00:02:07.580
the genre of the movie.

43
00:02:08.280 --> 00:02:11.040
So this is non-deterministic.

44
00:02:11.600 --> 00:02:14.200
Every time we ask, we might get a

45
00:02:14.200 --> 00:02:15.320
different scenario.

46
00:02:16.080 --> 00:02:19.020
Sometimes it will give back rating, sometimes the

47
00:02:19.020 --> 00:02:24.180
genre, sometimes the director, sometimes it's Markdown, sometimes

48
00:02:24.180 --> 00:02:26.780
it's not, sometimes it's a list, sometimes it's

49
00:02:26.780 --> 00:02:27.120
not.

50
00:02:27.760 --> 00:02:32.040
So we would probably be worse off with

51
00:02:32.040 --> 00:02:35.520
an AI trying to parse this because we

52
00:02:35.520 --> 00:02:36.940
can't just live with text.

53
00:02:38.260 --> 00:02:42.180
So for that reason, we have structured output.

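NOTE
A minimal sketch, in Python rather than the lecture's .NET code, of why parsing free-form replies is fragile. The two reply strings are invented samples that only mimic the two formats described above, and the pattern and the name extract_ratings are assumptions made for this illustration.
```python
import re
# Invented sample replies: the first carries ratings, the second carries genres instead.
REPLY_WITH_RATINGS = (
    "1. **The Shawshank Redemption** (1994) - IMDb rating: 9.3\n"
    "2. **The Godfather** (1972) - IMDb rating: 9.2\n"
    "3. **The Dark Knight** (2008) - IMDb rating: 9.0\n"
)
REPLY_WITH_GENRES = (
    "1. **The Shawshank Redemption** (1994) - Genre: Drama\n"
    "2. **The Godfather** (1972) - Genre: Crime\n"
    "3. **The Dark Knight** (2008) - Genre: Action\n"
)
# Hypothetical pattern written against the first format only.
LINE_PATTERN = re.compile(
    r"^\d+\.\s+\*\*(.+?)\*\*\s+\((\d{4})\)\s+-\s+IMDb rating:\s+([\d.]+)$",
    re.MULTILINE,
)
def extract_ratings(reply):
    """Return (title, year, rating) tuples for lines that match the expected format."""
    return [(title, int(year), float(rating)) for title, year, rating in LINE_PATTERN.findall(reply)]
print(extract_ratings(REPLY_WITH_RATINGS))
print(extract_ratings(REPLY_WITH_GENRES))
```
The first call finds three movies; the second returns an empty list, because the reply changed shape and the pattern no longer matches anything.
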
54
00:02:43.040 --> 00:02:49.060
So structured output is essentially just calling RunAsync

55
00:02:49.060 --> 00:02:52.300
like we did before, but telling it we want

56
00:02:52.300 --> 00:02:52.980
it like this.

57
00:02:53.860 --> 00:02:55.920
So what we have here is a movie

58
00:02:55.920 --> 00:02:56.440
result.

59
00:02:57.600 --> 00:03:01.560
This is just a class, and inside it,

60
00:03:01.580 --> 00:03:03.100
it has a list of movies.

61
00:03:04.000 --> 00:03:06.880
And these movies are the title of the

62
00:03:06.880 --> 00:03:09.020
movie, director, the year of release, and the

63
00:03:09.020 --> 00:03:12.240
IMDb score, which none of those replies gave

64
00:03:12.240 --> 00:03:12.640
back in full.

65
00:03:13.380 --> 00:03:14.940
And of course, we could have tried to

66
00:03:14.940 --> 00:03:18.160
write in our description, not only that it's

67
00:03:18.160 --> 00:03:19.120
an expert.

68
00:03:19.500 --> 00:03:21.600
We could have said, we always want to

69
00:03:21.600 --> 00:03:24.560
have this, this, and this information about each

70
00:03:25.080 --> 00:03:26.900
movie, but we would still need to parse

71
00:03:26.900 --> 00:03:27.280
text.

72
00:03:28.700 --> 00:03:32.860
But doing it like this, we will instead,

73
00:03:33.260 --> 00:03:37.000
when we call RunAsync here, get

74
00:03:37.000 --> 00:03:40.220
a response back, which essentially is JSON.

75
00:03:41.080 --> 00:03:45.340
And now we have full control over how

76
00:03:45.340 --> 00:03:49.300
things look because what happened here was that

77
00:03:50.640 --> 00:03:55.260
besides the question we sent, we also sent

78
00:03:55.260 --> 00:03:58.700
a JSON schema on how we want it

79
00:03:58.700 --> 00:03:59.020
back.

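NOTE
A rough Python sketch of the kind of JSON schema that could travel alongside the question. The property names and the build_request helper are hypothetical, chosen to match the fields listed in the lecture. The schema the framework actually generates, and the request format it uses, may look different.
```python
import json
# Hypothetical schema for the movie result shape: a list of movies, each with four fields.
MOVIE_RESULT_SCHEMA = {
    "type": "object",
    "properties": {
        "movies": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "director": {"type": "string"},
                    "year_of_release": {"type": "integer"},
                    "imdb_score": {"type": "number"},
                },
                "required": ["title", "director", "year_of_release", "imdb_score"],
            },
        },
    },
    "required": ["movies"],
}
def build_request(question):
    """Pair the question with the schema (illustrative request shape, not a real API)."""
    return {"question": question, "response_schema": MOVIE_RESULT_SCHEMA}
print(json.dumps(build_request("List the top three best movies according to IMDb."), indent=2))
```
Because the schema is sent with every call, it also accounts for the extra tokens that structured output costs.
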
80
00:03:59.740 --> 00:04:03.980
So Agent Framework is really nice at just

81
00:04:03.980 --> 00:04:09.000
taking this object and turning it into

82
00:04:09.000 --> 00:04:13.840
a schema, having these fields, and saying we

83
00:04:13.840 --> 00:04:14.540
can get it back.

84
00:04:15.160 --> 00:04:16.440
And we get it back as JSON.

85
00:04:16.620 --> 00:04:18.519
We even get it back with a dot

86
00:04:18.519 --> 00:04:21.019
result here because the new response is a

87
00:04:21.019 --> 00:04:24.520
chat client agent response of type movie result.

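NOTE
A minimal Python sketch of what getting the reply back as a typed object amounts to, assuming the reply is JSON that follows the requested shape. The class and field names are assumptions modelled on the lecture's description, and the sample JSON is invented, not real model output.
```python
import json
from dataclasses import dataclass
@dataclass
class Movie:
    title: str
    director: str
    year_of_release: int
    imdb_score: float
@dataclass
class MovieResult:
    movies: list
def parse_movie_result(raw_json):
    """Turn a schema-conforming JSON reply into typed objects."""
    data = json.loads(raw_json)
    return MovieResult(movies=[Movie(**item) for item in data["movies"]])
# Invented sample reply used only to exercise the parser.
RAW_JSON = (
    '{"movies": [{"title": "The Shawshank Redemption", "director": "Frank Darabont", '
    '"year_of_release": 1994, "imdb_score": 9.3}]}'
)
result = parse_movie_result(RAW_JSON)
for movie in result.movies:
    print(f"{movie.title} ({movie.year_of_release}), directed by {movie.director}: {movie.imdb_score}")
```
Each field is now an ordinary attribute, so no text parsing is needed before using the data elsewhere.
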
88
00:04:26.180 --> 00:04:28.480
So now we get our result, so we

89
00:04:28.480 --> 00:04:31.840
can simply inspect it like any other .NET

90
00:04:31.840 --> 00:04:35.320
object, and we get our data back.

91
00:04:36.480 --> 00:04:40.460
In this case, I'm printing it out in

92
00:04:40.460 --> 00:04:44.540
a format here, but of course, we could send

93
00:04:44.540 --> 00:04:48.640
this result back, send it over the wire,

94
00:04:48.800 --> 00:04:52.200
use it in an HTML GUI, or whatever

95
00:04:52.200 --> 00:04:53.440
we need to do.

96
00:04:54.640 --> 00:04:57.880
So we can also see the raw JSON,

97
00:04:58.580 --> 00:05:04.300
and this is the power of structured output

98
00:05:04.300 --> 00:05:09.520
because unless you're doing chatbots, this is what

99
00:05:09.520 --> 00:05:13.160
you want like 95% of the time.

100
00:05:20.060 --> 00:05:23.340
So this is very, very powerful, and we

101
00:05:23.340 --> 00:05:28.440
can use these objects however we want.

102
00:05:29.160 --> 00:05:30.880
It costs a few extra tokens.

103
00:05:31.240 --> 00:05:33.700
I didn't show the token count, but because

104
00:05:33.700 --> 00:05:35.800
we need to send a schema of what

105
00:05:35.800 --> 00:05:40.060
the object looks like, it will cost more

106
00:05:40.060 --> 00:05:40.480
tokens.

107
00:05:40.960 --> 00:05:44.680
It will take a little longer, but it

108
00:05:44.680 --> 00:05:47.240
is definitely worth it, and this is my

109
00:05:47.240 --> 00:05:51.480
favourite feature of any LLM, that we can

110
00:05:51.480 --> 00:05:52.540
do structured outputs.
