1
00:00:05,910 --> 00:00:12,330
In this lesson we are going to talk about what multimodal LLM applications are.

2
00:00:17,110 --> 00:00:26,020
So until now we have seen LLM applications solving very interesting problems around text.

3
00:00:26,770 --> 00:00:35,980
But what happens when we have a document that has text, tables, and images?

4
00:00:36,070 --> 00:00:42,040
How can we, for example, create a RAG application with a PDF?

5
00:00:42,610 --> 00:00:48,460
Usually PDFs are text plus images and, in many cases, tables.

6
00:00:48,460 --> 00:00:49,660
Or what happens

7
00:00:49,660 --> 00:00:56,620
if we want to create a RAG application with a PowerPoint presentation? The typical PowerPoint presentation

8
00:00:56,620 --> 00:01:01,060
has a lot of images, tables, and some text.

9
00:01:01,270 --> 00:01:10,660
Or what if we want to use a RAG application with handwritten notes or handwritten graphs, etc.?

10
00:01:13,190 --> 00:01:16,910
For that, we have multimodal LLM applications.

11
00:01:16,910 --> 00:01:27,110
With the new multimodal LLM applications we can manage text, tables, images, and also other elements

12
00:01:27,110 --> 00:01:29,810
as we will see, like audio, for example.

13
00:01:32,870 --> 00:01:42,470
In the next lesson we are going to see which foundation LLM models we will use in order to build

14
00:01:42,800 --> 00:01:44,930
multimodal LLM applications.

15
00:01:44,930 --> 00:01:51,410
Because, as we know, the regular LLM models we are used to

16
00:01:52,870 --> 00:01:59,620
utilizing for our LLM applications are not able to handle images or tables.

17
00:01:59,620 --> 00:02:07,480
So what foundation LLM models are we going to use with multimodal LLM applications?

18
00:02:07,480 --> 00:02:10,180
We will see this in the next lesson.

