1
00:00:00,180 --> 00:00:03,340
In this section, we'll be developing the image-based

2
00:00:03,340 --> 00:00:05,440
search feature for our travel

3
00:00:05,440 --> 00:00:07,000
companion application.

4
00:00:07,000 --> 00:00:09,840
And in this video, we'll be discussing how

5
00:00:09,840 --> 00:00:12,320
the feature will be implemented on both the

6
00:00:12,320 --> 00:00:15,140
frontend and the backend.

7
00:00:15,140 --> 00:00:17,340
Starting with the user interface, we'll need

8
00:00:17,340 --> 00:00:19,680
an image upload field for the user to

9
00:00:19,680 --> 00:00:22,120
upload an image of a landmark or feature

10
00:00:22,120 --> 00:00:25,540
they want to see in places they would like to visit.

11
00:00:25,540 --> 00:00:28,260
Below the file upload widget, we'll have a

12
00:00:28,260 --> 00:00:30,360
section where the user will be able to preview

13
00:00:30,360 --> 00:00:32,780
the selected image.

14
00:00:32,780 --> 00:00:35,700
Next, we'll have a set of checkboxes for the

15
00:00:35,700 --> 00:00:38,440
user's preferences, just as we did with the

16
00:00:38,440 --> 00:00:40,040
text search feature.

17
00:00:40,040 --> 00:00:42,440
Users will again be limited to a maximum

18
00:00:42,440 --> 00:00:45,100
of three selected preferences.

19
00:00:45,100 --> 00:00:47,360
Just as with the text search feature, we'll also

20
00:00:47,360 --> 00:00:49,900
be displaying suggested destinations below

21
00:00:49,900 --> 00:00:50,960
the form.

22
00:00:50,960 --> 00:00:53,000
Each suggestion will show the same details we

23
00:00:53,000 --> 00:00:54,960
saw with the text search,

24
00:00:54,960 --> 00:00:57,560
along with a button to check the weather.

25
00:00:57,560 --> 00:01:00,080
Now let's discuss the backend.

26
00:01:00,080 --> 00:01:01,960
On the backend, we'll be receiving an image

27
00:01:01,960 --> 00:01:03,520
from our frontend form

28
00:01:03,520 --> 00:01:05,840
and requesting destination suggestions from

29
00:01:05,840 --> 00:01:07,180
Gemini.

30
00:01:07,180 --> 00:01:10,040
Our backend endpoint will be a POST endpoint,

31
00:01:10,040 --> 00:01:12,360
and we'll be giving it the name

32
00:01:12,360 --> 00:01:15,360
api/suggestByImage.

33
00:01:15,360 --> 00:01:17,200
This will take two parameters:

34
00:01:17,200 --> 00:01:19,520
a file parameter containing the image file data

35
00:01:19,520 --> 00:01:21,780
for the image uploaded by the user,

36
00:01:21,780 --> 00:01:24,420
and a preferences parameter, which is an optional

37
00:01:24,420 --> 00:01:27,700
comma-separated string of preferences.

38
00:01:27,700 --> 00:01:30,140
This endpoint will take this data, assemble

39
00:01:30,140 --> 00:01:31,560
it into a prompt,

40
00:01:31,560 --> 00:01:33,860
and send it over to the Gemini API.

41
00:01:33,860 --> 00:01:36,620
The image file will also be included along

42
00:01:36,620 --> 00:01:37,960
with the prompt.

43
00:01:37,960 --> 00:01:40,240
This is why we have selected Gemini 2.5

44
00:01:40,240 --> 00:01:44,560
Flash, a multimodal model, for our LLM operations.

45
00:01:44,560 --> 00:01:47,220
The API will then return a JSON response with

46
00:01:47,220 --> 00:01:49,120
all the data the frontend needs to

47
00:01:49,120 --> 00:01:50,180
display.

48
00:01:50,180 --> 00:01:52,500
This result will be based on the image sent

49
00:01:52,500 --> 00:01:53,380
to the model

50
00:01:53,380 --> 00:01:55,660
and any preferences the user may have

51
00:01:55,660 --> 00:01:56,960
submitted.

52
00:01:56,960 --> 00:01:58,540
So that's the plan.

53
00:01:58,540 --> 00:01:59,880
In the next video,

54
00:01:59,880 --> 00:02:02,060
we'll start developing this feature by building

55
00:02:02,060 --> 00:02:04,000
the backend endpoint.

