WEBVTT

00:00.140 --> 00:00.710
Now.

00:00.710 --> 00:07.850
In the last lesson, we looked at various ways of finding and locating elements on a particular HTML

00:07.850 --> 00:08.780
page.

00:08.810 --> 00:14.240
Now, in this lesson, we're going to put all of that into practice, and you're going to get hold of

00:14.240 --> 00:16.430
all of these upcoming events.

00:16.460 --> 00:22.190
Now, because these events are time dependent, the events that you'll see will, of course, be different

00:22.190 --> 00:23.750
from what I've got here.

00:23.780 --> 00:27.710
But the idea is to get hold of all of these dates.

00:27.710 --> 00:32.450
So all five of these, and then get hold of all five of these names.

00:32.450 --> 00:35.720
And we're going to create a dictionary from these events.

00:35.720 --> 00:41.540
And by the end of the challenge, you should be able to print out a dictionary that's structured like

00:41.540 --> 00:42.170
this.

00:42.200 --> 00:47.480
It's going to contain five items, with keys starting from zero.

00:47.480 --> 00:49.130
So this is the first key.

00:49.130 --> 00:55.640
And then the first value is a dictionary with a key of time and a key of name.

00:55.640 --> 01:01.940
And then the values correspond, of course, to the first date and the first name.

01:01.940 --> 01:06.100
So the first one we've got here is PyCon Japan 2020.

01:06.100 --> 01:10.930
And you can see down here in this section upcoming events.

01:10.930 --> 01:13.300
The first one is this item.

01:13.300 --> 01:19.990
So we're basically converting whatever is here in the upcoming events into this dictionary format.

01:20.110 --> 01:21.730
That is the goal.
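
NOTE
A minimal sketch of the target shape described above, not the instructor's actual output. Only the first name (PyCon Japan 2020) comes from the lesson; every other value is a placeholder.
```python
# Target shape only: the dates and most names are placeholders, not real page data.
events = {
    0: {"time": "<first date>", "name": "PyCon Japan 2020"},
    1: {"time": "<second date>", "name": "<second event name>"},
    # ... and so on, one entry per upcoming event
}
print(events[0]["name"])
```
Each integer key maps to an inner dictionary with exactly two keys, "time" and "name".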

01:21.730 --> 01:25.900
And you're going to be needing everything that you've learned previously.

01:25.900 --> 01:32.740
In addition, it might be worth taking a look at the documentation for locating elements with Selenium.

01:32.740 --> 01:36.580
So I'll link to this page in the course resources as well.

01:36.580 --> 01:43.990
And once you're ready, get inspecting and see if you can complete this challenge and print out this

01:43.990 --> 01:44.980
dictionary.

01:45.010 --> 01:46.450
Pause the video now.

01:52.870 --> 02:01.630
So our goal is to get hold of this piece of data and this piece of data, but just the text. We're going

02:01.660 --> 02:05.640
to have to first figure out how to locate these items.

02:05.640 --> 02:13.770
So if I go ahead and inspect this date, you can see that it's inside an HTML element called time.

02:14.130 --> 02:20.850
Now, in order to get hold of this element time, we can of course use an XPath.

02:20.850 --> 02:26.460
But the problem is that this XPath will be specific for this first item.

02:26.460 --> 02:35.910
While we actually probably want to use a find method that can find all of the dates and all of the event

02:35.940 --> 02:36.630
names.

02:36.630 --> 02:40.650
So we're going to be using one of these multiple element finds.

02:40.650 --> 02:43.320
So where it says find elements.

02:43.560 --> 02:51.060
Now, I think the easiest way here, at least for me, is to think about it in terms of CSS selectors.

02:51.090 --> 03:00.660
This is a time element that lives inside an li, which is inside a ul, but we still haven't found anything

03:00.660 --> 03:04.140
that's unique to this particular structure here.

03:04.140 --> 03:08.500
Because if you take a look over here when we look at the latest news.

03:08.500 --> 03:11.950
It's also a time inside an li inside a ul.

03:11.950 --> 03:14.170
And none of this is unique.

03:14.200 --> 03:18.940
Until we get to this div, where we have a blog-widget.

03:18.940 --> 03:23.620
While here we've got an event-widget.

03:23.620 --> 03:27.160
So there finally is a unique class name.

03:27.220 --> 03:32.710
We're going to use this class name and then find the time element.

03:35.050 --> 03:43.150
Back in our code, let's go ahead and tap into our driver and then use find elements by CSS selector,

03:43.150 --> 03:47.230
which is going to give us a list of elements that match this selector.

03:47.230 --> 03:52.990
And the selector is going to be first find a div with this particular class.

03:52.990 --> 03:56.290
So it has to have the class event-widget.

03:56.290 --> 03:58.570
So we'll write dot event-widget.

03:58.570 --> 04:05.470
And then, after a space, we specify the next thing we want to drill down to, which is a time element.

04:05.470 --> 04:09.330
So let's go ahead and just put the name of the HTML element like this.

04:09.330 --> 04:14.130
And we'll get the event times as a list, hopefully.

04:14.130 --> 04:16.350
So let's go ahead and print this out.

04:16.380 --> 04:17.820
Event times.

04:17.820 --> 04:25.890
And because this is actually going to be a list of Selenium objects rather than the actual text, we'll need

04:25.890 --> 04:29.700
to use a for loop in order to actually see what it is.

04:29.700 --> 04:37.110
So for time in event times, let's go ahead and print each time dot text.
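
NOTE
A minimal sketch of the step above, assuming `driver` is a Selenium WebDriver that has already loaded the page and that the widget's class is `event-widget`. The function name is hypothetical. Older Selenium releases exposed this call as `driver.find_elements_by_css_selector(...)`; newer ones take the locator strategy as the first argument.
```python
# "css selector" is the string value of Selenium's By.CSS_SELECTOR constant,
# written out here so the sketch does not need Selenium imported to be read.
CSS_SELECTOR = "css selector"
def get_event_times(driver):
    """Return the text of every <time> element inside the event widget."""
    # Descendant selector: any <time> nested under an element with class event-widget.
    time_elements = driver.find_elements(CSS_SELECTOR, ".event-widget time")
    # Each match is a Selenium element object; .text gives its visible string.
    return [element.text for element in time_elements]
```
Calling `get_event_times(driver)` and printing each item mirrors the for loop described in the lesson.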

04:40.080 --> 04:47.310
So once Selenium has done its thing, you can see it's now got hold of all five dates.

04:48.510 --> 04:54.150
The next thing we need to do is to get hold of the event names.

04:55.020 --> 04:57.450
And we're going to use a similar method.

04:57.450 --> 05:00.870
So let's go ahead and inspect the name.

05:00.870 --> 05:04.920
And you can see that this is now an anchor tag.

05:04.950 --> 05:12.500
Now you might think that the solution will be just as simple as copying what we had before and replacing

05:12.500 --> 05:16.250
the time element with an anchor tag element.

05:16.250 --> 05:25.160
But you'll see, as I write my for loop, for name in event names, print name dot text.

05:25.340 --> 05:34.370
This does not actually get us what we want, because it also gives us the first anchor tag in that div

05:34.370 --> 05:38.840
with class name event-widget, which is this More link here.

05:38.840 --> 05:44.000
So if we don't want that More link, we're going to have to be a little bit more creative.

05:44.030 --> 05:48.110
This anchor tag is also inside an li.

05:48.140 --> 05:56.870
While that More link is definitely not inside an li, so we can narrow down our selector by saying

05:56.900 --> 06:04.370
okay, so it's inside an element with class event-widget, but then it's inside an li, and then it's

06:04.370 --> 06:05.540
an anchor tag.

06:05.810 --> 06:13.090
Now that I've updated that CSS selector, you can see when I hit print it gets us the actual names of

06:13.090 --> 06:14.770
all of the conferences.
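
NOTE
A minimal sketch of the narrowed selector, under the same assumptions as before: `driver` is a Selenium WebDriver on the page, the class name is `event-widget`, and the function name is hypothetical.
```python
CSS_SELECTOR = "css selector"  # string value of Selenium's By.CSS_SELECTOR
def get_event_names(driver):
    """Return the text of each event link, skipping the widget's More link."""
    # ".event-widget a" would also match the More link, which is not inside an <li>;
    # requiring an <li> ancestor keeps only the links in the event list.
    name_elements = driver.find_elements(CSS_SELECTOR, ".event-widget li a")
    return [element.text for element in name_elements]
```
Each space in the selector means "somewhere inside", so the anchor only has to be a descendant of the li, not a direct child.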

06:14.830 --> 06:20.530
So the final thing to do is to actually create our events dictionary.

06:21.460 --> 06:26.860
You could do this using dictionary comprehension, but I'm going to do it in a slightly more long form

06:26.860 --> 06:32.290
way, just so that it's a little bit easier to understand for anybody who's a little bit confused.

06:32.320 --> 06:35.530
We're going to create a for loop, and I'm going to use n.

06:35.530 --> 06:37.930
So I'm going to say for n in range.

06:37.930 --> 06:43.660
And the range is going to be from zero to the length of event times.

06:43.660 --> 06:47.170
So it's basically going to be a range from 0 to 4.

06:47.290 --> 06:52.150
Once I've got that range, I'm going to add to my events dictionary.

06:52.360 --> 06:55.540
The key of the event is the actual n.

06:55.540 --> 07:04.150
So the number and then the value of the event is a dictionary with the key of time, and also a key

07:04.180 --> 07:07.990
of name like this.

07:09.130 --> 07:17.240
The time is going to be from the event times and then getting hold of the item at index n and the name

07:17.240 --> 07:22.460
is going to be event names and the item at index n.

07:24.230 --> 07:30.710
The final thing we need to fix is that this gets hold of a Selenium object, and we have to get hold of the

07:30.710 --> 07:31.910
actual text.

07:31.910 --> 07:34.040
So let's write dot text.

07:35.000 --> 07:38.270
And now we can print our events dictionary.
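
NOTE
A minimal sketch of the long-form loop described above. The function name is hypothetical, and the two arguments are assumed to be equal-length lists of objects with a `.text` attribute, such as the Selenium elements found earlier.
```python
def build_events(event_times, event_names):
    """Pair times and names by position into {index: {"time": ..., "name": ...}}."""
    events = {}
    for n in range(len(event_times)):
        events[n] = {
            # .text pulls the visible string out of each element object.
            "time": event_times[n].text,
            "name": event_names[n].text,
        }
    return events
```
The dictionary comprehension mentioned in the lesson would be `{n: {"time": event_times[n].text, "name": event_names[n].text} for n in range(len(event_times))}`, which builds the same result in one expression.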

07:40.760 --> 07:45.980
And once that's done you can see this is in the exact format that we wanted.

07:46.010 --> 07:48.560
We've got a dictionary with five items.

07:48.560 --> 07:51.620
Each item is a dictionary in itself.

07:51.650 --> 07:56.000
Time and name of the upcoming Python conferences.

07:56.840 --> 07:59.360
So did you manage to complete that challenge?

07:59.390 --> 08:06.170
If not, it might be worth either reviewing CSS selectors, which we went through in previous lessons,

08:06.170 --> 08:12.800
or reviewing some of the lessons previously where we discussed how to locate and how to get hold of

08:12.800 --> 08:15.950
the text from elements using Selenium.
