WEBVTT

0
00:00.410 --> 00:04.850
I want to try to get through this lecture as quickly as possible, because there's a lot I want to cover.

1
00:04.880 --> 00:06.230
I want us to go back.

2
00:06.230 --> 00:11.270
Remember, we had that form with Wally Warthog and he submitted a password of secret1 and it produced

3
00:11.270 --> 00:12.650
this URL.

4
00:12.920 --> 00:17.870
This is where it all began, right when we started thinking about encoding and why we see these strange

5
00:17.870 --> 00:18.710
characters.

6
00:19.350 --> 00:23.490
Well, the question mark, the equals, the ampersands ...

7
00:24.570 --> 00:29.910
these characters are reserved characters and that's why we can see them in the URL.

8
00:29.940 --> 00:31.860
We've been through this in previous lectures.

9
00:31.860 --> 00:32.760
We know this.

10
00:32.880 --> 00:36.060
But what about that plus sign?

11
00:36.210 --> 00:38.010
Why is that there?

12
00:38.370 --> 00:47.580
Well, remember that URLs can't have spaces, aka they are unsafe. And a space is percent-encoded, using its hex value, as

13
00:47.580 --> 00:48.420
%20.

14
00:48.450 --> 00:53.940
So why then does the URL have a + and not a %20?

15
00:54.600 --> 00:57.300
Well, that is a very, very good question.

16
00:58.020 --> 01:04.890
Firstly, you need to understand that the encoding used by your browser by default is based on a very

17
01:04.890 --> 01:12.960
early version of the URI percent encoding rules. And over the years, these have been modified slightly.

18
01:13.730 --> 01:21.890
One of these modifications was using a + symbol to represent spaces instead of %20.

19
01:22.610 --> 01:26.540
But this has caused a lot of debate and a lot of confusion.

20
01:27.140 --> 01:28.220
Let me explain.

21
01:28.250 --> 01:34.550
It turns out how the character is encoded depends on where it is in the URL.

22
01:34.580 --> 01:36.560
I know it sounds weird, right?

23
01:36.590 --> 01:37.510
Let me explain.

24
01:37.520 --> 01:46.970
The old spec, I'm talking HTML 2.0, said that space characters could be encoded as a + in the key-value pairs

25
01:46.970 --> 01:48.290
in a query string.

26
01:48.320 --> 01:50.560
Remember the query string part of a URL?

27
01:50.570 --> 01:52.490
We covered it a few lectures back.

28
01:52.520 --> 01:54.560
It's after that question mark, remember?

29
01:55.390 --> 02:02.290
So this means that only after the question mark, aka inside the query string, can spaces be replaced

30
02:02.290 --> 02:03.340
by a +.

31
02:03.520 --> 02:10.270
But this is according to HTML 2.0, and we know that HTML 2.0 is not the latest spec.

32
02:10.660 --> 02:20.080
So then comes along HTML5 and that spec updated how a URL should be encoded. And how did HTML5 deal with

33
02:20.080 --> 02:20.860
this issue?

34
02:20.950 --> 02:30.610
The use of %20 to encode a space in a URL is explicitly defined in HTML5, but nothing's

35
02:30.610 --> 02:35.440
really mentioned of the +, which causes a lot of confusion.

36
02:35.470 --> 02:44.140
However, the latest specs do continue to define + as legal in the "application/x-www-form-urlencoded"

37
02:44.140 --> 02:52.100
content type. But these latest specs don't explicitly state that query strings should have the +.

38
02:52.120 --> 03:00.380
So they've kind of mentioned the + in other areas of the spec, but not when it comes to URL encoding.

39
03:00.380 --> 03:08.270
It's very, very confusing, but HTML5 does explicitly mention the %20 in the URL encoding.

40
03:08.630 --> 03:14.870
So the burning question is ... should you use a + to replace spaces or %20?

41
03:14.900 --> 03:20.570
If you use both, it just means that Wally Warthog will be encoded differently in the path and the query

42
03:20.570 --> 03:22.920
parts of the URL.

43
03:22.940 --> 03:24.800
Well, let's assume you use both.

44
03:24.800 --> 03:31.760
And this means that Wally Warthog may be encoded differently in the path

45
03:31.760 --> 03:34.070
and query parts of a URL.

46
03:34.370 --> 03:35.300
What do I mean?

47
03:35.330 --> 03:40.730
Well, let's have this URL here and let's include Wally Warthog in two different places in the path

48
03:40.730 --> 03:43.370
of the URL, and also as a query string.

49
03:43.370 --> 03:49.850
So what I'm saying is a space must always be %20 in the actual path.

50
03:50.240 --> 03:53.570
But when it comes to the query string, we've got a +.

51
03:53.570 --> 03:59.740
And remember, the query string is just defined as everything after this question mark.

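NOTE
A minimal JavaScript sketch of the point above, not taken from the lecture itself:
the same name ends up with %20 in the path and with + in the query string.
The host example.com, the "profile" segment and the "name" key are assumptions
made up for illustration only.
```javascript
// Hypothetical example value reused from the lecture's form.
const personName = "Wally Warthog";
// Path segment: encodeURIComponent percent-encodes the space as %20.
const pathSegment = encodeURIComponent(personName);
// Query string: URLSearchParams serializes as application/x-www-form-urlencoded,
// so the space becomes a +.
const query = new URLSearchParams({ name: personName }).toString();
// example.com and /profile are placeholders, not a real endpoint.
const url = `https://example.com/${pathSegment}/profile?${query}`;
console.log(url);
```
Running this in a browser console or in Node should show the two encodings side by side in one URL.
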
52
03:59.740 --> 04:02.800
So if you had a URL like this, is it a problem?

53
04:02.890 --> 04:04.210
Well, no, not really.

54
04:04.210 --> 04:11.050
If the space in your query string is encoded as a + or as a %20, it's going to work just the same.

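NOTE
A small JavaScript sketch of why either form usually works in a query string,
assuming the receiving side parses it as form data (here via URLSearchParams).
The "name" key is a made-up example. Note the caveat in the last lines:
a decoder that only understands percent-escapes leaves the + untouched.
```javascript
// Form-style query parsing treats both + and %20 as a space.
const fromPlus = new URLSearchParams("name=Wally+Warthog").get("name");
const fromPercent = new URLSearchParams("name=Wally%20Warthog").get("name");
console.log(fromPlus === fromPercent); // true, both are "Wally Warthog"
// decodeURIComponent only handles percent-escapes, so a + stays a +.
const rawDecoded = decodeURIComponent("Wally+Warthog");
console.log(rawDecoded); // Wally+Warthog
```
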
55
04:11.050 --> 04:17.320
There's a lot of legacy code which has +'s in the query string part, and there's still a lot of

56
04:17.320 --> 04:20.410
code that generates a + in the query string part.

57
04:20.830 --> 04:26.320
So the odds are that you're going to be breaking nothing by using one over the other.

58
04:26.830 --> 04:27.580
Got it?

59
04:30.560 --> 04:31.070
Great.

60
04:31.070 --> 04:37.760
Bottom line, URLs are technically covered entirely by the latest specs, which explicitly state that

61
04:37.760 --> 04:40.580
spaces ought to be converted to %20.

62
04:40.760 --> 04:45.950
So my recommendation is why don't we just always use %20 where we can?

63
04:45.980 --> 04:47.600
It's just better.

64
04:47.600 --> 04:51.260
It's just conforming to the latest specs.

65
04:51.710 --> 04:52.310
"Okay, Clyde.

66
04:52.340 --> 04:57.890
That's all good and well, but what happens if, you know, my encoding software in the background doesn't

67
04:57.890 --> 04:59.990
convert my space to

68
04:59.990 --> 05:00.740
%20?"

69
05:01.100 --> 05:05.720
As an example, we saw Gerald the Giraffe was converted to pluses.

70
05:07.150 --> 05:14.110
Well, the good news is that JavaScript and other languages all have functions that can be used to URL encode

71
05:14.110 --> 05:14.800
a string.

72
05:15.010 --> 05:25.230
For example, in JavaScript we can use the encodeURIComponent() function, and this encodes all spaces as

73
05:25.240 --> 05:25.930
%20.

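NOTE
A short JavaScript sketch of encodeURIComponent(), offered as an illustration
rather than the exact console demo from the lecture. The input strings other
than "Gerald the Giraffe" are assumptions chosen to show the behaviour.
```javascript
// Spaces become %20, never +.
const encodedName = encodeURIComponent("Gerald the Giraffe");
console.log(encodedName); // Gerald%20the%20Giraffe
// Reserved characters inside a value are escaped as well.
const encodedReserved = encodeURIComponent("a&b=c d");
console.log(encodedReserved); // a%26b%3Dc%20d
// A literal + is escaped as %2B, so it cannot be mistaken for a space.
const encodedPlus = encodeURIComponent("1+1");
console.log(encodedPlus); // 1%2B1
```
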
74
05:25.960 --> 05:29.560
Let me hop over quickly to the console, and show you an example.