WEBVTT

00:00.230 --> 00:04.340
Hey everyone, and welcome to the Knowledge Portal Video series.

00:04.340 --> 00:11.030
So today we'll be speaking about an important topic which basically rules our Internet life behind the

00:11.030 --> 00:11.660
scenes.

00:11.660 --> 00:14.660
And the topic is called caching.

00:14.660 --> 00:19.010
Now, I'm very sure that most of you have heard this word, caching.

00:19.010 --> 00:27.830
And caching proves to be of great importance as far as the HTTP protocol and browsing are concerned.

00:27.860 --> 00:32.510
So let's go ahead and understand the basics about caching.

00:34.160 --> 00:41.240
So, with my favorite example that we take every time, you have a client and you have a web server.

00:41.570 --> 00:44.450
So a client basically sends a request.

00:44.450 --> 00:53.900
So in our case it is a GET request for myc.png, and the web server responds back with the

00:55.140 --> 00:55.890
PNG file.

00:56.370 --> 01:00.930
Now this is a very simple HTTP request and HTTP response.

01:00.960 --> 01:09.000
Now the problem that arises over here is: what happens if the client sends the same GET request ten times,

01:09.000 --> 01:11.380
like when the user keeps refreshing the page?

01:11.400 --> 01:20.190
So this GET request will be sent to the server ten times, and the server will have to process and send

01:20.190 --> 01:22.050
the response ten times as well.

01:23.030 --> 01:30.830
And basically this might not seem to be a big issue, but when you talk about big websites like Facebook,

01:30.860 --> 01:39.200
Amazon or any big websites you can consider, they have like millions of users who visit their website

01:39.200 --> 01:39.830
every day.

01:39.830 --> 01:44.840
And most of us refresh the page a lot. Like, how many of you open Facebook?

01:44.870 --> 01:48.610
Many of you open it like ten times a day, 15 times a day.

01:48.620 --> 01:56.420
And basically what happens is there is certain static content, like your display picture (DP) or the DPs of your

01:56.420 --> 01:59.060
friends, that will not change often.

01:59.060 --> 02:06.890
So the more you open it, the more the same requests will be sent to the web server, and the more the web server

02:06.890 --> 02:09.410
will have to process and send the same response back.

02:09.410 --> 02:12.550
So this is where the problem arises.

02:12.560 --> 02:19.190
And in order to solve this, there needs to be a basic caching mechanism and this is something which

02:19.190 --> 02:20.500
is implemented.

02:20.510 --> 02:30.360
So what happens in a caching mechanism is that there is a middleware server, and this middleware server is responsible

02:30.360 --> 02:31.620
for caching.

02:31.620 --> 02:38.910
So now what happens is the client sends a request. The request

02:39.500 --> 02:44.780
is forwarded to the web server, with a cache in the middle.

02:45.230 --> 02:51.740
Now in the response, the web server responds back to the cache, and the cache

02:53.290 --> 02:57.130
sends the response back to the client.

02:57.250 --> 03:02.380
Now let's look at it in an HTTP way.

03:02.740 --> 03:09.700
So the client sends a request, a GET for myc.png.

03:09.970 --> 03:13.210
The request is received by the cache server.

03:13.540 --> 03:18.640
The cache server will send the same request to the web server.

03:18.670 --> 03:27.910
Now, since this is a GET request for myc.png, the web server will respond back with a file.

03:27.910 --> 03:31.030
So this is the myc.png file.

03:31.480 --> 03:37.510
Now, the cache server has its own storage.

03:38.290 --> 03:41.080
So what will the cache server do?

03:41.110 --> 03:50.530
The cache server will store this file, myc.png, within this storage, and then it will reply back to the

03:50.530 --> 03:52.310
client with the file.

03:53.120 --> 03:54.680
Okay, so.

03:55.660 --> 04:05.050
Now, what would happen if the client decides to refresh the page, or

04:05.050 --> 04:07.330
if there are multiple identical requests?

04:07.360 --> 04:12.830
Now, we already know that the cache server has stored the myc.png file.

04:12.880 --> 04:20.070
In that case, if the user decides to refresh the page, you have a GET for myc.png.

04:20.080 --> 04:23.170
So the user is sending the same request again.

04:23.200 --> 04:25.510
So now what will the cache server do?

04:25.540 --> 04:33.440
The cache server knows that there is already a file called myc.png within its local store.

04:33.460 --> 04:35.200
So the

04:36.460 --> 04:40.070
cache server will respond with the same file.

04:40.090 --> 04:48.610
So in this case, what happened was the cache server did not forward the request to the

04:48.610 --> 04:49.590
web server.

04:49.600 --> 04:52.900
It responded back with the local cache.

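NOTE
Editorial sketch, not part of the spoken lecture. The request flow described above can be
approximated in a few lines of Python. Everything here is an assumption made for illustration:
the CacheServer class, the web_server stand-in function, and the /myc.png path are hypothetical
names, and a plain dictionary stands in for the cache server's storage.
```python
# Hypothetical sketch of the cache server sitting between the client and the web server.
class CacheServer:
    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # callable standing in for the web server
        self.storage = {}          # the cache server's own storage, keyed by request path
        self.origin_requests = 0   # how many requests were actually forwarded
    def get(self, path):
        if path in self.storage:   # repeat request: answer from the local cache
            return self.storage[path]
        self.origin_requests += 1  # first request: forward it to the web server
        body = self.fetch_from_origin(path)
        self.storage[path] = body  # store the file before replying to the client
        return body
def web_server(path):
    # Stand-in for the real web server; returns fake file contents.
    return ("contents of " + path).encode()
cache = CacheServer(web_server)
# The client refreshes the page ten times.
responses = [cache.get("/myc.png") for _ in range(10)]
print(cache.origin_requests)
```
In this sketch only the first of the ten requests reaches the web server; the other nine
are answered from the cache server's storage.
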
04:53.650 --> 05:02.950
Now, this specific local cache can be a hard disk drive, or it can be memory, to speed things

05:02.950 --> 05:03.310
up.

05:03.310 --> 05:06.290
So let's look into how that would work.

05:06.310 --> 05:15.160
Now this local cache can be a dedicated server, which an organization can have, or it can be a software-based

05:15.160 --> 05:16.570
appliance as well.

05:16.570 --> 05:20.260
So most browsers do have a local cache.

05:20.260 --> 05:25.450
So this is mostly a software appliance as far as the browsers are concerned.

05:25.540 --> 05:27.910
So let's look into how that would work.

05:29.930 --> 05:33.500
So I have a Firefox browser.

05:33.500 --> 05:40.460
So if you go to about:cache, this will basically tell you all the caching-related information

05:40.460 --> 05:42.380
which the browser keeps.

05:42.380 --> 05:46.700
So you can consider this as a software appliance of the browser.

05:46.820 --> 05:50.510
Now, there are two important things to remember:

05:51.530 --> 05:57.920
the cache can be stored in memory, and the cache can be stored on disk.

05:57.950 --> 06:03.480
Now, if you store it in memory, then it will definitely be much faster.

06:03.500 --> 06:04.460
However,

06:05.820 --> 06:09.870
disk is also quite fast, but not as fast as memory.

06:09.870 --> 06:17.370
But overall, retrieving files from disk is much, much faster than actually contacting the web

06:17.370 --> 06:19.500
server and fetching the files.

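NOTE
Editorial sketch, not part of the spoken lecture. One way to picture a cache that uses both
memory and disk is a two-tier lookup: check memory first, then disk, and only then treat it
as a miss. The TwoTierCache class and the image.png key are hypothetical, and this is not
how any particular browser implements its cache.
```python
import os
import tempfile
class TwoTierCache:
    def __init__(self, directory):
        self.memory = {}            # fastest tier, lost when the process exits
        self.directory = directory  # slower tier, survives a restart
    def _disk_path(self, key):
        return os.path.join(self.directory, key)
    def put(self, key, data):
        self.memory[key] = data
        with open(self._disk_path(key), "wb") as f:
            f.write(data)
    def get(self, key):
        if key in self.memory:
            return "memory", self.memory[key]
        path = self._disk_path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                data = f.read()
            self.memory[key] = data  # promote to memory for the next lookup
            return "disk", data
        return "miss", None
with tempfile.TemporaryDirectory() as d:
    cache = TwoTierCache(d)
    cache.put("image.png", b"fake image bytes")
    first = cache.get("image.png")
    cache.memory.clear()             # simulate a restart: memory is empty, disk is not
    second = cache.get("image.png")
    third = cache.get("image.png")
    missing = cache.get("other.png")
```
Here the lookup right after the simulated restart is served from disk, and the one after that
comes from memory again. Only a miss would require contacting the web server.
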
06:19.620 --> 06:26.280
Now, if you look into the disk, it basically tells you how many entries are part of the cache.

06:26.310 --> 06:32.940
It also tells you about the storage and the storage location where the cache is stored.

06:33.150 --> 06:40.200
So now, if you go to List Cache Entries over here, it basically tells you the files which

06:40.200 --> 06:41.730
are being cached over here.

06:44.140 --> 06:54.010
Now, along with that, it shows the fetch count, which basically tells you

06:54.010 --> 06:57.100
how many times the item has been fetched.

06:57.130 --> 07:03.370
It talks about the expiry time of that specific cache value.

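NOTE
Editorial sketch, not part of the spoken lecture. The fetch count and expiry time shown for
each cache entry can be modelled roughly as below. The CacheEntry and ExpiringCache classes
are hypothetical, time is passed in explicitly as plain seconds to keep the example
deterministic, and counting the first retrieval as fetch number one is an assumption.
```python
class CacheEntry:
    def __init__(self, data, now, max_age_seconds):
        self.data = data
        self.fetch_count = 1                     # assumption: first retrieval counts as one
        self.expires_at = now + max_age_seconds  # the entry's expiry time
    def is_fresh(self, now):
        return now < self.expires_at
class ExpiringCache:
    def __init__(self, fetch_from_origin, max_age_seconds):
        self.fetch_from_origin = fetch_from_origin  # callable standing in for the web server
        self.max_age_seconds = max_age_seconds
        self.entries = {}
        self.origin_requests = 0
    def get(self, key, now):
        entry = self.entries.get(key)
        if entry is not None and entry.is_fresh(now):
            entry.fetch_count += 1   # served from cache, so bump the fetch count
            return entry.data
        self.origin_requests += 1    # missing or expired: go back to the web server
        data = self.fetch_from_origin(key)
        self.entries[key] = CacheEntry(data, now, self.max_age_seconds)
        return data
cache = ExpiringCache(lambda key: b"image bytes", max_age_seconds=60)
cache.get("image.png", now=0)
cache.get("image.png", now=10)
cache.get("image.png", now=59)
count_before_expiry = cache.entries["image.png"].fetch_count
cache.get("image.png", now=60)       # the entry has expired, so it is fetched again
count_after_expiry = cache.entries["image.png"].fetch_count
```
In this sketch the entry is fetched three times while fresh, and the request at the expiry
time goes back to the web server and starts a new entry.
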
07:03.580 --> 07:10.240
So if you look at this specific entry, it is related to a PNG file,

07:10.330 --> 07:13.870
stars.png, which is 1273 bytes.

07:13.990 --> 07:16.780
It has been fetched two times in total.

07:16.780 --> 07:18.510
And for how long,

07:18.520 --> 07:20.500
how much time, will it be cached?

07:20.590 --> 07:30.310
So the first retrieval time was 24/10/2017, and it has actually been cached for a good amount

07:30.310 --> 07:30.850
of time.

07:30.850 --> 07:31.480
You see,

07:31.900 --> 07:40.150
11/1/2027. That is actually an amazing amount of caching time.

07:40.150 --> 07:48.590
So generally files are cached for like one year, but this seems to be quite good enough.

07:48.590 --> 07:51.110
Anyways, let's do one thing.

07:52.580 --> 07:55.670
What we'll do is open a file.

07:56.030 --> 08:01.220
So, dexterlabs.in/myc.jpg.

08:01.610 --> 08:09.530
So this is basically an image file which is retrieved from dexterlabs.in/myc.jpg.

08:09.680 --> 08:16.910
So now, if I keep refreshing the page, the question is: should the browser send a request to the web server every

08:16.910 --> 08:20.380
time and wait for a response? That would be much slower.

08:20.390 --> 08:27.530
So in this case, what the browser will do is, the browser will basically

08:29.430 --> 08:31.560
cache the image file.

08:32.310 --> 08:38.400
So now what would happen is if you look here, the fetch count is 13.

08:38.400 --> 08:41.160
So this basically means that

08:42.190 --> 08:50.310
this file is already cached, and it has been fetched 13 times in total.

08:50.320 --> 08:56.050
It talks about the last modified time, and it talks about the expiry time,

08:56.050 --> 09:00.280
like when this file will expire from the cache.

09:00.940 --> 09:10.210
So next time, even when you refresh the page, the chances are that it will be retrieved from the browser

09:10.210 --> 09:17.170
cache, and the GET request will not be sent to the server.

09:17.170 --> 09:25.960
So that is the high-level overview of what caching is all about in the HTTP protocol.

09:25.960 --> 09:29.830
Now, one last thing: caching definitely does bring,

09:29.830 --> 09:33.970
if you note, a good number of benefits.

09:33.970 --> 09:40.360
Now, one of the very good benefits is that it reduces the overhead on server resources.

09:40.390 --> 09:47.180
Now, since the browser is retrieving the file from the cache, the server, the web server does not

09:47.180 --> 09:50.930
have to process the response and send it back to the client.

09:50.930 --> 09:54.110
So it basically reduces the overall overhead.

09:54.470 --> 09:57.920
Second, it decreases network bandwidth usage.

09:57.920 --> 10:06.260
Now, if the request is sent to the web server every time a user refreshes the page, then the overall

10:06.260 --> 10:09.260
network bandwidth usage will be tremendously high.

10:09.260 --> 10:18.920
And since the browser now retrieves it from the local cache, the network is not congested. And third, pages

10:18.920 --> 10:23.060
load much faster, since the entries are cached within the

10:24.780 --> 10:27.260
desktop or within the cache server itself.

10:27.270 --> 10:30.870
So these are some of the benefits of caching.

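NOTE
Editorial sketch, not part of the spoken lecture. The bandwidth benefit can be shown with
simple arithmetic. The 50,000-byte file size and the ten refreshes are made-up numbers, and
origin_bytes is a hypothetical helper that assumes the cached copy never expires.
```python
def origin_bytes(file_size, requests, cached):
    # With a cache, only the first request reaches the web server.
    origin_requests = 1 if (cached and requests > 0) else requests
    return file_size * origin_requests
without_cache = origin_bytes(50_000, 10, cached=False)
with_cache = origin_bytes(50_000, 10, cached=True)
saved = without_cache - with_cache
print(without_cache, with_cache, saved)
```
Under these assumptions the web server sends the file once instead of ten times, so nine
transfers' worth of bandwidth is saved.
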
10:30.870 --> 10:38.610
In the upcoming lectures, we will be speaking in great detail about caching and caching-related

10:38.610 --> 10:39.330
headers.

10:40.080 --> 10:42.480
So that's it for this lecture.

10:42.480 --> 10:48.300
If you have any doubts or suggestions, feel free to connect with us on Twitter, Facebook or LinkedIn, or mail

10:48.300 --> 10:50.580
us at instructors at KP Labs.

10:51.030 --> 10:52.320
Thanks for watching.
