WEBVTT

00:05.200 --> 00:06.890
Hi everyone, welcome back.

00:07.080 --> 00:12.240
So we're going to be starting our MongoDB tutorial, and we're going to be discussing all types of

00:12.660 --> 00:15.660
features and tools that we can use in MongoDB.

00:15.660 --> 00:17.790
So let's get started.

00:19.220 --> 00:21.620
So what is MongoDB?

00:21.770 --> 00:25.680
So MongoDB is a cross-platform, document-oriented database.

00:25.680 --> 00:33.200
Right, so if you remember, we saw that a database is just somewhere that you can store data for

00:33.200 --> 00:34.270
a long term period.

00:34.280 --> 00:40.460
So if you remember the API examples, the database is just something that we can put data in for

00:40.460 --> 00:43.280
a very long time, and it usually has records.

00:43.460 --> 00:51.740
And the one that has records is usually called an RDBMS, or sometimes it has other names, but you know,

00:51.760 --> 00:53.600
sometimes it's called an RDBMS.

00:53.780 --> 01:01.670
So MongoDB is a document-oriented database that provides high performance, high

01:01.670 --> 01:04.250
availability and easy scalability.

01:04.250 --> 01:11.460
MongoDB works on the concept of collections and documents, and we'll discuss in the next

01:11.510 --> 01:12.620
few slides.

01:12.930 --> 01:15.610
What is a collection? What is a document?

01:15.890 --> 01:22.200
And we also want to make some analogies, if you're familiar with a normal RDBMS, and we're going

01:22.200 --> 01:29.240
to see some analogies between collections and documents and that. And so a database, as we said, is a physical

01:29.240 --> 01:35.770
container for collections, where each database gets its own set of files on the file system.

01:36.260 --> 01:42.710
A single MongoDB server typically has multiple databases, so if you think about a MongoDB server, you

01:42.710 --> 01:48.710
can draw a big box and then call this one a MongoDB server.

01:48.800 --> 01:51.970
And usually this server might have different databases.

01:52.010 --> 01:56.220
I have database one and database two.

01:56.440 --> 02:03.860
And then inside of each of these databases, if I zoom in over here, if I zoom in on this area over

02:03.860 --> 02:10.550
here, then, oh, this is not very zoomed in, but if I try to zoom in on this database, then it's

02:10.550 --> 02:12.790
made up of two things.

02:12.800 --> 02:16.700
It's made of collections and it's made of documents.

02:16.730 --> 02:22.300
As we said, and we're going to discuss what each of these is in a few minutes.

02:22.340 --> 02:22.970
OK.

02:24.150 --> 02:32.070
Now a collection is a group of MongoDB documents, and it's the equivalent of an RDBMS table.

02:32.190 --> 02:38.560
So a collection exists within a single database. Collections don't enforce

02:38.560 --> 02:42.560
a schema, and documents within a collection can have different fields.

02:42.570 --> 02:47.490
Typically all documents in a collection are of a similar or related purpose. What does this mean?

02:47.670 --> 02:50.840
So we said that we have a MongoDB server.

02:50.880 --> 02:51.410
Right.

02:51.570 --> 02:54.110
OK so let's draw the bigger picture.

02:54.150 --> 03:00.120
We have a MongoDB server, and inside of that, let's say we have only one database.

03:00.120 --> 03:01.830
But usually you can have more than one.

03:01.830 --> 03:08.760
So we have a bunch of databases and then inside this database we might have a bunch of collections so

03:09.000 --> 03:16.500
collection one, collection two, and inside of these collections we might have a bunch of documents, so document

03:16.500 --> 03:19.870
one, document two, document three, document four.

03:20.400 --> 03:26.400
And similarly, in the second collection we might also have a bunch of different documents, so we have

03:26.610 --> 03:31.490
documents, excuse me, so document one, document two, document three and document four.

03:31.530 --> 03:37.410
So this is the big picture. So a collection is just a group of MongoDB documents, and it's the equivalent of

03:37.410 --> 03:39.370
an RDBMS table.

03:39.480 --> 03:40.330
And we'll see.

03:40.380 --> 03:43.820
We'll discuss this in even more detail in a few slides.

03:43.980 --> 03:49.860
So a collection exists within a single database, and collections do not enforce a schema.

03:49.860 --> 03:55.880
So in the normal case, when you have an RDBMS table, you usually have some sort of schema.

03:55.890 --> 04:02.610
So you say, OK, if I have a table in this database, the first one here is going to be the ID.

04:02.730 --> 04:04.910
The second one is going to be something right.

04:04.920 --> 04:07.000
The third one is going to be something else.

04:07.000 --> 04:09.200
And the fourth one is going to be something else.

04:09.230 --> 04:17.250
You usually have a specified schema where all the documents inside, or rather all the rows in

04:17.250 --> 04:21.650
this table must have these properties, right?

04:21.750 --> 04:28.080
Or else you're doing something wrong in the design of the table. But in the case of MongoDB, the

04:28.080 --> 04:30.380
collections do not enforce a schema.

04:30.420 --> 04:38.580
So you can even have some documents here, or rows, which have an ID and then a date of birth, for example, and

04:38.580 --> 04:43.570
then other documents which don't have date of birth but have for example gender.

04:43.680 --> 04:44.090
Right.

04:44.190 --> 04:50.430
So you can have documents that have some of these properties, what we call fields, some

04:50.430 --> 04:55.510
fields and not other fields, and it's fine for them to be in the same collection.

04:55.650 --> 05:02.990
So this is not the case for RDBMS tables, since an RDBMS enforces schemas, right?

05:02.990 --> 05:05.120
The whole idea of SQL is the schema.

05:05.160 --> 05:05.660
Right.

05:05.850 --> 05:07.020
So.

05:07.320 --> 05:12.720
So in the case of collections, you don't have to enforce any schema, and the documents within

05:12.720 --> 05:15.070
a collection can have different fields right.

05:15.280 --> 05:19.850
And so typically all documents in a collection are similar or related.

05:19.890 --> 05:26.280
So even though they might not share all the fields, they have to relate to, or be very similar to,

05:26.280 --> 05:27.370
the same purpose right.

05:27.390 --> 05:31.190
They have to really be related or similar to the same purpose.

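NOTE
Editor's sketch, not part of the spoken lecture. It models a collection as a plain Python list of dicts
so it runs without a MongoDB server. The field names and values are made up for illustration,
loosely following the date-of-birth and gender example mentioned in the captions.
```python
# Hypothetical "customers" collection modeled as a list of dicts (no MongoDB server involved).
customers = [
    {"_id": 1, "name": "Ana", "date_of_birth": "1990-04-12"},
    {"_id": 2, "name": "Ben", "gender": "male"},
]
# Each document carries its own set of fields; nothing forces them to match.
field_sets = [set(doc) for doc in customers]
shared_fields = set.intersection(*field_sets)
all_fields = set.union(*field_sets)
print("shared by every document:", sorted(shared_fields))
print("present in only some:", sorted(all_fields - shared_fields))
```
Both documents serve the same purpose (describing a customer) even though their field sets differ.
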
05:31.230 --> 05:32.090
OK.

05:32.490 --> 05:37.650
OK, so what is a document? A document is a set of key-value pairs.

05:37.830 --> 05:42.630
And really, whenever you read this word document here, think of a JSON file, and you're going to

05:42.630 --> 05:43.410
see why.

05:43.430 --> 05:45.150
Why I mean that in a few seconds.

05:45.150 --> 05:50.250
But whenever you see a document, just instantly think about JSON files.

05:50.250 --> 05:55.920
So a document is a set of key-value pairs, and documents have a dynamic schema.

05:55.980 --> 06:01.650
OK so dynamic schema means that documents in the same collection do not need to have the same set of

06:01.650 --> 06:07.970
fields or structure, and common fields in a collection's documents may hold different types of data.

06:07.980 --> 06:15.600
So an example of a document: it starts with a curly bracket, and then it has a key-value pair, so I'm

06:15.600 --> 06:24.940
going to say key one, and then pair it with a value, so value one, and then a comma.

06:24.940 --> 06:32.590
So if you notice, this is very, very similar to what we discussed when we

06:32.590 --> 06:35.820
were discussing what JSON is. This is very similar to JSON.

06:35.880 --> 06:42.760
In fact, a document has a very, very similar syntax to JSON, and this is why we're using it in these

06:42.760 --> 06:48.970
projects, because most of the time when we're dealing with the web, we're really sending and receiving

06:49.060 --> 06:56.460
lots of JSON, so it will be very, very convenient if I can just put or dump this JSON into my

06:56.460 --> 07:01.270
MongoDB as a document and then retrieve it whenever I want to retrieve it later.

07:01.480 --> 07:06.910
So that's the main idea of why we're using MongoDB: because it's very easy to

07:06.910 --> 07:11.590
just put some files into it, or documents, as we'll call them

07:11.590 --> 07:12.250
from now on.

07:12.340 --> 07:17.930
So put JSON documents inside of it and retrieve JSON right away from it.

07:18.010 --> 07:21.170
So that's the motivation of what a document does.

07:22.740 --> 07:28.570
OK so let's discuss now the relation to a normal database so you might be familiar with normal.

07:28.640 --> 07:34.580
If you've ever played with Access, Microsoft Access for example, you might be familiar with the normal

07:34.610 --> 07:36.990
database the normal concept of a database.

07:37.190 --> 07:44.240
And then this database usually has a bunch of tables, right? So table one, table two, say you have only two

07:44.240 --> 07:50.960
tables, and then each table has a bunch of rows, right, where each row is a new entry.

07:50.960 --> 07:56.330
So each of these tables has new rows, and then each column represents a different field.

07:56.330 --> 07:59.050
So for example this is name right.

07:59.210 --> 08:02.330
And then this is date of birth and so on.

08:02.330 --> 08:05.910
And then here we might have, right, different names.

08:05.990 --> 08:11.950
So also, if you're familiar with normal databases, you might be familiar with table joins

08:12.260 --> 08:14.050
and primary keys.

08:14.120 --> 08:17.570
So in the case of MongoDB this is very different.

08:17.570 --> 08:22.570
So for the database, they're exactly the same, right? You have databases in both of them.

08:22.580 --> 08:22.810
Right.

08:22.830 --> 08:28.100
Because they're just databases. But instead of having tables, you have collections, right?

08:28.100 --> 08:31.360
So a collection is a group of documents, right?

08:31.490 --> 08:35.730
So we know that a table is a group of rows.

08:35.870 --> 08:38.880
So a collection is a group of documents.

08:38.890 --> 08:42.040
Hopefully that makes a good analogy between the two.

08:42.200 --> 08:49.190
So you can just think of it that way. You can rename the two, if you want, to help you understand it

08:50.030 --> 08:51.880
in another way.

08:52.520 --> 08:59.770
Now the columns are called fields, so a column in an RDBMS is actually called a field in

08:59.770 --> 09:04.190
MongoDB, and table joins are actually called embedded documents.

09:04.190 --> 09:09.140
And we're going to see some examples of these embedded documents and how we can query them and all

09:09.140 --> 09:10.210
that stuff.

09:10.490 --> 09:14.660
So the final thing is the primary key, the primary key in a table.

09:14.720 --> 09:19.580
If you remember, in a normal database the primary key is just one of the fields.

09:19.580 --> 09:26.180
Usually it's the first one, so primary key, I'll just name it PK, and then the primary key is a unique

09:26.180 --> 09:27.870
key to each record.

09:28.040 --> 09:32.860
So for example, the easiest type of primary key is just to use numbers, right?

09:32.930 --> 09:38.720
So the first row would have a unique number of one, the second row would have a unique number of

09:38.720 --> 09:40.690
two, right, a primary key of two.

09:40.850 --> 09:43.300
So these numbers are unique for each record.

09:43.340 --> 09:46.080
So you can think about it as a customer ID right.

09:46.160 --> 09:48.520
You can't have two customers.

09:48.560 --> 09:54.760
Right, so you can't have customer one and customer two with the customer ID 123 for both of them, right?

09:54.830 --> 09:56.660
Because they're not the same customers.

09:56.780 --> 10:02.840
So a primary key is just basically something unique to this record that no other record can

10:02.840 --> 10:03.260
have.

10:03.320 --> 10:03.720
Right.

10:03.800 --> 10:08.720
So you can reference this record directly using this primary key over here.

10:09.140 --> 10:15.280
So that's the case of an RDBMS, but in the case of MongoDB we have a field, right?

10:15.290 --> 10:19.710
We have a key called _id, underscore id, right?

10:19.750 --> 10:26.840
Written like this, and this is exactly the primary key, and it's automatically given by MongoDB.

10:26.840 --> 10:33.860
Now you can specify it. You can tell MongoDB, hey, for this document over here, give it this ID,

10:33.890 --> 10:36.070
for example 1 2 3.

10:36.200 --> 10:37.510
Not exactly 1 2 3.

10:37.510 --> 10:39.240
There is a specific format.

10:39.320 --> 10:45.740
You can tell MongoDB. But if you don't tell it, then the document automatically gets a unique ID that

10:45.740 --> 10:52.180
no other record, or in the case of MongoDB no other document, can have.

10:52.430 --> 10:59.480
So each document, you may say, has a different ID, and you can't have two documents with the

10:59.480 --> 11:00.550
same ID.

11:00.560 --> 11:08.150
So hopefully this idea is a bit clearer now. Let's see an example of a document.

11:08.150 --> 11:15.920
So here's an example of a document. We have curly brackets first, and then the first one

11:15.920 --> 11:17.010
is the ID.

11:17.030 --> 11:17.330
Right.

11:17.330 --> 11:18.830
This is the one we discussed.

11:19.070 --> 11:27.860
And then this is usually just a unique descriptor of this object, of this document over

11:27.860 --> 11:28.290
here.

11:28.370 --> 11:32.180
And no other document might have the same ID as this one.

11:33.440 --> 11:38.430
Now you see, we already said that in documents we get to have key-value pairs.

11:38.750 --> 11:40.070
And so this is the case here.

11:40.070 --> 11:49.040
So we have username, which is 123xyz, then contact, and then inside of contact we can put

11:49.040 --> 11:50.450
another document right.

11:50.450 --> 11:54.850
So we call these embedded sub-documents, and inside this one,

11:54.890 --> 12:02.730
So the value of the contact key is actually another document, which has phone and email.

12:02.740 --> 12:06.330
So phone is 1 2 3 4 5 6 7 8 9 0.

12:06.380 --> 12:14.000
And email is xyz@example.com. So this over here, if I can highlight it, this area over

12:14.000 --> 12:18.500
here this whole thing that I just highlighted is actually a document.

12:18.500 --> 12:23.050
So it's called an embedded sub-document because it's inside another document.

12:23.060 --> 12:29.780
So if you remember, this is very similar to JSON. When we discussed JSON files, we said we can embed

12:29.840 --> 12:38.800
a JSON file inside a JSON file, so we can have X and then inside of it another JSON, right?

12:39.030 --> 12:45.170
So this is exactly the same idea, and in MongoDB we call it embedded sub-documents.

12:45.180 --> 12:52.460
Now here's another embedded document. We have access, and then level 5 and group dev.

12:52.470 --> 12:54.960
So this here, again, this here.

12:54.960 --> 13:01.370
And don't forget of course the comma here that separates the elements comma here and comma here too.

13:01.470 --> 13:05.130
Just like JSON. And so, in access:

13:05.280 --> 13:10.900
This object here is an embedded sub-document under the key access.

13:10.930 --> 13:11.540
Right.

13:11.550 --> 13:18.630
So we can describe this document as a document with this ID that has two embedded sub-documents

13:18.870 --> 13:25.580
that correspond to the contact key and the access key.

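NOTE
Editor's sketch, not part of the spoken lecture. It writes the slide's example document as a Python dict
and reads the two embedded sub-documents. The _id value is a placeholder string chosen for illustration;
a real _id would be generated by MongoDB. Only the standard library is used, so no server is needed.
```python
import json
# The example document from the slide, written as a Python dict.
# The _id value is a made-up placeholder, not a real MongoDB ObjectId.
user = {
    "_id": "placeholder-id",
    "username": "123xyz",
    "contact": {"phone": "1234567890", "email": "xyz@example.com"},
    "access": {"level": 5, "group": "dev"},
}
# Embedded sub-documents are reached by chaining keys.
email = user["contact"]["email"]
level = user["access"]["level"]
# Round-trip through JSON text to show the document maps directly onto JSON.
as_json = json.dumps(user)
restored = json.loads(as_json)
print(email, level, restored == user)
```
The round trip works here because every value is a string, number, or nested dict, which JSON supports directly.
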
13:25.580 --> 13:32.870
Now it's worth talking a bit about the _id field, since it's not really discussed a

13:32.870 --> 13:40.650
lot later, but it's one of the most important things about MongoDB, and why it's unique.

13:40.760 --> 13:48.320
So _id is a 12-byte hexadecimal number which assures the uniqueness of every document, as we said.

13:48.550 --> 13:54.210
And so you can provide the _id yourself while inserting the document.

13:54.380 --> 13:59.270
If you don't provide it, then MongoDB provides a unique ID for every document.

13:59.390 --> 14:03.300
So you're allowed to give your own ID right.

14:03.590 --> 14:07.450
On the condition that it hasn't been used in any other document.

14:07.970 --> 14:13.760
But if you don't give it, right, if you just create a new document and you don't provide an ID, then

14:13.760 --> 14:19.600
MongoDB automatically assigns a unique ID to every single document.

14:19.610 --> 14:23.470
Now the ID field is made up of 12 bytes.

14:23.630 --> 14:26.740
The first four bytes are for the current timestamp.

14:26.740 --> 14:37.910
So for example, let's say you created this object at time 12:25:43. Then these first four bytes would

14:37.910 --> 14:43.990
have the current timestamp at which you created this object, right, with the date and so on.

14:44.180 --> 14:50.850
So it would have the current timestamp. Then the next three bytes will be for the machine ID.

14:50.960 --> 14:53.700
So what machine created this document.

14:53.960 --> 15:00.170
And then the next two bytes are for the process ID of the MongoDB server, and then the last three bytes

15:00.170 --> 15:02.910
are just simple incremental values right.

15:03.140 --> 15:07.950
So you take all of these and you merge them together and you get your ID field.

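NOTE
Editor's sketch, not part of the spoken lecture. It splits a 24-character hex ID into the four parts
the captions describe (4-byte timestamp, 3-byte machine ID, 2-byte process ID, 3-byte counter).
The function name split_object_id and the sample ID are hypothetical, built by hand for illustration.
As far as I know, newer MongoDB versions replace the machine and process bytes with a 5-byte random value,
so treat this as the layout described in the lecture rather than a description of every version.
```python
from datetime import datetime, timezone
def split_object_id(hex_id):
    """Split a 24-character hex ID into the four parts described in the lecture."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("expected exactly 12 bytes (24 hex characters)")
    return {
        "created_at": datetime.fromtimestamp(int.from_bytes(raw[0:4], "big"), tz=timezone.utc),
        "machine": raw[4:7].hex(),
        "process": int.from_bytes(raw[7:9], "big"),
        "counter": int.from_bytes(raw[9:12], "big"),
    }
# Hypothetical ID assembled by hand: timestamp + machine + process + counter.
parts = split_object_id("5f000000" + "a1b2c3" + "0d05" + "00002a")
print(parts)
```
Because the timestamp comes first, IDs created later sort after IDs created earlier, to one-second resolution.
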
15:10.410 --> 15:16.690
Now we also need to discuss a few other things, like MongoDB versus relational databases.

15:16.830 --> 15:23.640
So any relational database has a typical schema design that shows a number of tables and the relationships

15:23.640 --> 15:28.770
between these tables, while in MongoDB there is no concept of relationship.

15:28.770 --> 15:35.850
So this is very, very important. In normal databases you might have, for example, two tables, for

15:35.850 --> 15:40.580
example customer and product sold.

15:40.590 --> 15:40.860
Right.

15:40.860 --> 15:48.780
So you might have the two of them, products sold and customers. For customers, you would have a primary key, let's say

15:48.960 --> 15:49.900
just an integer.

15:49.920 --> 15:53.560
So customer one, customer two, customer three, customer four.

15:53.760 --> 15:56.800
And so for example customer 1.

15:57.390 --> 16:00.860
Let's say his name is something, right?

16:00.870 --> 16:02.490
And his last name is something.

16:02.490 --> 16:03.060
And so on.

16:03.060 --> 16:05.240
So these are all fields.

16:05.490 --> 16:06.820
And so in the products

16:06.830 --> 16:13.160
sold table I might have, for example, a primary key for products, right?

16:13.320 --> 16:17.090
For example, this product, and then the purchaser,

16:17.130 --> 16:20.980
or the person who bought it, as another field.

16:21.060 --> 16:27.540
And then this would point to the customer who bought it, as the primary key of some customer. For example,

16:27.540 --> 16:30.440
Customer 3 bought the first product.

16:30.540 --> 16:37.260
So we can point from here, and there's a relationship between this table over here, this products sold, and

16:37.260 --> 16:41.120
the customer and there's a relationship between these two tables.

16:41.160 --> 16:48.380
This is usual in relational databases, but in here there is no concept of relationships.

16:48.390 --> 16:54.740
So if you want to incorporate relationships, you have to use sub-documents.

16:54.750 --> 16:56.390
As we saw in a previous slide.

16:56.400 --> 17:03.390
So in here, for example, you can have this as a second table entry, and then this

17:03.390 --> 17:07.970
is showing that you're associating this key with this entry over here.

17:07.980 --> 17:08.760
Right.

17:09.210 --> 17:15.020
So this is a very important distinction between MongoDB and relational databases.

17:15.030 --> 17:19.720
There is no idea of a relationship in MongoDB.

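NOTE
Editor's sketch, not part of the spoken lecture. It contrasts the two styles the captions describe, using
plain Python structures: a relational-style lookup through a purchaser key, and a document that embeds the
purchaser as a sub-document. All table names, field names, and values are hypothetical.
```python
# Relational style: two hypothetical tables linked by the customer's primary key.
customers = {3: {"first_name": "Sam", "last_name": "Lee"}}
products_sold = [{"product_id": 1, "name": "lamp", "purchaser_id": 3}]
# A manual "join": follow the purchaser_id to the matching customer row.
sale = products_sold[0]
joined = {**sale, "purchaser": customers[sale["purchaser_id"]]}
# Document style: the purchaser is embedded directly as a sub-document.
sale_document = {
    "_id": 1,
    "name": "lamp",
    "purchaser": {"customer_id": 3, "first_name": "Sam", "last_name": "Lee"},
}
print(joined["purchaser"]["first_name"], sale_document["purchaser"]["first_name"])
```
The embedded form answers "who bought this?" from a single document, at the cost of repeating customer data in each sale.
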
17:19.870 --> 17:25.300
Now what are the advantages of MongoDB over RDBMSs in general?

17:25.330 --> 17:28.060
And so there are a few advantages.

17:28.060 --> 17:35.430
The first is, of course, that it's schemaless. So MongoDB is a document database in which one

17:35.440 --> 17:42.640
collection holds different documents and the number of fields content and size of the document can differ

17:42.640 --> 17:45.160
from one document to another.

17:45.160 --> 17:51.310
Now this is not the same in normal tables, as we discussed, because tables are already fixed,

17:51.320 --> 17:51.610
right.

17:51.610 --> 17:54.370
They have to have all these fields over here.

17:54.370 --> 17:56.030
Now some of them might be null.

17:56.050 --> 17:56.360
Right.

17:56.350 --> 18:03.490
If you're familiar with regular databases, some of them might be null, but that

18:03.490 --> 18:08.380
doesn't change the fact that you have to have all these fields in any given record.

18:08.500 --> 18:09.880
But this is not the same.

18:09.940 --> 18:11.730
So this is what we call a schema.

18:11.770 --> 18:12.230
Right.

18:12.340 --> 18:19.840
But this is not the same thing in the case of MongoDB. You can have a collection, which is analogous to

18:19.900 --> 18:25.510
a table, and it can have documents with a different number of fields, content, size,

18:25.600 --> 18:27.800
And you know lots of different stuff.

18:27.930 --> 18:34.280
And so also the structure of a single object is clear, and there are no complex joins.

18:34.720 --> 18:41.260
And it has deep query ability. So MongoDB supports dynamic queries on documents using a document-based

18:41.260 --> 18:45.410
query language that's nearly as powerful as SQL.

18:45.490 --> 18:47.070
So this is very strong.

18:47.150 --> 18:54.610
You would think that, you know, the schemas here are made to make searches faster, but in

18:54.610 --> 19:01.740
fact MongoDB, even though it's schemaless, is actually very, very fast at querying.

19:01.840 --> 19:06.010
In fact, it's almost as fast as SQL.

19:06.160 --> 19:13.230
So this is, you know, big. Also there's tuning, so you can tune MongoDB a lot.

19:13.360 --> 19:18.730
And it's easy to scale out. MongoDB is super, super easy to scale up.

19:18.730 --> 19:22.690
So let's say you're servicing only a bunch of customers today.

19:22.810 --> 19:30.130
Well, if your customers, for example, suddenly multiply by 100, right, so you go

19:30.130 --> 19:31.620
viral for example.

19:31.720 --> 19:34.370
MongoDB is super, super easy to scale.

19:34.390 --> 19:34.630
Right.

19:34.630 --> 19:37.850
So it supports scaling very easily.

19:37.930 --> 19:44.800
And so conversion and mapping of application objects to database objects is not needed, and it uses internal

19:44.800 --> 19:50.320
memory for storing the windowed working set enabling faster access of data.

19:50.320 --> 19:54.080
So this is a very important point too.

19:54.180 --> 19:57.040
Now why should we use MongoDB?

19:57.210 --> 20:03.990
So first of all, it's document-oriented storage, so the data is stored in the form of a JSON-style

20:04.020 --> 20:07.220
document. And then indexed attributes:

20:07.240 --> 20:13.790
There's an index on any attribute. You get replication and high availability, so it doesn't fail all

20:13.810 --> 20:15.090
that much.

20:15.090 --> 20:18.580
You can have auto-sharding, and we'll talk about that later.

20:18.600 --> 20:24.600
We have rich queries. You can really formulate some very, very complex queries.

20:24.720 --> 20:26.960
There are fast in-place updates.

20:26.970 --> 20:28.290
So this is also fast.

20:28.310 --> 20:29.600
Oops let's go back.

20:29.700 --> 20:31.260
So very fast updates.

20:31.410 --> 20:36.020
And then also professional support by MongoDB that is almost 24/7.

20:36.030 --> 20:37.530
So this is pretty cool.

20:39.490 --> 20:44.100
Now let's see. OK, here we go.

20:44.100 --> 20:46.460
So where do you use MongoDB?

20:46.470 --> 20:51.890
So when should we use MongoDB instead of, for example, a SQL database?

20:52.020 --> 20:57.930
So in the case of big data, this is automatic, you know, if you're dealing with big data.

20:57.930 --> 20:59.960
Definitely go for MongoDB.

21:00.120 --> 21:01.370
It's scalable.

21:01.440 --> 21:09.540
It's fast and, you know, it's just as good as any other option out there, if not better.

21:09.540 --> 21:16.100
Also content management and delivery, mobile and social infrastructure, user data management and data.

21:16.110 --> 21:23.550
So if you have any of these, then it will be a very good idea to use MongoDB.

21:23.590 --> 21:25.690
OK so we're going to stop here for now.

21:25.720 --> 21:31.450
And in the next video we're going to see how we can install MongoDB on our virtual machine.

21:31.540 --> 21:34.270
So until the next video, happy coding.
