1
00:00:02,520 --> 00:00:03,930
So what's the idea behind

2
00:00:03,930 --> 00:00:07,090
NoSQL database systems again?

3
00:00:07,090 --> 00:00:09,220
Well, the idea in the end is that

4
00:00:09,220 --> 00:00:11,960
we can store data without having to focus

5
00:00:11,960 --> 00:00:15,560
on strict schemas or data structures.

6
00:00:15,560 --> 00:00:18,900
We don't have to define the structure

7
00:00:18,900 --> 00:00:20,990
of a table ahead of time.

8
00:00:20,990 --> 00:00:25,010
We don't have to define the number or types of columns

9
00:00:25,010 --> 00:00:29,610
and we don't have to work with dozens or hundreds of tables

10
00:00:29,610 --> 00:00:34,143
to map out relationships between normalized data records.

11
00:00:35,090 --> 00:00:36,500
Instead, the idea is that

12
00:00:36,500 --> 00:00:39,490
we work with so-called collections,

13
00:00:39,490 --> 00:00:43,340
which are a bit like tables, but unlike tables,

14
00:00:43,340 --> 00:00:46,100
they don't have a fixed structure.

15
00:00:46,100 --> 00:00:49,940
Instead, collections are basically data containers,

16
00:00:49,940 --> 00:00:52,030
you could say, and you can have one

17
00:00:52,030 --> 00:00:54,710
or more collections in your database.

18
00:00:54,710 --> 00:00:56,810
And inside of your collections,

19
00:00:56,810 --> 00:00:59,590
you then have so-called documents.

20
00:00:59,590 --> 00:01:04,590
And documents look a bit like JavaScript objects here.

21
00:01:05,190 --> 00:01:10,190
Though I really wanna emphasize that NoSQL and MongoDB,

22
00:01:10,230 --> 00:01:12,940
what we're going to use in this section here

23
00:01:12,940 --> 00:01:16,630
and in this course in general, is not JavaScript,

24
00:01:16,630 --> 00:01:20,630
it's a database engine, it's a database system.

25
00:01:20,630 --> 00:01:25,630
But it has a lot of similarities with objects in JavaScript

26
00:01:25,790 --> 00:01:29,590
and with how we manage data in JavaScript.

27
00:01:29,590 --> 00:01:31,870
But of course the difference is that JavaScript

28
00:01:31,870 --> 00:01:35,560
is a programming language and any data we manage there

29
00:01:35,560 --> 00:01:39,720
is only managed in memory as long as our program is running,

30
00:01:39,720 --> 00:01:43,660
whereas NoSQL databases like MongoDB,

31
00:01:43,660 --> 00:01:47,070
are databases where data is stored in files

32
00:01:47,070 --> 00:01:49,140
so that it persists.

33
00:01:49,140 --> 00:01:51,140
So it might look similar when it comes

34
00:01:51,140 --> 00:01:52,980
to some of the data structures,

35
00:01:52,980 --> 00:01:56,733
but it's not a programming language, it's a database system.

36
00:01:57,650 --> 00:01:59,140
So we have these documents,

37
00:01:59,140 --> 00:02:01,910
which are a bit like JavaScript objects,

38
00:02:01,910 --> 00:02:04,050
which we store in collections.

39
00:02:04,050 --> 00:02:07,220
And these documents are simply pieces of data

40
00:02:07,220 --> 00:02:09,810
that have key-value pairs,

41
00:02:09,810 --> 00:02:13,830
where the values can be strings, numbers

42
00:02:13,830 --> 00:02:17,870
or even nested objects or lists of data,

43
00:02:17,870 --> 00:02:20,523
arrays you would say in JavaScript.

44
00:02:22,230 --> 00:02:25,730
Now, if we take a closer look at collections and documents,

45
00:02:25,730 --> 00:02:26,930
it is worth noting

46
00:02:26,930 --> 00:02:31,030
that collections are a bit like tables, as I already said,

47
00:02:31,030 --> 00:02:34,250
but unlike tables in SQL databases,

48
00:02:34,250 --> 00:02:38,150
you don't work with columns and rows,

49
00:02:38,150 --> 00:02:40,640
instead you really have just a data container

50
00:02:40,640 --> 00:02:44,550
where you can hold any kind of documents inside of it.

51
00:02:44,550 --> 00:02:49,350
So there you really have kind of a broad mixture of data pieces,

52
00:02:49,350 --> 00:02:51,670
data entities that are stored

53
00:02:51,670 --> 00:02:54,180
in one and the same collection.

54
00:02:54,180 --> 00:02:57,620
And one very important thing to understand here

55
00:02:57,620 --> 00:03:00,180
is that the different documents

56
00:03:00,180 --> 00:03:04,070
that may be stored in the same collection don't have

57
00:03:04,070 --> 00:03:05,790
to have the same structure.

58
00:03:05,790 --> 00:03:07,710
They can be totally different.

59
00:03:07,710 --> 00:03:10,630
They just have to be pieces of data,

60
00:03:10,630 --> 00:03:14,430
like JavaScript objects, that have key-value pairs,

61
00:03:14,430 --> 00:03:17,760
but you can have different keys in different documents

62
00:03:17,760 --> 00:03:20,670
that might live in the same collection.

63
00:03:20,670 --> 00:03:23,310
And that gives you a lot of flexibility

64
00:03:23,310 --> 00:03:25,670
because it means that as a developer,

65
00:03:25,670 --> 00:03:28,370
you don't have to plan your entire table

66
00:03:28,370 --> 00:03:31,320
and your data structure ahead of time,

67
00:03:31,320 --> 00:03:35,320
but just as your website might evolve over time,

68
00:03:35,320 --> 00:03:36,790
and at some point in time,

69
00:03:36,790 --> 00:03:40,070
you might decide that you wanna store different pieces

70
00:03:40,070 --> 00:03:42,100
of data than you did in the past,

71
00:03:42,100 --> 00:03:45,830
just as that might happen, with NoSQL databases,

72
00:03:45,830 --> 00:03:47,820
you can simply make those adjustments

73
00:03:47,820 --> 00:03:49,400
in the existing database

74
00:03:49,400 --> 00:03:52,090
and just start storing new documents

75
00:03:52,090 --> 00:03:55,870
with a new structure side-by-side with older documents

76
00:03:55,870 --> 00:03:57,833
that might have a different structure.

77
00:03:58,670 --> 00:04:01,220
So that's a lot of extra flexibility,

78
00:04:01,220 --> 00:04:04,890
which you don't have in SQL databases.

79
00:04:04,890 --> 00:04:07,670
Now because we have that flexibility,

80
00:04:07,670 --> 00:04:11,150
we don't normalize our data and it's not the goal

81
00:04:11,150 --> 00:04:14,920
to split our data across a lot of tables.

82
00:04:14,920 --> 00:04:18,640
Instead, when we deal with NoSQL databases

83
00:04:18,640 --> 00:04:22,230
and we have kind of related data,

84
00:04:22,230 --> 00:04:27,230
then we have different options for storing such relations.

85
00:04:27,900 --> 00:04:30,220
NoSQL does support relations

86
00:04:30,220 --> 00:04:33,263
but we store them differently than we do in SQL.

87
00:04:34,140 --> 00:04:37,440
Instead of splitting it across a lot of different tables,

88
00:04:37,440 --> 00:04:42,440
as we would do it in SQL databases, in NoSQL databases,

89
00:04:42,470 --> 00:04:47,470
we very often store related data together in one document.

90
00:04:47,520 --> 00:04:51,850
So we might have nested documents or nested objects inside

91
00:04:51,850 --> 00:04:54,390
of another document or object.

92
00:04:54,390 --> 00:04:56,860
And it could look something like this.

93
00:04:56,860 --> 00:04:58,760
We might have a books collection,

94
00:04:58,760 --> 00:05:01,080
which has multiple book documents.

95
00:05:01,080 --> 00:05:03,820
Here you only see one on the bottom left,

96
00:05:03,820 --> 00:05:07,690
but that's just because I can't fit more on this slide,

97
00:05:07,690 --> 00:05:09,970
but we can have multiple books in there.

98
00:05:09,970 --> 00:05:12,370
And every book document might look like

99
00:05:12,370 --> 00:05:15,720
the one document you see here at the bottom left.

100
00:05:15,720 --> 00:05:17,950
And we have key-value pairs here.

101
00:05:17,950 --> 00:05:21,900
And as you can see, some values are just strings,

102
00:05:21,900 --> 00:05:25,230
like the ID or the title of the book,

103
00:05:25,230 --> 00:05:29,740
but then the author, for example, is a nested object.

104
00:05:29,740 --> 00:05:32,680
It's an object inside of the other document.

105
00:05:32,680 --> 00:05:35,660
And it stores information about the author here,

106
00:05:35,660 --> 00:05:39,080
for example, which are more key-value pairs,

107
00:05:39,080 --> 00:05:40,723
but in this nested object.

108
00:05:42,030 --> 00:05:44,700
And then we might also have a movies collection

109
00:05:44,700 --> 00:05:47,130
and we might have a relation here as well

110
00:05:47,130 --> 00:05:51,250
because some movies might be based on books.

111
00:05:51,250 --> 00:05:52,390
Now, in such cases,

112
00:05:52,390 --> 00:05:55,570
we might still wanna work with a separate collection

113
00:05:55,570 --> 00:05:58,960
because when you work with NoSQL databases,

114
00:05:58,960 --> 00:06:03,020
you typically store your data such that it best fits

115
00:06:03,020 --> 00:06:05,600
the queries you plan on running.

116
00:06:05,600 --> 00:06:08,830
And it's very likely that you might often fetch

117
00:06:08,830 --> 00:06:11,470
a list of books or a specific book,

118
00:06:11,470 --> 00:06:13,750
and that you often fetch a list of movies

119
00:06:13,750 --> 00:06:15,870
or a specific movie.

120
00:06:15,870 --> 00:06:19,290
But you might less often fetch all the movies

121
00:06:19,290 --> 00:06:21,190
that belong to one book.

122
00:06:21,190 --> 00:06:22,660
That might sometimes happen,

123
00:06:22,660 --> 00:06:26,830
but probably not as often as your other queries.

124
00:06:26,830 --> 00:06:28,600
And therefore for such cases,

125
00:06:28,600 --> 00:06:31,370
you might just store IDs of movies

126
00:06:31,370 --> 00:06:34,040
in your books collection documents,

127
00:06:34,040 --> 00:06:37,410
as you see it here on the bottom left with the movies key,

128
00:06:37,410 --> 00:06:39,480
and then still have the movie documents

129
00:06:39,480 --> 00:06:41,410
in a separate collection.

130
00:06:41,410 --> 00:06:45,400
So you can split related data across different collections,

131
00:06:45,400 --> 00:06:48,060
but often you will also merge it together.

132
00:06:48,060 --> 00:06:48,970
And in the end,

133
00:06:48,970 --> 00:06:52,430
it comes down to how you plan on querying your data

134
00:06:53,330 --> 00:06:56,410
because that's the most important takeaway here.

135
00:06:56,410 --> 00:06:59,750
With NoSQL databases, like MongoDB,

136
00:06:59,750 --> 00:07:02,390
it's all about thinking about the kind

137
00:07:02,390 --> 00:07:05,150
of queries you're going to execute.

138
00:07:05,150 --> 00:07:08,310
What are you going to do with your data?

139
00:07:08,310 --> 00:07:10,640
And that's quite different from SQL.

140
00:07:10,640 --> 00:07:15,200
With SQL, you plan to store your data in a normalized form,

141
00:07:15,200 --> 00:07:17,540
separated across a lot of tables

142
00:07:17,540 --> 00:07:21,120
and it's all about having this fixed structure.

143
00:07:21,120 --> 00:07:24,000
With NoSQL, it's all about flexibility

144
00:07:24,000 --> 00:07:27,280
and you wanna store your data such that your queries,

145
00:07:27,280 --> 00:07:29,330
your commands that you execute,

146
00:07:29,330 --> 00:07:32,900
can be as short and efficient as possible

147
00:07:32,900 --> 00:07:35,070
so that you don't have to do a lot of merging

148
00:07:35,070 --> 00:07:38,030
and so on, which is quite common for SQL

149
00:07:38,030 --> 00:07:40,243
but not the plan for NoSQL.

150
00:07:41,270 --> 00:07:42,960
So therefore with NoSQL,

151
00:07:42,960 --> 00:07:45,430
you really wanna have efficient queries

152
00:07:45,430 --> 00:07:48,560
and therefore data that's frequently queried together

153
00:07:48,560 --> 00:07:49,860
should be stored together.

154
00:07:50,710 --> 00:07:53,970
And if you have data that's not queried together that often,

155
00:07:53,970 --> 00:07:57,070
you might split it across different collections.

156
00:07:57,070 --> 00:07:59,790
And this might sound quite complex

157
00:07:59,790 --> 00:08:03,070
and it might sound like a case where it's easy

158
00:08:03,070 --> 00:08:06,050
to make mistakes and that's kind of correct,

159
00:08:06,050 --> 00:08:07,120
but on the other hand,

160
00:08:07,120 --> 00:08:10,860
it's also simply something that comes with experience.

161
00:08:10,860 --> 00:08:13,210
And it can have a lot of advantages

162
00:08:13,210 --> 00:08:15,550
that you can optimize your collections

163
00:08:15,550 --> 00:08:18,500
and your data for your queries.

164
00:08:18,500 --> 00:08:20,150
Because with SQL,

165
00:08:20,150 --> 00:08:22,180
it is probably easier

166
00:08:22,180 --> 00:08:25,480
to come up with the overall database system

167
00:08:25,480 --> 00:08:28,940
and you don't have to think about the queries ahead of time,

168
00:08:28,940 --> 00:08:31,130
but there, on the other hand,

169
00:08:31,130 --> 00:08:34,730
you pay the price by having less flexibility

170
00:08:34,730 --> 00:08:37,330
and in very large applications,

171
00:08:37,330 --> 00:08:41,210
maybe even some performance issues because your queries can

172
00:08:41,210 --> 00:08:45,290
become very complex there with a lot of combined joins.

173
00:08:45,290 --> 00:08:48,100
That's something which you can avoid more easily

174
00:08:48,100 --> 00:08:49,123
when you use NoSQL.

175
00:08:50,500 --> 00:08:53,790
But ultimately using these database systems correctly

176
00:08:53,790 --> 00:08:56,320
all comes down to simply working with them

177
00:08:56,320 --> 00:08:59,670
and gaining some experience and making mistakes.

178
00:08:59,670 --> 00:09:03,230
That's a key part of learning anything anyways.

179
00:09:03,230 --> 00:09:05,470
And therefore now that's the theory,

180
00:09:05,470 --> 00:09:09,080
which may be a bit intimidating, but we'll get there step by step

181
00:09:09,080 --> 00:09:10,670
throughout this course section

182
00:09:10,670 --> 00:09:12,870
and throughout this course in general,

183
00:09:12,870 --> 00:09:14,010
and therefore now,

184
00:09:14,010 --> 00:09:16,970
let's have a look at the different options we have

185
00:09:16,970 --> 00:09:20,403
when it comes to working with NoSQL databases.

