1
00:00:05,180 --> 00:00:11,240
In this lecture, we're going to be talking about the very important join keyword, which allows us

2
00:00:11,240 --> 00:00:14,330
to write SQL queries that span multiple tables.

3
00:00:14,330 --> 00:00:19,220
We can pull different information from multiple tables and write more complex queries.

4
00:00:19,220 --> 00:00:27,320
We have already seen this when we did our view back in our production database, and the join statements

5
00:00:27,320 --> 00:00:33,800
were generated by the view user interface, and we were able to make a nice looking view that had data

6
00:00:33,800 --> 00:00:34,820
from multiple tables.

7
00:00:34,820 --> 00:00:41,300
So the point of this lecture is going to show you how to use the join statement and the different types

8
00:00:41,300 --> 00:00:45,470
of join statements, so that you can write these in your SQL queries.

9
00:00:45,470 --> 00:00:51,110
And as we stated earlier, the strength of a relational database model is in the relationships that

10
00:00:51,110 --> 00:00:53,990
can be constructed between data items in the table.

11
00:00:53,990 --> 00:01:01,280
So here in this diagram for our battery code database, we can see that there is a relationship between

12
00:01:01,280 --> 00:01:03,920
the departments table and the employees table.

13
00:01:03,920 --> 00:01:06,710
And it centers around the department ID.

14
00:01:06,830 --> 00:01:12,350
So in the departments table which is a table that lists all the departments, it has an ID and a name.

15
00:01:12,350 --> 00:01:17,810
And then in employees there is a field that tells what department ID they're in.

16
00:01:17,840 --> 00:01:23,960
We can take advantage of this relationship when we are writing SQL queries to print out the department

17
00:01:23,960 --> 00:01:25,550
name instead of the ID.

18
00:01:25,640 --> 00:01:29,180
But we first have to effectively join these tables.

19
00:01:29,180 --> 00:01:32,150
So let's take a look at how to do that.

20
00:01:32,150 --> 00:01:35,360
So we'll start in our battery code database.

21
00:01:35,360 --> 00:01:40,160
And we're going to look at our employees table here I'll need to select it here.

22
00:01:40,160 --> 00:01:43,280
And I am just going to do select.

23
00:01:43,280 --> 00:01:46,280
And I'm just going to drag out some columns here.

24
00:01:46,280 --> 00:01:55,220
So employee ID first name last name and department ID from.

25
00:01:55,220 --> 00:01:59,840
And then I will drag out the table name and we'll just execute this.

26
00:01:59,840 --> 00:02:05,390
This is just a really simple query showing our employee ID first name, last name and department ID,

27
00:02:05,390 --> 00:02:12,800
and we know that the employees table and the departments table are related on department ID.

28
00:02:12,800 --> 00:02:19,550
So if we go look in our departments, we see that it's just a list of department IDs and names.

29
00:02:19,550 --> 00:02:26,420
So we can take advantage of the relationship to print out the department name instead of just the department

30
00:02:26,420 --> 00:02:28,760
ID, which is just kind of useless.

31
00:02:28,760 --> 00:02:31,110
And so there's a couple of ways to do this.

32
00:02:31,110 --> 00:02:32,610
I'm going to show you both.

33
00:02:32,610 --> 00:02:39,690
So the first way is to use the join keyword to create the relationship.

34
00:02:39,690 --> 00:02:42,300
And I am just going to copy this.

35
00:02:42,300 --> 00:02:47,370
And so we will just put in join which defaults to an inner join.

36
00:02:47,370 --> 00:02:52,470
And we are going to pull out the table that we want to join.

37
00:02:52,470 --> 00:02:59,120
And then we're going to put on, and then we're going to put in the relationships that we need.

38
00:02:59,120 --> 00:03:06,920
So I will do employees dot department ID is equal to dot department ID.

39
00:03:06,920 --> 00:03:09,950
So that is going to create the relationship.

40
00:03:09,950 --> 00:03:18,740
And then what we want to do is just get rid of that and grab our department name and pull it up here.

41
00:03:18,740 --> 00:03:24,770
Now when we run this, we have the department names instead of the department ID.

42
00:03:24,770 --> 00:03:30,050
Now there's a couple other things I want to show you about this query before we kind of move on.

43
00:03:30,050 --> 00:03:32,690
So I am just going to copy this.

44
00:03:32,690 --> 00:03:37,250
And this is going to be using Table alias.

45
00:03:37,250 --> 00:03:40,520
And this is just a way to kind of make things easier.

46
00:03:40,520 --> 00:03:41,840
And you'll see this a lot.

47
00:03:41,840 --> 00:03:46,610
So we've got two tables here, but when you have a large join statement, always putting the table name

48
00:03:46,610 --> 00:03:47,810
can be a little tedious.

49
00:03:47,810 --> 00:03:57,500
So we could say from DBO employees E, which means E is going to be an alias we don't have to put as.

50
00:03:57,500 --> 00:03:59,660
And then we could do departments.

51
00:03:59,660 --> 00:04:08,150
D and then it's easier to just say E dot department id equals d dot department ID and that makes it

52
00:04:08,150 --> 00:04:12,950
a little bit easier to deal with, especially when you have a bigger query.

53
00:04:12,950 --> 00:04:19,370
And the other thing I want to show you is, well, let's say that I just want to also print out our

54
00:04:19,370 --> 00:04:20,960
department ID, okay.

55
00:04:20,960 --> 00:04:26,870
So notice that we got the red squiggly line and the error says it's an ambiguous column.

56
00:04:26,870 --> 00:04:32,570
That's because the department ID exists in both tables, and that's kind of the point of this.

57
00:04:32,570 --> 00:04:35,210
But SQL doesn't know which one we need.

58
00:04:35,210 --> 00:04:38,690
Now we know that it doesn't matter because there are going to be the same.

59
00:04:38,690 --> 00:04:42,860
So what we have to do is define which one.

60
00:04:42,860 --> 00:04:48,710
And I could do e dot department ID or I could do dot department ID it wouldn't matter.

61
00:04:48,710 --> 00:04:55,010
But if you have the same column names you have to define which table it's coming from and it's going

62
00:04:55,010 --> 00:04:55,580
to work.

63
00:04:55,580 --> 00:04:57,530
Now, there's another way to do this statement.

64
00:04:57,530 --> 00:05:02,120
And you may have seen this, or you may see this in the future and it may be confusing.

65
00:05:02,120 --> 00:05:03,560
So I'm just going to show you this.

66
00:05:03,560 --> 00:05:07,130
I've heard this called an equi join or an equivalent join.

67
00:05:07,130 --> 00:05:09,200
And so that's what I'm going to put here.

68
00:05:09,200 --> 00:05:10,850
Equivalent join statement.

69
00:05:10,850 --> 00:05:14,420
And so you don't use the join keyword here.

70
00:05:14,420 --> 00:05:19,850
What you do is you just list your tables And I really like to use aliases.

71
00:05:19,850 --> 00:05:25,790
So we've got select our column names from and then we've got select our tables.

72
00:05:25,790 --> 00:05:33,410
And then in the where statement you just say where the employees department ID equals the department

73
00:05:33,410 --> 00:05:33,920
ID.

74
00:05:34,070 --> 00:05:38,060
And if you do it like this it gives you the exact same results.

75
00:05:38,060 --> 00:05:43,400
When SQL processes this query it just chooses the most optimal path.

76
00:05:43,400 --> 00:05:47,990
So does it matter whether or not you do the join statement or this?

77
00:05:47,990 --> 00:05:52,550
And the answer is it depends on who you're working with or who you're working for.

78
00:05:52,550 --> 00:05:57,560
Some companies want you to use the join keyword and they'll say, well, if you use join, keyword is

79
00:05:57,560 --> 00:05:58,520
easier to read.

80
00:05:58,520 --> 00:06:01,430
It's much clearer what's happening where here you don't.

81
00:06:01,430 --> 00:06:07,520
That can be a valid statement, but it ends up being a matter of preference and a matter of what policies

82
00:06:07,520 --> 00:06:10,280
your company has on programming and things like that.

83
00:06:10,280 --> 00:06:13,550
So these are the two ways to do a joint statement.

84
00:06:13,550 --> 00:06:19,700
Let's do one more quick example here where I'm going to look at our orders table.

85
00:06:19,700 --> 00:06:21,980
So I will do select.

86
00:06:21,980 --> 00:06:24,080
And then here's our orders table.

87
00:06:24,080 --> 00:06:32,000
So let's say that we want the order ID the customer ID and then the order date and employee ID.

88
00:06:32,150 --> 00:06:36,920
I guess we're just going to go ahead and get all this stuff we will do from orders.

89
00:06:36,950 --> 00:06:37,430
Okay.

90
00:06:37,430 --> 00:06:40,580
So if we print this out it's going to be pretty much what you expect.

91
00:06:40,580 --> 00:06:45,560
But now we want to go join our customer ID, our employee ID and our stock ID.

92
00:06:45,560 --> 00:06:48,470
And so we're going to basically do the same thing.

93
00:06:48,470 --> 00:06:50,420
I'm going to use the join statement.

94
00:06:50,420 --> 00:06:56,000
And I think the join statements a lot clearer and a lot cleaner because it keeps the joint statements

95
00:06:56,000 --> 00:06:57,260
out of your Where statement.

96
00:06:57,260 --> 00:06:59,570
So we'll do join.

97
00:06:59,570 --> 00:07:02,750
And our first table is going to be our customers table.

98
00:07:02,750 --> 00:07:05,330
So I'm just going to go grab customers.

99
00:07:05,330 --> 00:07:07,790
And then we do our on keyword.

100
00:07:07,790 --> 00:07:10,580
And then we define the relationship.

101
00:07:10,580 --> 00:07:14,120
And so our relationship is going to be on customer ID.

102
00:07:14,120 --> 00:07:17,330
And let's go ahead and do our aliases.

103
00:07:17,340 --> 00:07:22,320
So orders oh customers see on oh dot.

104
00:07:22,320 --> 00:07:25,710
And notice how the intellisense knows exactly what I'm doing here.

105
00:07:25,710 --> 00:07:30,660
So oh dot customer ID is equal to c dot customer ID.

106
00:07:30,660 --> 00:07:35,730
And then instead of customers ID and I like to go ahead and replace the column name.

107
00:07:35,730 --> 00:07:41,190
As I'm doing this, we'll go to our customers table and I can pick any of these.

108
00:07:41,190 --> 00:07:50,960
So I'll just do first name us and then I'll do a space and then we'll do last name alias here.

109
00:07:50,960 --> 00:07:53,690
And then let's just kind of test this out.

110
00:07:53,690 --> 00:07:56,390
So now we've got our customer name.

111
00:07:56,390 --> 00:07:59,420
And then we're just going to repeat this process.

112
00:07:59,420 --> 00:08:01,100
So we'll do join.

113
00:08:01,100 --> 00:08:03,290
The next one is going to be our employee ID.

114
00:08:03,290 --> 00:08:06,140
So it's going to be our employees table.

115
00:08:06,140 --> 00:08:10,010
And then we can do an E and then we'll do our on keyword.

116
00:08:10,010 --> 00:08:15,950
And then it's going to be O dot employee ID equals e dot employee ID.

117
00:08:15,950 --> 00:08:19,490
Isn't that a lot easier than just pulling out the entire table name.

118
00:08:19,490 --> 00:08:22,190
And then I will replace my employee ID.

119
00:08:22,220 --> 00:08:25,970
They have different nomenclature for first name and last name.

120
00:08:25,970 --> 00:08:30,020
So f name and l name from our orders table.

121
00:08:30,020 --> 00:08:31,760
And then I'm going to do the same thing.

122
00:08:31,760 --> 00:08:36,110
So I'll do first name plus space.

123
00:08:36,110 --> 00:08:41,330
And so notice that I didn't put table designations for first name and last name because I don't have

124
00:08:41,330 --> 00:08:44,870
to because there's no other join tables that has those.

125
00:08:44,870 --> 00:08:49,730
The last thing is a stock ID that is going to be our inventory table.

126
00:08:49,730 --> 00:08:53,870
So I'll just drag this out and it's called I.

127
00:08:53,870 --> 00:08:56,840
And then we will define our relationship.

128
00:08:56,840 --> 00:09:01,610
So it's going to be O dot stock ID is equal to.

129
00:09:01,610 --> 00:09:09,020
And then we can do I dot stock ID and then we'll go to our stock ID go to our inventory table.

130
00:09:09,380 --> 00:09:12,650
And we'll just use the description for that.

131
00:09:12,800 --> 00:09:17,240
And let's highlight this and execute.

132
00:09:17,240 --> 00:09:23,240
And that is a much nicer result set there because we've made all our relationships.

133
00:09:23,240 --> 00:09:27,170
And instead of seeing customer 100 we've got al dente.

134
00:09:27,170 --> 00:09:30,860
And instead of seeing employee name 27, we got Kathleen Poitras.

135
00:09:30,890 --> 00:09:38,330
And suppose if we want to order this by our, say, customer name, I can just drop it in and treat

136
00:09:38,330 --> 00:09:43,430
customer name just like a regular column because it's been built in the query.

137
00:09:43,430 --> 00:09:50,480
And if we do that, it's totally fine and we can order the results by this alias.

138
00:09:50,480 --> 00:09:54,200
So for this last example we're going to get a little more complex.

139
00:09:54,200 --> 00:09:58,100
And we're going to use some concepts from the last few lectures.

140
00:09:58,100 --> 00:10:00,710
Our boss has asked us to create a report.

141
00:10:00,710 --> 00:10:07,280
And the report that she wants is all employees with ID first and last name, and the total number of

142
00:10:07,280 --> 00:10:13,580
orders they are associated with from the orders table, even if they have no orders, in which case

143
00:10:13,580 --> 00:10:14,720
should be zero.

144
00:10:14,720 --> 00:10:20,360
And she wants the report ordered by the employee name sounds pretty straightforward.

145
00:10:20,360 --> 00:10:26,870
We have our employees table kind of opened up, so we'll go ahead and get the initial query in select

146
00:10:26,870 --> 00:10:30,230
and we will do employee ID comma.

147
00:10:30,230 --> 00:10:31,910
We'll do first name.

148
00:10:31,910 --> 00:10:37,070
We'll go ahead and just build the name like we have been.

149
00:10:37,520 --> 00:10:44,570
So there's my first and last name as and this is going to be imp name.

150
00:10:44,570 --> 00:10:52,340
And we'll go ahead and do from employees and just get some initial results here.

151
00:10:52,820 --> 00:10:53,240
Okay.

152
00:10:53,240 --> 00:10:54,500
So that looks good.

153
00:10:54,500 --> 00:10:57,620
So we've got our employee first and last name.

154
00:10:57,620 --> 00:11:00,920
So now we need the total number of orders which means two things.

155
00:11:00,920 --> 00:11:04,760
Number one total number of orders is an aggregate.

156
00:11:04,760 --> 00:11:06,170
So that's going to be count.

157
00:11:06,170 --> 00:11:09,350
Second thing is it's going to come from the orders table.

158
00:11:09,350 --> 00:11:12,080
So we're going to need to join the orders table.

159
00:11:12,080 --> 00:11:16,760
So I'll just scroll up here a little bit and we're going to do join.

160
00:11:16,760 --> 00:11:20,000
We'll grab orders the on keyword.

161
00:11:20,000 --> 00:11:22,880
We'll put our alias here to make this easy.

162
00:11:22,880 --> 00:11:32,000
And the relationship is going to be on the employee ID employee ID is equal to the orders table employee

163
00:11:32,000 --> 00:11:32,480
ID.

164
00:11:32,480 --> 00:11:36,230
And then of course we can't leave this this ambiguous.

165
00:11:36,230 --> 00:11:43,280
So we'll just do E dot employee ID and now we can add our total number of orders.

166
00:11:43,280 --> 00:11:45,800
So we'll do our aggregate function.

167
00:11:45,800 --> 00:11:49,700
We know that it is going to be an aggregate on our orders table.

168
00:11:49,700 --> 00:11:54,290
What we're really just looking for is the order ID.

169
00:11:54,320 --> 00:12:00,200
Notice that the intellisense doesn't work because I'm in this aggregate, and it kind of doesn't know

170
00:12:00,200 --> 00:12:00,980
what to do.

171
00:12:00,980 --> 00:12:08,210
We're just going to put any unique field in here, which is going to be order ID and that should give

172
00:12:08,210 --> 00:12:14,030
us the aggregate and we'll go ahead and alias it and we'll do as number orders.

173
00:12:14,060 --> 00:12:21,950
Now since we're using an aggregate, if you may recall, we can't use just regular column names without

174
00:12:21,950 --> 00:12:22,790
grouping them.

175
00:12:22,790 --> 00:12:28,850
And that is basically going to make our count return a single value for each group.

176
00:12:28,850 --> 00:12:32,000
Right now if we try to run this, we're going to get an error.

177
00:12:32,000 --> 00:12:38,630
And so we have to do group by and we will do our employee ID.

178
00:12:38,990 --> 00:12:45,800
And then can we get away with doing this like we did in our last example.

179
00:12:45,800 --> 00:12:48,020
And the answer is no you can't.

180
00:12:48,020 --> 00:12:51,890
So the group by function is not going to recognize an alias.

181
00:12:51,890 --> 00:12:52,850
And that's okay.

182
00:12:52,850 --> 00:13:00,590
We will just also group them by the last name and then finally by the first name.

183
00:13:00,590 --> 00:13:04,700
And then finally we need to order by the employee name.

184
00:13:05,060 --> 00:13:07,850
And that is this alias right here.

185
00:13:07,850 --> 00:13:11,570
We can do order by employee name.

186
00:13:11,570 --> 00:13:15,890
And if we run this we will get some results back in.

187
00:13:15,890 --> 00:13:16,550
That's great.

188
00:13:16,550 --> 00:13:23,450
But one of the great rules in SQL programming is just because you get results doesn't mean they're right.

189
00:13:23,450 --> 00:13:25,910
We only have 11 things here.

190
00:13:25,910 --> 00:13:30,530
If we go back and look at our employees table, and I'll just do select all rows.

191
00:13:30,530 --> 00:13:32,690
We only have 75 rows here.

192
00:13:32,690 --> 00:13:34,400
So we're missing some.

193
00:13:34,940 --> 00:13:39,980
And so what the problem is this is by default an inner join.

194
00:13:39,980 --> 00:13:42,560
So if I run this results are not going to change.

195
00:13:42,590 --> 00:13:52,130
An inner join will return all rows where there is a matching row with the exact same value, so it is

196
00:13:52,130 --> 00:14:00,260
only returning rows from the order table where it finds a valid employee ID which matches the current

197
00:14:00,260 --> 00:14:00,980
row.

198
00:14:01,010 --> 00:14:08,180
So that means for any employees who do not have any orders, it will not give us results.

199
00:14:08,180 --> 00:14:14,000
And that is not what we want because we wanted the total number of orders, even if they're zero.

200
00:14:14,000 --> 00:14:20,150
And so instead of an inner join, we're going to do a left outer join.

201
00:14:20,300 --> 00:14:27,770
And an outer join basically means we are going to return the results of both tables.

202
00:14:27,770 --> 00:14:30,950
And if there is a match, we'll include that information.

203
00:14:30,950 --> 00:14:37,340
If there's not a match, it will either give you null or it will give you zero if it's numerical.

204
00:14:37,610 --> 00:14:43,910
What the left means is the left table and your primary table.

205
00:14:43,910 --> 00:14:47,750
Your first table in your query is always your left table.

206
00:14:47,750 --> 00:14:51,950
And then what you're joining on is always your right table.

207
00:14:51,950 --> 00:14:59,580
We want all employees, so that means we have to do a left join because we want all records from our

208
00:14:59,580 --> 00:15:00,840
left table.

209
00:15:00,840 --> 00:15:04,770
We'll take a look at what happens if we change this to right a little bit later on.

210
00:15:05,040 --> 00:15:10,320
So now we've told it we want all records and let's run it.

211
00:15:10,320 --> 00:15:12,390
And that looks great.

212
00:15:12,390 --> 00:15:20,870
We are ordered by our employee name and we are including all 75 of our employees, and even the ones

213
00:15:20,870 --> 00:15:26,300
who haven't sold anything have a zero here as the value.

214
00:15:26,300 --> 00:15:32,720
So we have successfully done our report as it was given to us by our boss.

215
00:15:32,750 --> 00:15:40,850
Now let's take a quick look at what we get if we change this to a right outer join and a right outer

216
00:15:40,850 --> 00:15:46,820
join means return all records from the right table, which is orders.

217
00:15:46,850 --> 00:15:49,700
Okay, so we're not going to get all employees.

218
00:15:49,700 --> 00:15:53,750
We're going to get all orders that match this.

219
00:15:53,750 --> 00:16:00,890
And if we run this, it is exactly the same results as if we did an inner join.

220
00:16:00,890 --> 00:16:07,340
And so what's important here is knowing that we have to use left or right when we're doing an outer

221
00:16:07,340 --> 00:16:14,840
join, and that outer join is going to give us zero or null results based on if we want them or not.

222
00:16:15,080 --> 00:16:19,190
And before we finish this lecture, let me show you something that's really cool.

223
00:16:19,190 --> 00:16:27,410
In the second section of this class, I showed you how to create a view, and that is with the views

224
00:16:27,410 --> 00:16:28,070
interface.

225
00:16:28,070 --> 00:16:34,040
So you can go to Create New View and you can use this graphic user interface to create a view.

226
00:16:34,040 --> 00:16:39,470
And what I want to show you to kind of bring everything full circle is I'm going to grab this query

227
00:16:39,470 --> 00:16:39,740
here.

228
00:16:39,740 --> 00:16:42,140
I'm not going to get the order by clause.

229
00:16:42,140 --> 00:16:43,850
And I'll tell you why in a moment.

230
00:16:43,850 --> 00:16:46,370
And we'll just run that so we can see what it looks like.

231
00:16:46,370 --> 00:16:46,970
That's great.

232
00:16:46,970 --> 00:16:48,590
I'm going to right click copy.

233
00:16:48,890 --> 00:16:52,250
And then I am going to go to Views New View.

234
00:16:52,970 --> 00:16:56,330
And I'm not going to add any tables, I'm just going to hit close.

235
00:16:56,420 --> 00:17:02,840
And then in the SQL statement, I am going to paste that in and just click off of that.

236
00:17:02,840 --> 00:17:07,010
And it takes my SQL query and it builds the view.

237
00:17:07,010 --> 00:17:10,100
So that view is built from the SQL query.

238
00:17:10,100 --> 00:17:13,940
That's why you can go in and modify it and you'll see the changes.

239
00:17:13,940 --> 00:17:19,400
And so let's say that this is a view that we really want to use a lot.

240
00:17:19,400 --> 00:17:26,660
I can execute it, make sure I'm getting my results, and then I can just go save this and we'll save

241
00:17:26,660 --> 00:17:32,810
it as V for view employee orders I can hit okay.

242
00:17:32,810 --> 00:17:39,800
And then if I go under my views and right click refresh, I've got this new view and I can right click

243
00:17:39,800 --> 00:17:41,780
select top 1000 rows.

244
00:17:41,780 --> 00:17:47,150
And then the reason why we don't do the order by is because you always do the order by when you're querying

245
00:17:47,150 --> 00:17:47,330
it.

246
00:17:47,330 --> 00:17:50,690
So I could do order by order by.

247
00:17:50,690 --> 00:17:53,180
And then we can do orders okay.

248
00:17:53,180 --> 00:17:55,490
So now it's doing my orders.

249
00:17:55,490 --> 00:18:00,320
We could even do a reverse order here using our descending keyword.

250
00:18:00,320 --> 00:18:02,540
And so that's actually a really nice thing.

251
00:18:02,540 --> 00:18:08,760
So if you pull out some queries that you want to keep reusing, you don't have to keep rewriting that.

252
00:18:08,760 --> 00:18:11,370
You can just save it as a view.

253
00:18:11,430 --> 00:18:15,570
So that's the lecture on inner and outer joins coming up.

254
00:18:15,570 --> 00:18:16,620
We'll have an assignment.

255
00:18:16,620 --> 00:18:22,110
We will test your knowledge on writing some join statements and answering some questions.

256
00:18:22,110 --> 00:18:24,720
So I'll see you in the next assignment.
