1
00:00:00,090 --> 00:00:05,580
We've already talked about how we use a probability mass function to model the probability associated

2
00:00:05,580 --> 00:00:07,980
with a discrete random variable.

3
00:00:08,189 --> 00:00:13,890
We'll use a probability density function to model the probability associated with a continuous random

4
00:00:13,890 --> 00:00:14,570
variable.

5
00:00:14,580 --> 00:00:21,540
So if you remember, we looked at the probability mass function for a die roll, and when we graphed

6
00:00:21,540 --> 00:00:27,420
that probability mass function, we showed that the probability associated with rolling a one, two,

7
00:00:27,420 --> 00:00:34,540
three, four, five or six was always one out of six or approximately 0.167.

8
00:00:34,560 --> 00:00:41,760
And so the sketch of that probability mass function looked like this, where we have a uniform distribution

9
00:00:41,760 --> 00:00:45,210
because the probability remains uniform, it's always the same.

10
00:00:45,210 --> 00:00:48,780
The probability is equal for rolling each of these values.

11
00:00:48,780 --> 00:00:54,840
When we roll a six-sided die one time, what we notice here is that if we look at the graph of this

12
00:00:54,840 --> 00:01:02,160
probability mass function, we're only defining probabilities for these exact values here, the exact

13
00:01:02,160 --> 00:01:11,040
value one, the exact value two, the exact value three, etc. We don't define a probability for one and

14
00:01:11,040 --> 00:01:16,800
a half or two and a half or any value between one, two, three, four, five and six.

15
00:01:16,800 --> 00:01:23,130
And of course that's because with the probability mass function, we're dealing with a discrete random

16
00:01:23,130 --> 00:01:28,770
variable where we have those discrete countable values and no values in between.
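
As a minimal sketch of that idea, here is one way the die's probability mass function could be written in Python. The names `die_pmf` and `pmf` are ours for illustration and are not part of the lesson.

```python
from fractions import Fraction

# PMF of one roll of a fair six-sided die: each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def pmf(value):
    """Probability that the roll equals `value`; zero for anything that is not a face."""
    return die_pmf.get(value, Fraction(0))

print(pmf(3))                  # probability of rolling a three
print(pmf(1.5))                # no probability is defined between the faces
print(sum(die_pmf.values()))   # the six probabilities add to one
```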

17
00:01:28,980 --> 00:01:35,190
Remember that a continuous random variable in contrast, can be defined for an infinite set of values,

18
00:01:35,190 --> 00:01:38,280
not just a discrete set of values like this.

19
00:01:38,280 --> 00:01:39,660
One, two, three, four, five, six.

20
00:01:39,690 --> 00:01:46,080
In other words, if we were to sketch the graph of a probability density function for a continuous random

21
00:01:46,080 --> 00:01:51,840
variable as opposed to a probability mass function for a discrete random variable, we would use a similar

22
00:01:51,840 --> 00:01:58,020
set of axes like this, the horizontal axis x and the vertical axis indicating probability density.

23
00:01:58,020 --> 00:02:03,240
But the sketch of our probability density function might be a curve that looks something like this.

24
00:02:03,240 --> 00:02:07,620
And then what we would do is identify values of X.

25
00:02:07,620 --> 00:02:12,600
So maybe this value here, x one, this value here x two.

26
00:02:12,600 --> 00:02:19,230
But in theory, for this continuous random variable, we could define an infinite number of values in

27
00:02:19,230 --> 00:02:25,110
between x one and x two, and we could think about the probability of each of those values.

28
00:02:25,110 --> 00:02:31,080
And remember, with a continuous random variable, we can always get more and more granular with these

29
00:02:31,080 --> 00:02:36,090
values as long as we continue measuring out to a larger number of decimal places.

30
00:02:36,090 --> 00:02:42,780
So really there are an infinite number of values along this horizontal axis for the continuous random

31
00:02:42,780 --> 00:02:49,470
variable, in contrast with this discrete random variable where we just have this discrete countable

32
00:02:49,500 --> 00:02:50,340
value set.

33
00:02:50,340 --> 00:02:57,480
So this is the kind of variable, the kind of scenario we're trying to model with a probability function.

34
00:02:57,480 --> 00:03:05,130
And in this case we can say that every continuous random variable x has a probability density function

35
00:03:05,130 --> 00:03:06,030
F of X.

36
00:03:06,030 --> 00:03:12,480
And what we can say about this probability density function is that the probability that the continuous

37
00:03:12,480 --> 00:03:18,780
random variable x takes on some value on the interval A to B is given by this expression

38
00:03:18,780 --> 00:03:23,100
over here on the right, where this notation here is called the integral.

39
00:03:23,100 --> 00:03:27,260
This is the integral from A to B of the function F of x.
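
Written out, the expression being described is usually given in the following standard form, where a and b are the endpoints of the interval:

```latex
P(a \le X \le b) = \int_a^b f(x)\,dx
```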

40
00:03:27,270 --> 00:03:34,350
This dx notation here is like the other side of the integral notation. The integral sign and the dx sandwich

41
00:03:34,350 --> 00:03:40,560
whatever is between them and tell us to perform a specific operation with this function F of x, which

42
00:03:40,560 --> 00:03:41,400
we'll talk about.

43
00:03:41,400 --> 00:03:46,680
Notice here also that now that we're switching to a continuous random variable, when we're calculating

44
00:03:46,680 --> 00:03:53,430
probability, we have to calculate probability over an interval A to B, whereas with the discrete random

45
00:03:53,430 --> 00:03:59,250
variable we calculated the probability of a specific value one value at a time.

46
00:03:59,250 --> 00:04:04,830
So here we have the probability of rolling a one, the probability of rolling a two, the probability

47
00:04:04,830 --> 00:04:05,940
of rolling a three.

48
00:04:05,940 --> 00:04:11,550
When we transition to the continuous random variable and the probability density function, we are no

49
00:04:11,550 --> 00:04:15,810
longer going to calculate the probability that X takes on a specific value.

50
00:04:15,990 --> 00:04:22,770
And the reason for that is because it's essentially impossible for us to measure the probability that

51
00:04:22,770 --> 00:04:27,930
X takes on an exact value when we're dealing with a continuous random variable.

52
00:04:27,930 --> 00:04:33,600
So let's say, for instance, that the continuous random variable that the probability density function

53
00:04:33,600 --> 00:04:39,420
associated with this continuous random variable here, we've sketched it and maybe it models finishing

54
00:04:39,450 --> 00:04:41,760
times for a standardized test.

55
00:04:41,790 --> 00:04:47,010
Let's say here that this is the mean finishing time for the test.

56
00:04:47,010 --> 00:04:50,880
And that mean finishing time, let's say, is 71

57
00:04:51,730 --> 00:04:59,260
minutes, such that, let's say, this x two value here is a finishing time of, we'll call it,

58
00:05:00,140 --> 00:05:01,000
80 minutes.

59
00:05:01,010 --> 00:05:07,760
Well, if we want to calculate the probability that a student's finishing time is exactly 71 minutes,

60
00:05:07,760 --> 00:05:17,270
that's virtually impossible for us to do because they might finish at 71 minutes and 0.00000001 seconds.

61
00:05:17,300 --> 00:05:25,460
In other words, they might be off of 71 minutes by a value of time so small that it's almost immeasurable.

62
00:05:25,460 --> 00:05:31,730
And because of that theoretical idea where it's basically impossible for us to measure that they finished

63
00:05:31,730 --> 00:05:39,410
truly at exactly 71 minutes because remember, we can always divide time into smaller and smaller and

64
00:05:39,410 --> 00:05:40,700
smaller slices,

65
00:05:40,700 --> 00:05:46,280
we can't really ever say the probability that a student finished at exactly 71 minutes.

66
00:05:46,280 --> 00:05:53,270
And so with a continuous random variable and its probability density function, the probability of any

67
00:05:53,270 --> 00:06:00,380
specific exact value in the range of the continuous random variable, the probability of any exact value

68
00:06:00,380 --> 00:06:01,900
is actually zero.

69
00:06:01,910 --> 00:06:11,420
So for a variable like this one, the probability that x is exactly equal to 71 minutes is zero.

70
00:06:11,630 --> 00:06:18,260
The probability that X is exactly equal to 80 minutes is also zero.

71
00:06:18,290 --> 00:06:26,060
The probability that the variable takes on any specific exact value is always zero, because in theory,

72
00:06:26,060 --> 00:06:31,280
for a continuous random variable, we can't pinpoint exactly 71 minutes.

73
00:06:31,280 --> 00:06:35,930
We're always going to be off by some infinitely small fraction of a second.
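
One way to see this in the integral notation, as a sketch: an interval whose two endpoints are the same has no width, so the area over it is zero.

```latex
P(X = 71) = \int_{71}^{71} f(x)\,dx = 0
```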

74
00:06:35,930 --> 00:06:42,560
However, we can calculate probability over some interval A to B, for instance, we could think about

75
00:06:42,560 --> 00:06:48,170
the probability that the student's finishing time is somewhere between 71 minutes and 80 minutes.

76
00:06:48,170 --> 00:06:53,570
And in the same way over here with the probability mass function for a discrete random variable, the

77
00:06:53,570 --> 00:07:01,550
probability of one specific value is given by this column or this bar, with a continuous random variable

78
00:07:01,550 --> 00:07:03,380
and the probability density function,

79
00:07:03,380 --> 00:07:09,830
the probability that the continuous random variable takes on a value in some interval is given by the

80
00:07:09,830 --> 00:07:14,900
area under this probability density curve over that particular interval.

81
00:07:14,900 --> 00:07:20,550
So if we're interested here in the probability that the student's finishing time is between 71 and 80

82
00:07:20,550 --> 00:07:28,760
minutes, that probability can be represented by all of the area under this curve, over this interval.

83
00:07:28,760 --> 00:07:35,000
In other words, if we could geometrically calculate the area under this curve here, we would find

84
00:07:35,000 --> 00:07:42,830
some value between zero and one, and that value for the area would be equivalent to the probability

85
00:07:42,830 --> 00:07:46,550
that the variable takes on some value in that interval.

86
00:07:46,550 --> 00:07:53,990
So to summarize here, instead of using a probability mass function to calculate the probability that a discrete random

87
00:07:53,990 --> 00:08:00,890
variable takes on an exact value, we'll now use a probability density function to calculate the probability

88
00:08:00,890 --> 00:08:06,120
that a continuous random variable takes on some value within a specific interval.

89
00:08:06,140 --> 00:08:11,990
Now, when it comes to the probability density function, there are a couple of things we need to remember.

90
00:08:12,110 --> 00:08:19,550
So it has to be true that the probability density function f or f of X is greater than or equal to zero

91
00:08:19,550 --> 00:08:25,190
for all possible values of X that the continuous random variable can take on.

92
00:08:25,190 --> 00:08:32,390
So that means for all of the values of x between here and here, under the curve of the probability

93
00:08:32,390 --> 00:08:41,570
density function, all of these values of X on this interval have to produce a non-negative value for F

94
00:08:41,570 --> 00:08:42,200
of X.

95
00:08:42,200 --> 00:08:48,830
And we can think about this curve here as the graph of f of x, which just means that this curve, if

96
00:08:48,830 --> 00:08:55,760
we were to sketch F of x, this curve, the sketch of f of x has to sit on or above the horizontal axis everywhere.

97
00:08:55,760 --> 00:08:57,830
We can't have any negative values.

98
00:08:57,830 --> 00:09:02,840
This curve can't dip below the horizontal axis at any point in this interval.

99
00:09:02,840 --> 00:09:05,720
So that's our first requirement.

100
00:09:05,930 --> 00:09:12,860
And then our second requirement is that the total area under this curve when we add it all up, has

101
00:09:12,860 --> 00:09:14,360
to be equal to one.

102
00:09:14,360 --> 00:09:18,110
Because remember, our vertical axis here represents probability density.

103
00:09:18,110 --> 00:09:25,190
So the area under this curve is the total probability for all possible values of our continuous random

104
00:09:25,190 --> 00:09:25,790
variable.

105
00:09:25,790 --> 00:09:31,970
And probability always has to add to one, which means that all of the area under this curve has to

106
00:09:31,970 --> 00:09:32,780
add to one.

107
00:09:32,780 --> 00:09:36,380
And that's what this means here, this integral notation.

108
00:09:36,380 --> 00:09:42,740
And then from negative infinity to positive infinity just means everywhere under the curve of F of X.

109
00:09:42,740 --> 00:09:45,380
To integrate means to find the area under the curve.

110
00:09:45,380 --> 00:09:49,370
So the area under the curve F of X everywhere underneath it.

111
00:09:49,370 --> 00:09:52,040
When we add all that together, that has to be equal to one.

112
00:09:52,040 --> 00:09:54,470
So that's our second requirement.
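
In symbols, the two requirements just described are commonly written like this:

```latex
f(x) \ge 0 \quad \text{for all } x,
\qquad
\int_{-\infty}^{\infty} f(x)\,dx = 1
```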

113
00:09:54,470 --> 00:09:58,040
So that being said, let's go ahead and look at an example.

114
00:09:58,040 --> 00:09:59,420
We'll say here that

115
00:09:59,710 --> 00:10:06,860
this function, f of x equals four x cubed, is a probability density function as long as we define it over the interval 0 to

116
00:10:06,860 --> 00:10:07,210
1.

117
00:10:07,210 --> 00:10:13,270
So under this definition, before we move forward, we need to check to see that we meet both of these

118
00:10:13,270 --> 00:10:14,500
criteria here.

119
00:10:14,500 --> 00:10:19,870
That F of X is greater than or equal to zero for all X, and that the total area under the curve is

120
00:10:19,870 --> 00:10:20,800
equal to one.

121
00:10:20,950 --> 00:10:23,380
So there's a couple of ways that we can go about this.

122
00:10:23,380 --> 00:10:30,070
But one easy way to verify this first criterion is to sketch the graph of F.

123
00:10:30,070 --> 00:10:37,510
So if we use a calculator or any computer algebra system, any graphing software to sketch F of X,

124
00:10:37,510 --> 00:10:39,520
we see this graph here.

125
00:10:39,550 --> 00:10:45,010
We're told that we're only looking at the values of X on the interval 0 to 1.

126
00:10:45,010 --> 00:10:49,690
So here this is x equals zero, this is x equals one.

127
00:10:49,690 --> 00:10:55,660
And so we're only interested in this section of the graph between these two points and we can see that

128
00:10:55,660 --> 00:10:57,640
everywhere between zero and one.

129
00:10:57,640 --> 00:11:01,840
The graph here is at or above the horizontal axis.

130
00:11:01,840 --> 00:11:05,530
And so we meet this first criterion.

131
00:11:05,680 --> 00:11:11,440
If we had been given an interval, for instance, of negative one to positive one, we can see that

132
00:11:11,440 --> 00:11:16,690
over here between negative one and zero, the graph sits below the horizontal axis.

133
00:11:16,690 --> 00:11:22,870
And so we would not meet this first criterion and we could not call this a probability density function.

134
00:11:22,870 --> 00:11:28,630
But because the interval has been limited to 0 to 1 and we can see that the graph is at or above the

135
00:11:28,630 --> 00:11:33,460
horizontal axis everywhere in that interval, we meet this first criterion.
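
If graphing software isn't handy, a rough numerical version of the same check might look like the sketch below. It assumes the example function is four x cubed and only samples a grid of points, so it is a sanity check rather than a proof. The variable names are ours.

```python
def f(x):
    """Candidate density from the example: f(x) = 4x^3."""
    return 4 * x**3

# Sample 1,001 evenly spaced points on [0, 1] and on [-1, 1].
grid_0_to_1 = [i / 1000 for i in range(1001)]
grid_neg1_to_1 = [-1 + 2 * i / 1000 for i in range(1001)]

# True only if no sampled point dips below the horizontal axis.
nonneg_on_0_to_1 = all(f(x) >= 0 for x in grid_0_to_1)
nonneg_on_neg1_to_1 = all(f(x) >= 0 for x in grid_neg1_to_1)

print(nonneg_on_0_to_1)      # True
print(nonneg_on_neg1_to_1)   # False
```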

136
00:11:33,700 --> 00:11:39,370
Now for the second criterion, in order to figure this out by hand, we're going to need to know how

137
00:11:39,370 --> 00:11:41,320
to evaluate integrals.

138
00:11:41,320 --> 00:11:45,340
And that only comes up in calculus one or calculus two.

139
00:11:45,340 --> 00:11:46,570
But that's okay.

140
00:11:46,570 --> 00:11:48,250
We don't have to know calculus.

141
00:11:48,250 --> 00:11:54,970
We don't have to know how to evaluate an integral because we can still use calculators or computers

142
00:11:54,970 --> 00:11:57,610
to help us verify this fact really quickly.

143
00:11:57,610 --> 00:12:02,800
So all we have to do is substitute here, for F of X, four x cubed.

144
00:12:02,800 --> 00:12:08,710
So this is what we would plug into a computer algebra system, software, computer to get this integral

145
00:12:08,710 --> 00:12:15,820
evaluated for us, we would say here the integral and then instead of from negative infinity to infinity

146
00:12:15,820 --> 00:12:19,840
in the definition here, the interval has been limited to 0 to 1.

147
00:12:19,840 --> 00:12:22,390
So we'll say 0 to 1.

148
00:12:22,390 --> 00:12:31,810
And then instead of F of x here we'll put in four x cubed, our F of x function, and then dx. Again, we

149
00:12:31,810 --> 00:12:39,880
can plug this into any calculator or computer and it will tell us the value of this integral if we don't

150
00:12:39,880 --> 00:12:41,740
know how to do integration.

151
00:12:41,740 --> 00:12:44,470
The value of this integral is in fact one.

152
00:12:44,680 --> 00:12:47,020
And so we meet this criterion.

153
00:12:47,020 --> 00:12:52,810
If we wanted to evaluate this integral by hand, the way that we would do it is by adding one to this

154
00:12:52,810 --> 00:12:56,140
exponent here to get X to the fourth instead of x cubed.

155
00:12:56,140 --> 00:13:01,810
So we would have X to the fourth and then we would divide by this new exponent of four.

156
00:13:01,810 --> 00:13:07,330
So when we divide this function by four, it'll cancel this coefficient of four.

157
00:13:07,330 --> 00:13:11,110
And so the integral of four x cubed is x to the fourth.

158
00:13:11,110 --> 00:13:18,340
And then we would evaluate that result, that integrated result on the given interval 0 to 1.

159
00:13:18,340 --> 00:13:24,340
And when we do, when we evaluate over this interval, we always plug in this upper limit here of one

160
00:13:24,340 --> 00:13:24,940
first.

161
00:13:24,940 --> 00:13:26,620
So we get one to the fourth.

162
00:13:27,440 --> 00:13:32,660
And then we always subtract whatever we get when we plug in this lower limit of integration.

163
00:13:32,660 --> 00:13:36,350
So when we plug in zero, we get zero to the fourth here.

164
00:13:36,680 --> 00:13:41,450
And of course, when we evaluate that, we get one minus zero, or one.

165
00:13:41,450 --> 00:13:46,940
And so we see that the value of that integral over this interval 0 to 1 is in fact one.

166
00:13:46,940 --> 00:13:50,270
And so we can double check that we do in fact meet this requirement.

167
00:13:50,270 --> 00:13:52,340
But again, we don't need to know integration.

168
00:13:52,340 --> 00:13:57,800
If we haven't taken calculus, we can use a computer to help us calculate any integral, the integral

169
00:13:57,800 --> 00:14:04,790
of any function F of x over any interval, just to verify that this is one and therefore that we have

170
00:14:04,790 --> 00:14:06,590
a probability density function.
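
As one possible version of that computer check, the sketch below approximates the area under four x cubed from 0 to 1 with a simple midpoint rule. The helper `midpoint_integral` is a hypothetical name of ours, and any calculator or computer algebra system would do the same job.

```python
def f(x):
    """Example density: f(x) = 4x^3 on the interval 0 to 1."""
    return 4 * x**3

def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func from a to b using n midpoint rectangles."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width

total_area = midpoint_integral(f, 0.0, 1.0)
print(round(total_area, 6))   # very close to 1, matching the hand calculation
```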

171
00:14:06,590 --> 00:14:12,620
So once we've verified that we have a probability density function, we can answer probability questions

172
00:14:12,620 --> 00:14:18,200
about the continuous random variable, like what is the probability that the continuous random variable

173
00:14:18,200 --> 00:14:23,390
x takes on some value between, let's say, one half and three fourths?

174
00:14:23,390 --> 00:14:31,700
So looking at this curve here, that's this interval here, this is one half and this is three over

175
00:14:31,700 --> 00:14:32,360
four.

176
00:14:32,360 --> 00:14:38,540
So that interval, if we want to calculate that probability, what we're looking for is this area here,

177
00:14:38,570 --> 00:14:41,060
geometrically this physical area.

178
00:14:41,060 --> 00:14:46,940
If we can calculate this area, that will give us the probability that X takes on a value between one

179
00:14:46,940 --> 00:14:48,440
half and three fourths.

180
00:14:48,440 --> 00:14:54,770
And so here's what that would look like if we actually plugged into this formula here, we would say

181
00:14:55,040 --> 00:15:01,250
that the probability that X is between A and B, we said one half and three fourths.

182
00:15:01,250 --> 00:15:09,890
So one half less than or equal to X less than or equal to three over four is equal to the integral from

183
00:15:09,890 --> 00:15:18,440
one half to three fourths of the function F of x, which we were already told was four x cubed, and then

184
00:15:18,440 --> 00:15:19,850
dx.

185
00:15:20,090 --> 00:15:23,120
And again, this is the same thing we just did here.

186
00:15:23,120 --> 00:15:29,600
We can use any computer or calculator to help us find this value if we don't know how to integrate. Or,

187
00:15:29,630 --> 00:15:30,950
if we want to integrate,

188
00:15:31,070 --> 00:15:35,360
remember we said that the integral of four x cubed was x to the fourth.

189
00:15:35,360 --> 00:15:43,580
So the integral here is x to the fourth, and then we just evaluate that integral over this interval,

190
00:15:43,580 --> 00:15:51,230
one half to three quarters. We plug in our upper limit of integration first, three over four, so we

191
00:15:51,230 --> 00:15:52,790
get three over four

192
00:15:53,620 --> 00:15:59,080
to the fourth power, and then we subtract whatever we get when we plug in our lower limit of integration,

193
00:15:59,080 --> 00:15:59,920
which is one half.

194
00:15:59,920 --> 00:16:01,300
So we get one half

195
00:16:02,300 --> 00:16:03,580
to the fourth power.

196
00:16:03,590 --> 00:16:10,850
And then if we do this math by hand here, we'll get three to the fourth is 81, and four to the fourth

197
00:16:10,880 --> 00:16:18,890
is 256, minus one to the fourth is one, and two to the fourth is 16.

198
00:16:19,310 --> 00:16:29,180
If we find a common denominator by multiplying this second fraction by 16 over 16, then we can call

199
00:16:29,180 --> 00:16:35,600
this fraction here 16 over 256 instead of this.

200
00:16:35,600 --> 00:16:39,650
And the result then is 81 minus 16, which is 65.

201
00:16:39,650 --> 00:16:51,380
So 65 over 256, which when we convert to a decimal and round, is approximately 25.4%.

202
00:16:51,380 --> 00:16:56,600
And so what that tells us then is that the probability that our continuous random variable will have

203
00:16:56,600 --> 00:17:01,660
a value between one half and three fourths is just over 25%.

204
00:17:01,730 --> 00:17:08,630
Given that this is the probability density function that models our continuous random variable over

205
00:17:08,630 --> 00:17:10,460
this interval 0 to 1.
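
To double check the arithmetic, here is a short sketch that repeats the hand calculation with exact fractions. It assumes the antiderivative x to the fourth found earlier, and the names are ours for illustration.

```python
from fractions import Fraction

def antiderivative(x):
    """x^4, the antiderivative of f(x) = 4x^3 used in the lesson."""
    return x**4

lower, upper = Fraction(1, 2), Fraction(3, 4)

# Upper limit first, then subtract the value at the lower limit.
probability = antiderivative(upper) - antiderivative(lower)

print(probability)          # 65/256
print(float(probability))   # 0.25390625
```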

206
00:17:10,460 --> 00:17:16,609
So again, the biggest takeaway here is that we use probability mass functions to model discrete random

207
00:17:16,609 --> 00:17:20,300
variables, and we calculate probability of exact values.

208
00:17:20,300 --> 00:17:26,000
When we have a continuous random variable, we instead use a probability density function and we only

209
00:17:26,000 --> 00:17:28,430
calculate probabilities over intervals.

210
00:17:28,430 --> 00:17:33,050
We look for the probability that the continuous random variable falls between A and B

211
00:17:33,080 --> 00:17:39,920
instead of the probability that the random variable takes on a specific value like A or B. And in order to calculate

212
00:17:39,920 --> 00:17:46,310
that probability, we first check that we meet these two conditions to verify that we do in fact have

213
00:17:46,310 --> 00:17:47,930
a probability density function.

214
00:17:47,930 --> 00:17:53,390
And then we calculate this integral in order to find the probability over that interval.

215
00:17:53,390 --> 00:17:57,800
If we've taken calculus one and calculus two and we're comfortable with integration, we can, of course

216
00:17:57,800 --> 00:18:01,280
do that by hand, or if not, no problem at all.

217
00:18:01,280 --> 00:18:07,670
We can use any calculator, online calculator, computer algebra system, etc. to calculate this integral

218
00:18:07,670 --> 00:18:08,360
for us.

219
00:18:08,360 --> 00:18:14,120
The result that we get when we integrate the function f of x over the interval A to B should always

220
00:18:14,150 --> 00:18:17,630
be a value between zero and one.

221
00:18:17,630 --> 00:18:22,370
We might get a fraction, but even if we do, we can convert that to a decimal value.

222
00:18:22,370 --> 00:18:27,770
In this case, we got approximately 0.254.

223
00:18:27,770 --> 00:18:33,290
We can also convert that to a percentage to say there's an approximately 25.4% chance.

224
00:18:33,290 --> 00:18:40,250
Or we could just say the probability is about 0.254 and make the conclusion that that is the probability

225
00:18:40,250 --> 00:18:46,610
that the continuous random variable takes on a value in the interval that we defined.

226
00:18:46,610 --> 00:18:51,020
In this particular example, the interval one half to three fourths.

