WEBVTT

00:01.510 --> 00:09.250
Hello again! In this video, we are going to look at stream member functions and state. In the last video,

00:09.250 --> 00:11.980
I used the open member function without explaining it.

00:13.030 --> 00:15.970
Maybe it is obvious, but let's quickly go over it.

00:16.600 --> 00:23.440
So, when we opened a file before, we did that by passing the name of the file as the argument to a stream

00:23.440 --> 00:24.070
constructor.

00:24.520 --> 00:27.940
So when the stream is constructed, it will also open the file.

00:29.260 --> 00:35.620
Alternatively, we can just create an empty stream object, one that is not associated with any file.

00:36.130 --> 00:40.570
And then later on, we can call the open member function with the file name as argument.

00:40.960 --> 00:46.150
So this will open the file. And in terms of accessing the data through the stream, it works exactly

00:46.150 --> 00:46.750
the same way.

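NOTE
A minimal sketch of the two ways of opening a file described above; it is not the code shown on screen.
The function names, and the choice to return the first word of the file, are assumptions made for illustration.
```cpp
#include <fstream>
#include <string>
// Opens the file by passing its name to the stream constructor.
std::string first_word_via_constructor(const std::string& path) {
    std::ifstream in(path);   // constructed and opened in one step
    std::string word;
    in >> word;               // word stays empty if nothing could be read
    return word;
}
// Creates an unassociated stream first, then opens the file later.
std::string first_word_via_open(const std::string& path) {
    std::ifstream in;         // not associated with any file yet
    in.open(path);            // applies the same default mode as the constructor
    std::string word;
    in >> word;
    return word;
}
```
Both functions should return the same word for the same path, since reading through the stream works the same way in either case.
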
00:47.530 --> 00:54.070
So in both cases, the file mode will be applied; the default if we do not give one, or the one

00:54.070 --> 00:54.530
that we give,

00:54.550 --> 01:01.600
if we do give one as an argument to the constructor or open. If the file is an output file and does

01:01.600 --> 01:03.460
not already exist, it will be created.

01:04.150 --> 01:08.080
If there is data already in it, it will be truncated by default. And so on.

01:11.460 --> 01:15.840
We checked whether the file was open, by using the stream name in an if statement.

01:16.560 --> 01:22.770
We can also call the is_open member function, and this will return true if the file is open.

01:23.460 --> 01:26.850
So if this returns true, we can go ahead and use the file.

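NOTE
A small sketch of checking is_open() after opening a file, assuming hypothetical helper names that do not appear in the video.
The second helper relies on the default output mode mentioned earlier: the file is created if missing and truncated if present.
```cpp
#include <fstream>
#include <string>
// Returns true if the named file could be opened for reading.
bool can_open_for_reading(const std::string& path) {
    std::ifstream in(path);
    return in.is_open();      // true only when a file is attached to the stream
}
// Opens an output file with the default mode and reports whether that worked.
bool can_open_for_writing(const std::string& path) {
    std::ofstream out(path);  // default mode: create if missing, truncate if present
    return out.is_open();
}
```
A missing input file should make can_open_for_reading return false, which is the case to handle before using the stream.
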
01:30.340 --> 01:36.340
C++ streams have a state, and the state will depend on the result of the last operation.

01:36.610 --> 01:40.030
Usually you are only interested in this for input streams.

01:42.440 --> 01:48.700
So the state of an input stream will depend on the result of the last time that it read data. That it fetched data from

01:48.710 --> 01:50.030
the keyboard or a file.

01:50.990 --> 01:58.760
If the operation succeeded, then the stream is in a good state, and good() will return true. If the operation

01:58.760 --> 02:00.680
fails, but it is nothing too serious.

02:00.680 --> 02:03.890
For example, data in the wrong format. Then fail()

02:03.890 --> 02:04.850
will return true.

02:05.720 --> 02:11.240
And if something really serious happened like a disk failure, then bad() will be true and the stream

02:11.240 --> 02:12.290
will be in a bad state.

02:12.830 --> 02:14.450
And presumably so is your disk!

02:19.080 --> 02:20.970
So let's quickly try that out.

02:21.300 --> 02:28.440
So we are going to read some data from the keyboard into an int. And then, if this works, we should get

02:28.440 --> 02:28.860
this.

02:29.040 --> 02:29.970
good() should be true.

02:30.600 --> 02:34.680
If we have something that is not easily convertible, then fail() will be true.

02:35.400 --> 02:37.410
And we hope that bad() will not be true.

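NOTE
A hedged sketch of the state checks just described; the function names are hypothetical and the video's own program reads from the keyboard instead.
Note one detail: good() is also false once the end of input has been reached, so text with no trailing newline ends in an "eof" result here.
```cpp
#include <istream>
#include <sstream>
#include <string>
// Tries to read one int and reports which state the stream ended up in.
std::string state_after_reading_int(std::istream& in) {
    int x = 0;
    in >> x;
    if (in.good()) return "good";   // the read succeeded and more input may follow
    if (in.bad())  return "bad";    // serious, unrecoverable error
    if (in.fail()) return "fail";   // for example, text that is not a number
    return "eof";                   // a value was read, but the input ended right after it
}
// Convenience wrapper: try the check on a string instead of the keyboard.
std::string state_for_text(const std::string& text) {
    std::istringstream in(text);
    return state_after_reading_int(in);
}
```
bad() is tested before fail() because fail() is also true when the stream is in a bad state.
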
02:43.580 --> 02:44.270
Enter a number...

02:46.700 --> 02:46.980
Yeah.

02:47.330 --> 02:47.600
You entered number

02:47.600 --> 02:48.210
55.

02:48.230 --> 02:53.480
So the conversion has succeeded. 55 can be converted to an int. Which is nice to know!

02:54.860 --> 02:57.830
good() is true and the program went into this branch.

03:01.220 --> 03:03.320
Let's try something that's not a number.

03:05.540 --> 03:07.630
So we can see that fail() is true.

03:07.640 --> 03:10.770
"banana" cannot be converted to an int,

03:11.270 --> 03:16.550
and we got this message. And I am not going to try and replicate the bad() case!

03:20.360 --> 03:25.670
There are a couple of other member functions. clear() will reset the state of the stream to valid.

03:26.300 --> 03:32.630
It does not actually change the buffer, though. eof() will return true after the stream has reached the

03:33.050 --> 03:33.410
end of file,

03:33.440 --> 03:40.040
the end of input. And a lot of people when they learn C++, they think, Aha, I can write a loop like

03:40.040 --> 03:40.340
this.

03:40.850 --> 03:42.620
While not end of input,

03:42.980 --> 03:44.030
I can read some data.

03:44.900 --> 03:46.670
And is this a good thing to do?

03:50.650 --> 03:52.060
So let's find out.

03:53.470 --> 03:59.980
So I have a file here. [It] just contains the numbers from 0 to 9, one on each line.

04:00.880 --> 04:02.770
And then we are going to open this file.

04:02.770 --> 04:08.500
And then, while we have not reached the end of file, we are going to read a number from this file and

04:08.500 --> 04:09.100
print it out.

04:13.450 --> 04:18.280
And there we are, we get numbers from zero to nine, as we expect. So, yep, this is a good way to do things(?)

04:23.270 --> 04:27.840
If I now add a new line at the end of this file, which should not make any difference at all.

04:29.120 --> 04:30.020
Let's see what happens.

04:32.760 --> 04:36.030
Well, we get the numbers from 0 to 9, but we get another number nine.

04:36.840 --> 04:37.830
So what is happening?

04:41.150 --> 04:47.120
If we think about what is happening in the loop. So imagine we have just read the number 9. So x is

04:47.120 --> 04:49.410
9 and it's going to print out 9.

04:50.360 --> 04:52.970
Then on the next time through, it is going to get the next character.

04:55.110 --> 04:56.790
Which will be this new line character.

04:59.690 --> 05:05.390
The stream will ignore that. The value of x is still 9, so it will print 9 again. And then the

05:05.390 --> 05:06.920
file has now reached the end of file.

05:07.460 --> 05:11.630
So on the next iteration, when it checks for end of file, that will return true.

05:11.990 --> 05:16.070
So it does get to the end of file eventually, but it has actually done an extra iteration.

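NOTE
A sketch of the flawed loop discussed above, assuming a string stands in for the file so it can be tried without one. The function name is hypothetical.
```cpp
#include <sstream>
#include <string>
#include <vector>
// The tempting but flawed loop: test eof() first, then read without checking.
std::vector<int> read_all_eof_loop(const std::string& text) {
    std::istringstream in(text);   // stands in for the file in the video
    std::vector<int> values;
    int x = 0;
    while (!in.eof()) {
        in >> x;                   // may fail, but the result is not checked
        values.push_back(x);       // so a stale x can be stored a second time
    }
    return values;
}
```
With a trailing newline after the last number, the final read fails and leaves x unchanged, so the last value is stored twice. This loop would also never finish on non-numeric input, since the stream then fails without reaching end of file.
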
05:20.490 --> 05:25.350
So the correct way to do this loop is to do it like that.

05:28.280 --> 05:32.630
So this operation, this right shift, will actually return the stream, which reflects the state of the file.

05:33.440 --> 05:41.570
So while the state of the file [stream] is good, then it is going to print out the value that it has just read.

05:42.350 --> 05:46.430
And then if the file stream is bad or has reached the end of file, this will be false.

05:46.790 --> 05:48.290
So it does not print anything at all.

05:48.620 --> 05:52.670
So we do not get the extra iteration which prints out the data that we do not want.

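NOTE
A sketch of the corrected loop, again assuming a string in place of the file and a hypothetical function name.
The read itself is the loop condition: operator>> returns the stream, and the stream tests false after a failed read.
```cpp
#include <sstream>
#include <string>
#include <vector>
// Reads ints until a read fails, whether from end of input or invalid data.
std::vector<int> read_all_checked(const std::string& text) {
    std::istringstream in(text);   // stands in for the file in the video
    std::vector<int> values;
    int x = 0;
    while (in >> x) {              // the body runs only when a value was actually read
        values.push_back(x);
    }
    return values;
}
```
A trailing newline no longer causes a repeated value, and the loop stops at the first item that cannot be converted.
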
05:56.440 --> 05:59.410
And there we are, we get the correct results, 0 to 9.

06:00.370 --> 06:06.190
And in fact, if we introduce something in the file which will cause the stream [read] to [fail], then that will

06:06.190 --> 06:07.090
get picked up as well.

06:08.740 --> 06:09.410
So there we are.

06:10.150 --> 06:16.360
It does the first three numbers, and then it finds that there is something here which cannot be converted.

06:16.660 --> 06:21.340
So the stream state is fail, and this loop will terminate.

06:24.510 --> 06:27.990
So we need to think about how we are going to deal with invalid input.

06:29.250 --> 06:35.790
The obvious approach is to have some sort of Boolean which will tell us whether the input is valid or

06:35.790 --> 06:36.020
not.

06:36.120 --> 06:41.280
Then we have a loop which checks that, and this loop will keep repeating until the user enters some

06:41.280 --> 06:42.090
valid input.

06:43.290 --> 06:50.970
So we are going to read a number from the keyboard and then, if this is a valid number, good() will return

06:50.970 --> 06:51.360
true.

06:51.930 --> 06:54.810
And then we can set the Boolean flag to true.

06:55.620 --> 06:58.930
If the user enters something which is not a number, then fail()

06:58.950 --> 07:02.640
will be true. And we can ask them to try again, and read their next effort.

07:03.360 --> 07:06.390
And then we can keep on looping until they get it right!

07:11.230 --> 07:17.310
So I will enter something which is a number, good() will be true, and it should say "You entered the number"

07:18.250 --> 07:18.600
Yes!

07:19.360 --> 07:22.660
And then the flag will be true and then the loop will terminate.

07:27.820 --> 07:30.140
If I enter something which is not a number...

07:34.560 --> 07:36.180
So in fact, we have...

07:38.260 --> 07:39.880
Yes, we have an infinite loop.

07:40.390 --> 07:41.470
So why is that?

07:43.590 --> 07:49.800
The problem is that the stream goes into a fail state when it gets invalid input, and it will remain

07:49.800 --> 07:52.620
in the fail state indefinitely.

07:53.280 --> 07:56.160
So we need to call clear() to reset the state of the stream.

08:04.610 --> 08:05.600
And it still does not work!

08:06.620 --> 08:07.670
So what is going on?

08:10.290 --> 08:16.290
We need to look a bit more closely at what happens when this statement is executed: cin, right shift,

08:16.290 --> 08:16.710
x.

08:18.680 --> 08:25.820
The stream will go to the first character in its input buffer, and it will start trying to convert the

08:26.510 --> 08:30.980
characters in that buffer to an int. And it will only stop when it encounters some white space.

08:32.270 --> 08:39.050
If the first character is something that cannot be converted into an int, it will stop immediately

08:39.320 --> 08:40.910
and leave that character in the buffer.

08:41.780 --> 08:45.800
Then the next time through, it'll go to the first character in the input, which is still the same one

08:45.800 --> 08:48.800
that made it fail before, and that will make it [fail] again.

08:49.190 --> 08:53.300
So the problem is that it keeps reading the same user input, over and over again.

08:53.720 --> 08:56.750
It does not just discard the user input and go onto the next one.

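NOTE
A sketch of why clear() alone is not enough, assuming a string in place of keyboard input and a bounded number of attempts so the example terminates. The function name is hypothetical.
```cpp
#include <sstream>
#include <string>
// Counts how many of 'attempts' reads fail when only clear() is called in between.
int failed_reads_with_clear_only(const std::string& text, int attempts) {
    std::istringstream in(text);
    int failures = 0;
    for (int i = 0; i < attempts; ++i) {
        int x = 0;
        if (in >> x) break;   // a number was read, so stop retrying
        ++failures;
        in.clear();           // resets the state, but the offending characters stay in the buffer
    }
    return failures;
}
```
For input starting with "banana", every attempt fails on the same first character, which is the infinite loop seen in the video.
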
09:00.170 --> 09:06.350
So the solution is to force cin to flush its buffer, to get rid of all the stale [characters] from the last

09:06.350 --> 09:10.160
user input, and then it can start processing the new lot of user input.

09:12.760 --> 09:20.140
Unfortunately, input streams do not support flush, but there is a member function called ignore(). And that will

09:20.230 --> 09:22.000
remove characters from the buffer.

09:23.200 --> 09:24.520
How many characters does it remove?

09:24.550 --> 09:27.520
Well, we can tell it. So we can say, "remove 20 characters."

09:28.270 --> 09:29.350
But there is a problem there.

09:29.350 --> 09:35.080
If the user types in five characters and then presses the return key, then this "ignore" is going to

09:35.080 --> 09:36.760
ignore the next 20 characters.

09:37.360 --> 09:42.310
So the five characters that the user types are going to be ignored, and then the 15 characters

09:42.490 --> 09:48.400
after that are going to be ignored. Which is not what we want. So we can provide a second argument, which

09:48.400 --> 09:49.630
is a new line character.

09:50.170 --> 09:54.850
So this means if you get a new line before 20 characters, then stop there.

09:55.180 --> 10:00.220
So this will either remove the next 20 characters or everything up to the next new line. Whichever

10:00.220 --> 10:05.800
comes first. And the characters which are removed from the buffer are thrown away.

10:05.830 --> 10:07.930
You cannot access them again, they are no use.

10:08.380 --> 10:09.130
They do not exist.

10:10.030 --> 10:16.810
Okay, so I have added that to the program. So we are now ignoring the next 20 characters or up to the

10:16.810 --> 10:18.220
new line, whichever happens first.

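NOTE
A sketch of the retry loop with clear() followed by ignore(20, '\n'), assuming a hypothetical function that takes any input stream so it can be tried with std::cin or a string stream.
It returns the number of failed attempts instead of printing prompts, which is a choice made here for illustration.
```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a keyboard
// Keeps reading until an int is parsed or the input ends; returns the failure count.
int read_int_ignore20(std::istream& in, int& out) {
    int failures = 0;
    while (!(in >> out)) {
        if (in.eof() || in.bad()) break;  // nothing more can be read
        ++failures;
        in.clear();                       // reset the fail state so the stream is usable again
        in.ignore(20, '\n');              // discard up to 20 characters, or through the next newline
    }
    return failures;
}
```
A bad line longer than 20 characters leaves stale characters behind, so it costs more than one failed attempt.
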
10:21.730 --> 10:23.620
So we just check it still works with numbers.

10:24.190 --> 10:24.640
Yes.

10:28.240 --> 10:28.630
"Banana"

10:29.170 --> 10:30.940
So, "Please try again and enter a number".

10:32.080 --> 10:33.070
So that seems to work.

10:36.020 --> 10:39.980
OK, and it is still waiting for input, so I need to press control-C to stop it.

10:44.660 --> 10:47.210
What happens if I put in more than 20 characters?

10:50.960 --> 10:55.700
So it will ignore the first 20 characters. Those are going to be removed and thrown out of the buffer.

10:56.060 --> 11:01.700
But there are still some characters left in the buffer, which are going to cause the read operation

11:01.790 --> 11:02.330
to fail.

11:03.050 --> 11:05.840
So we need to increase the number a bit.

11:06.350 --> 11:07.130
So maybe,

11:08.380 --> 11:13.090
we should make it 100. But someone might type in 100 characters. What about a thousand, that should

11:13.090 --> 11:13.420
be enough!

11:13.420 --> 11:15.400
No one is going to sit there and type a thousand characters.

11:16.090 --> 11:21.820
Well, maybe not. But maybe someone might paste in the contents of a file by mistake, or their

11:21.850 --> 11:25.210
cat might sit on the keyboard and generate lots of keystrokes.

11:25.840 --> 11:31.360
So the best way to do this would be to use the buffer size, so any data that might be in the buffer

11:31.360 --> 11:32.500
at all, no matter how much.

11:33.580 --> 11:37.840
So we are going to completely discard the entire buffer and start with an empty buffer.

11:40.900 --> 11:46.330
And C++ does actually provide an expression for finding that. It is rather an ugly one.

11:47.890 --> 11:49.600
So it is in numeric limits.

11:49.960 --> 11:52.870
This requires the limits header, so let's include that...

11:59.360 --> 12:05.140
And then it is the numeric limits for the stream size and we want the max.

12:05.390 --> 12:09.200
So this will give us the largest count that the stream size type can hold, which ignore() treats as no limit.

12:09.890 --> 12:13.550
So if we remove all those, then we will have an empty buffer.

12:14.000 --> 12:18.470
Or, if there is a new line before that, it will just discard everything up to that new line.

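NOTE
A sketch of the same retry loop using the numeric limits expression for the count, assuming a hypothetical function name and a return value chosen for illustration.
```cpp
#include <istream>
#include <limits>
#include <sstream>  // std::istringstream, handy for trying this without a keyboard
#include <string>
// Keeps reading until an int is parsed or the input ends; returns the failure count.
int read_int_discarding_line(std::istream& in, int& out) {
    int failures = 0;
    while (!(in >> out)) {
        if (in.eof() || in.bad()) break;  // nothing more can be read
        ++failures;
        in.clear();                       // reset the fail state
        // Discard the rest of the current line, however long it is.
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return failures;
}
```
However long the bad line is, it should now cost exactly one failed attempt, matching the single prompt seen in the video.
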
12:19.520 --> 12:20.450
So let's try that.

12:26.060 --> 12:29.180
So we now have the cat sitting on the keyboard.

12:30.040 --> 12:34.520
(Miaow!) OK, I think that is enough. I am sure you get the idea.

12:35.870 --> 12:36.960
So there you are.

12:36.980 --> 12:41.660
We only get prompted once, so everything in there was thrown away.

12:46.840 --> 12:48.340
Okay, so that is it for this video.

12:48.640 --> 12:49.440
I will see you next time.

12:49.660 --> 12:51.250
Meanwhile, keep coding!
