﻿1
00:00:01,400 --> 00:00:07,070
So pandas is a software library written for the Python programming language for data manipulation and

2
00:00:07,070 --> 00:00:08,000
analysis.

3
00:00:08,000 --> 00:00:14,230
‫This is specifically for data manipulation and analysis.

4
00:00:14,240 --> 00:00:16,490
First we need to import pandas.

5
00:00:19,550 --> 00:00:28,560
So we'll write import pandas as pd. If you are using Anaconda,

6
00:00:28,790 --> 00:00:33,670
then Anaconda has already installed pandas on your system, so you won't need to

7
00:00:33,680 --> 00:00:35,070
install it separately.

8
00:00:35,080 --> 00:00:40,470
You just have to import it into your workspace.

9
00:00:40,480 --> 00:00:47,030
We will be using our customer data CSV file. You can find this file in the resources section of

10
00:00:47,050 --> 00:00:48,550
this video.

11
00:00:48,550 --> 00:00:53,140
So go ahead, download this file and put it in your folder.

12
00:00:55,050 --> 00:01:06,300
We will start by importing our customer CSV file. So we'll write data1, this is our variable, then

13
00:01:06,300 --> 00:01:16,220
we'll write the pandas function to import a CSV, that is pd.read_csv, then we provide

14
00:01:16,220 --> 00:01:17,950
the location of our CSV file.

15
00:01:20,850 --> 00:01:24,240
Remember to change the backslashes into forward slashes.

16
00:01:30,740 --> 00:01:34,060
Then the file name, Customer.csv,

17
00:01:37,720 --> 00:01:42,400
and then header equals 0, since our first row contains the header.
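
The import step described above can be sketched as follows. This is a minimal illustration, not the course's own code: the CSV text, the customer IDs, and the commented file path are hypothetical stand-ins, and the exact header spellings are an assumption based on the column names mentioned in the video.

```python
import io
import pandas as pd

# Hypothetical stand-in for the course's customer CSV file.
csv_text = """Customer ID,Customer Name,Segment,Age,Country,City,State,Postal Code,Region
ID-001,Customer A,Consumer,34,United States,Springfield,Illinois,62701,Central
ID-002,Customer B,Corporate,51,United States,Portland,Oregon,97201,West
"""

# header=0 tells pandas that row 0 of the file holds the column names.
# With a real file you would pass a path with forward slashes instead, e.g.
# pd.read_csv("C:/path/to/Customer.csv", header=0)  # hypothetical path
data1 = pd.read_csv(io.StringIO(csv_text), header=0)

print(data1.columns.tolist())
print(data1.shape)
```

Reading from an in-memory string keeps the sketch runnable without the downloaded file; with the real file only the first argument changes.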

18
00:01:45,030 --> 00:01:56,270
Run this and we'll get our variable. Then we'll run data1.head() to get the first five rows

19
00:01:57,830 --> 00:01:58,640
of our data.

20
00:02:04,500 --> 00:02:12,360
You can see we have customer ID, customer name, segment, age, country, city, state, postal code and region

21
00:02:12,520 --> 00:02:20,970
as our columns. Then we have multiple rows of customer details. First is, of course, customer ID; this is

22
00:02:20,970 --> 00:02:27,690
a unique ID for each customer. The second column is customer name; here we have the customer's full name.

23
00:02:28,350 --> 00:02:35,610
Then there is the segment the customer belongs to, the consumer segment or the corporate segment. Then

24
00:02:35,610 --> 00:02:42,390
we have a column for age, that is the age of the customer, then the country, city, state, postal code and region

25
00:02:42,420 --> 00:02:50,250
of that customer. If you want to grab more rows, you can provide the number in these brackets; by default

26
00:02:50,250 --> 00:02:55,290
it is 5. If you write 10, it will give you ten rows.
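
A small sketch of the default and explicit row counts for head. The twelve-row frame named toy is a hypothetical example built only for this illustration, not the course data.

```python
import pandas as pd

# Hypothetical toy frame with 12 rows; not the course data.
toy = pd.DataFrame({"Age": range(20, 32)})

first_five = toy.head()    # no argument: the default is 5 rows
first_ten = toy.head(10)   # explicit row count

print(len(first_five), len(first_ten))
```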

27
00:02:58,070 --> 00:03:01,750
Now here you are seeing this 0 1 2 3 4.

28
00:03:02,540 --> 00:03:13,260
These are all the index of this table. For example, the zeroth row will be this row. If you want to use

29
00:03:13,420 --> 00:03:21,180
customer ID as the index instead of these numbers, we can set it as the index. We'll read this CSV

30
00:03:21,180 --> 00:03:28,020
file into another DataFrame, that is data2. We'll copy the earlier command

31
00:03:33,420 --> 00:03:41,460
and we'll write another parameter, that is index_col. Here we are providing the location of the index

32
00:03:41,460 --> 00:03:50,870
column, and for our data it is customer ID, which is the zeroth column of our data. Since this is

33
00:03:50,880 --> 00:03:57,950
the first column, its index is 0; that's why we are providing 0, similar to what we provided for header.

34
00:03:58,110 --> 00:04:06,240
So the header is in row 0, so header is 0, and the index column is also at location 0, so index_col

35
00:04:06,240 --> 00:04:13,050
is 0. If we run this, and again if we run head on data2,

36
00:04:17,510 --> 00:04:24,020
you can see that the 0 1 2 3 4 index has gone, and now the customer IDs are our index.
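
The effect of index_col can be sketched by reading the same text twice. The CSV content and the IDs here are hypothetical stand-ins for the course file, reduced to three columns for brevity.

```python
import io
import pandas as pd

# Hypothetical stand-in for the customer file (shortened to three columns).
csv_text = """Customer ID,Customer Name,Age
ID-001,Customer A,34
ID-002,Customer B,51
ID-003,Customer C,27
"""

# Without index_col, pandas creates the default 0, 1, 2, ... index.
data1 = pd.read_csv(io.StringIO(csv_text), header=0)

# index_col=0 uses the first column (Customer ID) as the row labels.
data2 = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)

print(data1.index.tolist())    # default positional labels
print(data2.index.tolist())    # customer IDs used as labels
print(data2.columns.tolist())  # the ID column is no longer a regular column
```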

37
00:04:28,010 --> 00:04:35,570
Indexes are important and we will discuss them in a short while. Now, the head command is used to view a sample

38
00:04:35,570 --> 00:04:44,240
of your data. If you want to know the statistics of your data, you write data1.describe,

39
00:04:46,850 --> 00:04:53,830
that is, dot describe with parentheses, and then run this.

40
00:04:57,750 --> 00:05:00,970
So there are only two numeric variables in our data.

41
00:05:01,440 --> 00:05:09,190
That's why we are only getting two columns here: first is age and second is postal code.

42
00:05:09,350 --> 00:05:12,210
Here you can see the total count of values,

43
00:05:12,250 --> 00:05:17,310
the mean of age, the standard deviation of age, minimum age and maximum age.

44
00:05:17,360 --> 00:05:18,810
These are the percentile values.

45
00:05:18,810 --> 00:05:26,690
This is the 25th percentile value. So if you arrange all the ages in ascending order, the value present at the

46
00:05:26,680 --> 00:05:31,100
25th percentile of the data is this value.

47
00:05:31,940 --> 00:05:39,380
Similarly, this is the 50th percentile, also known as the median value, this is the 75th percentile value of age,

48
00:05:39,750 --> 00:05:44,210
and this is the maximum value of age.
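
A minimal sketch of describe on a frame with one text column and two numeric columns. The frame named toy and all of its values are hypothetical, chosen so the percentiles are easy to check by hand.

```python
import pandas as pd

# Hypothetical data: one text column and two numeric columns.
toy = pd.DataFrame({
    "Customer Name": ["A", "B", "C", "D", "E"],
    "Age": [20, 30, 40, 50, 60],
    "Postal Code": [10001, 20002, 30003, 40004, 50005],
})

# By default describe() summarises only the numeric columns.
stats = toy.describe()

print(stats.columns.tolist())  # the text column is left out
print(stats.index.tolist())    # count, mean, std, min, 25%, 50%, 75%, max
print(stats.loc["50%", "Age"], toy["Age"].median())  # 50% is the median
```

Note that a postal code is numeric only by storage type, so its mean and percentiles are reported even though they carry little meaning.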

49
00:05:44,300 --> 00:05:51,500
We will discuss these in univariate analysis, which we will be covering in a later part of this course.

50
00:05:52,850 --> 00:05:56,830
There are two ways to index our DataFrame.

51
00:05:57,740 --> 00:06:04,400
As we discussed a little while ago, while importing this data we can provide an index column. For data1,

52
00:06:04,400 --> 00:06:11,490
we did not provide any index column, and for data2 our index column is customer ID.

53
00:06:12,140 --> 00:06:20,660
So if you want to view the first row of our data, we have to use the loc or iloc keyword.

54
00:06:20,660 --> 00:06:32,900
If we use data1.iloc and then we provide 0, what iloc will do is grab the data that

55
00:06:32,900 --> 00:06:43,620
is present at position 0 of our DataFrame, so our output is the same as the first row of our

56
00:06:43,620 --> 00:06:45,170
DataFrame.

57
00:06:45,360 --> 00:06:52,820
If we want to use the index column which we defined while creating our DataFrame, we have to use loc,

58
00:06:57,770 --> 00:07:06,110
and in the brackets we write the customer ID, CD-12500.

59
00:07:09,060 --> 00:07:14,210
In data2 we defined our index column as customer ID.

60
00:07:14,250 --> 00:07:22,860
So now we can use the loc keyword to get the data of this customer ID. If we run this,

61
00:07:23,100 --> 00:07:28,560
you can see we are getting all the details of our customer except the customer ID, since

62
00:07:28,710 --> 00:07:30,500
this is the index column.

63
00:07:31,060 --> 00:07:39,900
Similarly, if I don't know this ID and I just want to grab the first customer, I can use iloc.

64
00:07:51,350 --> 00:07:58,410
So I am getting the same result. Here I was using iloc. With iloc you have to use the serial number,

65
00:07:58,680 --> 00:08:00,750
0 and so on.

66
00:08:00,750 --> 00:08:05,910
With loc you can use the index column that you provided.

67
00:08:06,180 --> 00:08:14,820
So if you know the position you can use iloc, and if you know the value you can use loc.
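
The difference between position-based and label-based lookup can be sketched as below. The frame is built by hand with hypothetical IDs and names rather than read from the course file; only the idea of a Customer ID index comes from the video.

```python
import pandas as pd

# Hypothetical frame whose index holds customer IDs, like data2 in the video.
data2 = pd.DataFrame(
    {"Customer Name": ["Customer A", "Customer B"], "Age": [34, 51]},
    index=pd.Index(["ID-001", "ID-002"], name="Customer ID"),
)

by_position = data2.iloc[0]      # first row, whatever its label is
by_label = data2.loc["ID-001"]   # row whose index label is "ID-001"

print(by_position.equals(by_label))  # both lookups return the same row
print(by_label.index.tolist())       # the ID itself is not among the fields
```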

68
00:08:17,420 --> 00:08:23,540
Just like in a list, in a DataFrame you can also mention multiple values using the colon operator.

69
00:08:23,570 --> 00:08:36,190
So for example, if I write data2.iloc[0:5],

70
00:08:36,570 --> 00:08:44,730
this will give me the data of the first five rows, where the positional index is 0, 1, 2, 3 and 4.

71
00:08:44,760 --> 00:08:47,850
Remember, 5 is excluded from this result.

72
00:08:48,370 --> 00:09:00,740
So I'm getting the data of these five customers. You can use steps as well. If I try to run this, I'm getting

73
00:09:00,740 --> 00:09:04,940
only three rows, since I'm using the steps.
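
A short sketch of slicing with iloc, with and without a step. The six-row frame is hypothetical, and the step of 2 is an assumption: the video does not state the step value, but a step of 2 over 0:5 yields the three rows mentioned.

```python
import pandas as pd

# Hypothetical six-row frame with the default 0..5 index.
toy = pd.DataFrame({"Age": [34, 51, 27, 45, 62, 39]})

first_five = toy.iloc[0:5]     # positions 0, 1, 2, 3, 4 (5 is excluded)
every_other = toy.iloc[0:5:2]  # assumed step of 2: positions 0, 2, 4

print(first_five.index.tolist())
print(every_other.index.tolist())
```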

74
00:09:05,870 --> 00:09:14,340
That's all for now. We will be using pandas a lot more while doing our work, and we'll discuss new topics then

75
00:09:14,370 --> 00:09:14,780
and there.

