1 00:00:00,002 --> 00:00:02,003 - [Instructor] So there are a couple of different ways 2 00:00:02,003 --> 00:00:06,009 to read the content of a CSV file into a Python program. 3 00:00:06,009 --> 00:00:09,005 In this example, we'll see how to read a CSV file 4 00:00:09,005 --> 00:00:12,006 into an array of arrays. 5 00:00:12,006 --> 00:00:15,007 And to do this, we'll need to use the CSV module 6 00:00:15,007 --> 00:00:18,004 in the Python Standard Library, 7 00:00:18,004 --> 00:00:20,009 and I've pulled that up here at this link. 8 00:00:20,009 --> 00:00:23,002 You can find this in the Python documentation, 9 00:00:23,002 --> 00:00:26,004 and I would suggest leaving this open in a browser window 10 00:00:26,004 --> 00:00:30,002 so you can refer back to it as we go through the example. 11 00:00:30,002 --> 00:00:33,003 Over in my example code, 12 00:00:33,003 --> 00:00:36,002 let's switch over to Visual Studio Code, 13 00:00:36,002 --> 00:00:39,008 I'm going to open up read_csv_array.py. 14 00:00:39,008 --> 00:00:43,004 You can see that I've imported the CSV module 15 00:00:43,004 --> 00:00:46,008 so we can make use of the code that it contains. 16 00:00:46,008 --> 00:00:49,005 For this example, we're going to use this file right here, 17 00:00:49,005 --> 00:00:54,003 inventory.csv as our sample data to read. 18 00:00:54,003 --> 00:00:57,006 It contains information about a variety of things 19 00:00:57,006 --> 00:00:59,004 you might find in a supermarket. 20 00:00:59,004 --> 00:01:02,006 The item name, its category, quantity in stock, 21 00:01:02,006 --> 00:01:07,002 and the wholesale and consumer prices for each item. 22 00:01:07,002 --> 00:01:10,001 I'm going to create a function to read the file 23 00:01:10,001 --> 00:01:12,004 and return an array of data. 24 00:01:12,004 --> 00:01:15,001 So let's go back to the code. 25 00:01:15,001 --> 00:01:16,002 All right, so this is the array 26 00:01:16,002 --> 00:01:18,003 that I'm going to return right here. 
27 00:01:18,003 --> 00:01:19,008 So let's fill out this function. 28 00:01:19,008 --> 00:01:22,009 First, we open the file in read mode. 29 00:01:22,009 --> 00:01:24,009 So I'm going to write with open, 30 00:01:24,009 --> 00:01:28,008 and then I'm going to open the file name, 31 00:01:28,008 --> 00:01:34,004 and we're going to open that in read mode as, 32 00:01:34,004 --> 00:01:36,007 and I'll call that CSV file. 33 00:01:36,007 --> 00:01:38,003 So file name is the argument 34 00:01:38,003 --> 00:01:40,008 that's passed into the function right here. 35 00:01:40,008 --> 00:01:44,001 Then we need to create a new reader object 36 00:01:44,001 --> 00:01:47,006 using the CSV module to actually read the data. 37 00:01:47,006 --> 00:01:50,007 So that reader will then supply each row for us to read 38 00:01:50,007 --> 00:01:54,008 and add to the end of the data variable. 39 00:01:54,008 --> 00:02:01,007 So I'm going to create a reader from the CSV module's reader, 40 00:02:01,007 --> 00:02:06,002 and I'm going to pass in the CSV file that I just created. 41 00:02:06,002 --> 00:02:10,001 Now we need to loop over each row in the reader. 42 00:02:10,001 --> 00:02:13,000 So for each row of data in the reader, 43 00:02:13,000 --> 00:02:19,003 I'm going to append that row onto my data array. 44 00:02:19,003 --> 00:02:22,001 And then finally, we just return the result. 45 00:02:22,001 --> 00:02:24,005 So I'll return the data. 46 00:02:24,005 --> 00:02:27,003 Alright, so when the data has been read, 47 00:02:27,003 --> 00:02:29,004 we can inspect the content. 48 00:02:29,004 --> 00:02:32,007 So let's print out some of the data content. 49 00:02:32,007 --> 00:02:35,004 And remember, this data variable itself 50 00:02:35,004 --> 00:02:39,003 is a list that contains other lists. 51 00:02:39,003 --> 00:02:43,001 So each list in the array is a single row of data. 52 00:02:43,001 --> 00:02:45,007 So let's print out some interesting things.
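For reference, the function just described might look roughly like the sketch below. The transcript doesn't show the exact identifiers, so read_csv_to_array, filename, and csv_file are assumed names. Passing newline="" to open() isn't mentioned in the video; it is what the csv module documentation recommends for file objects handed to a reader.

```python
import csv


def read_csv_to_array(filename):
    """Read a CSV file and return its rows as a list of lists."""
    # This is the array that gets returned; each element is one row.
    data = []

    # Open the file in read mode. newline="" is an addition here,
    # following the csv module documentation's recommendation.
    with open(filename, "r", newline="") as csv_file:
        # The reader supplies each row as a list of strings.
        reader = csv.reader(csv_file)

        # Append every row, header included, to the end of data.
        for row in reader:
            data.append(row)

    return data
```

Because the reader is consumed inside the with block, the file is closed by the time the function returns the data.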
53 00:02:45,007 --> 00:02:50,001 First, let's print out how many items there are in the list. 54 00:02:50,001 --> 00:02:54,004 So I'll call the length function on the inventory data. 55 00:02:54,004 --> 00:02:57,001 And then let's print out the first row, 56 00:02:57,001 --> 00:03:03,007 so I'll print out inventory data at index zero. 57 00:03:03,007 --> 00:03:07,008 Let's also print out the next one as well. 58 00:03:07,008 --> 00:03:09,007 So we'll print out row one. 59 00:03:09,007 --> 00:03:12,007 So row zero will be the header information. 60 00:03:12,007 --> 00:03:16,005 If you refer back to the file, this is row zero right here. 61 00:03:16,005 --> 00:03:20,002 So it's going to be the headers for the individual columns, 62 00:03:20,002 --> 00:03:24,007 and then row one should be this apple right here. 63 00:03:24,007 --> 00:03:27,002 And then let's print out some detailed data. 64 00:03:27,002 --> 00:03:30,004 Let's print out inventory data 65 00:03:30,004 --> 00:03:33,009 in row one at index zero. 66 00:03:33,009 --> 00:03:36,003 So that should be the word "Apple." 67 00:03:36,003 --> 00:03:41,004 And then let's also print out row one at index two, 68 00:03:41,004 --> 00:03:42,007 and that should be, let's see, 69 00:03:42,007 --> 00:03:45,001 index two is going to be the quantity. 70 00:03:45,001 --> 00:03:48,008 Alright, so let's go ahead and save this. 71 00:03:48,008 --> 00:03:50,001 And I'm going to run this. 72 00:03:50,001 --> 00:03:51,009 What I'm going to do is I'm just going to right-click, 73 00:03:51,009 --> 00:03:54,003 and because I have that Python extension installed, 74 00:03:54,003 --> 00:03:58,000 I'm going to choose Run Python File in Terminal. 75 00:03:58,000 --> 00:03:59,008 And sure enough, when we do this, 76 00:03:59,008 --> 00:04:02,007 you can see that the result shows 51 rows.
77 00:04:02,007 --> 00:04:04,005 So there's 50 rows of actual data, 78 00:04:04,005 --> 00:04:07,007 and when you include the header row, there's 51 total rows. 79 00:04:07,007 --> 00:04:09,000 And we can see sure enough 80 00:04:09,000 --> 00:04:11,002 that the first row is the column headers 81 00:04:11,002 --> 00:04:15,004 and the second row is the first row of data, 82 00:04:15,004 --> 00:04:19,002 and then we can access the individual data items. 83 00:04:19,002 --> 00:04:22,000 All right, first example, done and dusted.
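To try the inspection steps without the course files, here is a small self-contained sketch. The sample rows, column names, and the names read_csv_to_array and inventory_sample.csv are made up for illustration; the real inventory.csv has 50 data rows plus the header, and its exact values aren't shown in the transcript.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for inventory.csv: a header row plus two data rows.
SAMPLE = (
    "Item,Category,Quantity,Wholesale,Consumer\n"
    "Apple,Produce,120,0.40,0.99\n"
    "Milk,Dairy,35,1.10,2.49\n"
)


def read_csv_to_array(filename):
    """Read a CSV file and return its rows as a list of lists."""
    data = []
    with open(filename, "r", newline="") as csv_file:
        reader = csv.reader(csv_file)
        for row in reader:
            data.append(row)
    return data


# Write the sample to a temporary file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "inventory_sample.csv")
    with open(path, "w", newline="") as f:
        f.write(SAMPLE)
    inventory_data = read_csv_to_array(path)

print(len(inventory_data))    # total rows, header included: 3 for this sample
print(inventory_data[0])      # row zero: the column headers
print(inventory_data[1])      # row one: the first row of actual data
print(inventory_data[1][0])   # Apple
print(inventory_data[1][2])   # 120 (still a string at this point)

# csv.reader returns every field as a string, so convert before doing math.
quantity = int(inventory_data[1][2])
print(quantity + 1)           # 121
```

One thing worth keeping in mind: every value the reader hands back is a string, so a quantity or a price has to be converted with int() or float() before it can be used in a calculation.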