In order to make predictions, we need to choose a classification model, and we're going to use the decision tree classifier that we looked at earlier. Back in your Python file, let's add to the program over the next few lines. First, we'll import scikit-learn's tree module, which includes all of the decision tree models. So we'll type from sklearn import tree.

Next, we'll create a decision tree classifier and assign it to a variable so we can work with it. We'll type classifier as the name of our variable, and we'll set that to tree.DecisionTreeClassifier. Since we're calling it, add parentheses at the end.

Now we need to actually build the decision tree through which each new example will flow. This decision tree can be built by feeding it both the training examples and the target labels using the fit function, like this. So, again, we'll use our classifier variable and type classifier.fit, and inside of this function we'll pass the iris.data and then the iris.target, or the labels.

So, just to review, we've created a decision tree model, and now we're actually building the decision tree using its fit function, which takes a set of examples and the target labels. For more on the fit function, check out the notes associated with this video.

At this point the data is loaded and we've built a decision tree based on that data. Now we can feed a new example into the top of that decision tree, and it will flow through each branching decision like a flowchart until it finally reaches its target label.

Now, finally, comes the part we've been working toward: making predictions. We can do this using the predict function on the decision tree classifier, like this. First, this is something we'll want to print out, so inside of the print function we'll type classifier.predict and open and close some parentheses. Wrapping this in a print function just lets us see the outcome of the code when we run it. Inside of the predict parentheses, create two sets of nested square brackets: there's the first pair, and then inside of those square brackets we'll make another pair of opening and closing square brackets, as in the sketch below.
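Here's a rough sketch of the full program at this point. The two lines that load the data set and print the target names are assumed to come from the earlier video (using scikit-learn's load_iris); the rest is what we've just typed, and the feature values in the predict call are the first example from the data set, which we fill in below.

from sklearn.datasets import load_iris
from sklearn import tree

# Load the iris data set and print the label names (from the earlier video).
iris = load_iris()
print(iris.target_names)

# Create the decision tree classifier.
classifier = tree.DecisionTreeClassifier()

# Build the tree from the training examples and their target labels.
classifier.fit(iris.data, iris.target)

# Predict the label for a single new example
# (sepal length, sepal width, petal length, petal width).
print(classifier.predict([[5.1, 3.5, 1.4, 0.2]]))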
The outermost set of square brackets is an array of our examples, and the innermost set of square brackets is where we'll put the values of the features for a single example. In other words, we could predict multiple examples at a time, but we're just sticking with one for now.

Let's start out by testing the model to make sure it's working correctly. We can do that by just putting in an example from the data set. From the Wikipedia page, I'll type in the first example from the data set, which happens to be a setosa. So inside of the innermost square brackets, I'll type 5.1, 3.5, 1.4, and 0.2. And if we go back to the Wikipedia page to look at that again, you can see that this first example should be a setosa.

Now let's save it, and back in the terminal we'll run the code. I'll just hit the up arrow to get the previous command and hit Enter. The output should start with the names of the flowers, because we still have iris.target_names printing first. And then next we have this index, 0, which is exactly what we want. Remember, arrays start counting indices at 0, and in this case the index is referring to these labels: a setosa is 0, a versicolor is 1, and a virginica is 2. So 0 is indeed a setosa, and we know that the model is predicting this correctly.

Now let's mess with this data a little bit. In the data set, most of the setosas have a petal width of about 0.2, with examples ranging from 0.1 up to 0.6. However, versicolor petal widths range from 1.0 to 1.8, and virginicas have petal widths from 1.4 to 2.5. So in our example, let's change this last feature to something like 1.5 instead. That would be well above normal for a setosa, but it would fall within the versicolor range and just barely make the cut for virginicas.

Now save the code, and let's run it again. You should see either a 0 or a 1. If you hit the up arrow and hit Enter again and again to execute the same code, you should see both numbers appearing. That's because the decision tree classifier makes a random choice when deciding which feature gives the best comparison, and because we're always working with probabilistic behavior in machine learning, we won't necessarily get the same result on every run.
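For reference, the modified prediction we just described would look something like this (only the last feature value changes; the rest of the program stays the same):

# Same example as before, but with the petal width bumped up to 1.5.
print(classifier.predict([[5.1, 3.5, 1.4, 1.5]]))
# Depending on the run, this may print [0] (setosa) or [1] (versicolor).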
This can indicate a low level of confidence, which makes sense. The other values in our new example don't line up with any other example in the data set, and we've pushed the petal width enough that the classifier can't really draw a confident conclusion; it's somewhere between a setosa and a versicolor.

That's it for coding. In our next video, we'll review some of the big ideas we've learned.