In order to make predictions, we need to choose a classification model, and we're going to use the decision tree classifier that we looked at earlier. Back in your Python file, let's add to the program over the next few lines. First, we'll import scikit-learn's tree module, which includes all of the decision tree models. So we'll type from sklearn import tree.

Next, we'll create a decision tree classifier and assign it to a variable so we can work with it. We'll type classifier as the name of our variable, and we'll set that to tree.DecisionTreeClassifier. Since we're calling it, add parentheses at the end.

Now we need to actually build the decision tree through which each new example will flow. This decision tree can be built by feeding it both the training examples and the target labels using the fit function, like this. So, again, we'll use our classifier variable and type classifier.fit, and inside of this function we'll pass the iris.data and then the iris.target, or the labels.

So, just to review, we've created a decision tree model, and now we're actually building the decision tree using its fit function, which takes a set of examples and the target labels. For more on the fit function, check out the notes associated with this video.

At this point the data is loaded and we've built a decision tree based on that data. Now we can feed a new example into the top of that decision tree, and it will flow through each branching decision like a flowchart until it finally reaches its target label.

Now, finally, comes the part we've been working toward: making predictions. We can do this using the predict function on the decision tree classifier, like this. First, this is something we'll want to print out, so inside of the print function we'll type classifier.predict and open and close some parentheses. Wrapping this in a print function just lets us see the outcome of the code when we run it. Inside of the predict parentheses, create two sets of nested square brackets: there's the first pair, and then inside of those square brackets we'll make another pair of opening and closing square brackets, as in the sketch below.
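Here's a rough sketch of the full program at this point. The two lines that load the data set and print the target names are assumed to come from the earlier video (using scikit-learn's load_iris); the rest is what we've just typed, and the feature values in the predict call are the first example from the data set, which we fill in below.

from sklearn.datasets import load_iris
from sklearn import tree

# Load the iris data set and print the label names (from the earlier video).
iris = load_iris()
print(iris.target_names)

# Create the decision tree classifier.
classifier = tree.DecisionTreeClassifier()

# Build the tree from the training examples and their target labels.
classifier.fit(iris.data, iris.target)

# Predict the label for a single new example
# (sepal length, sepal width, petal length, petal width).
print(classifier.predict([[5.1, 3.5, 1.4, 0.2]]))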
The outermost set of square brackets is an array of our examples, and the innermost set of square brackets is where we'll put the values of the features for a single example. In other words, we could predict multiple examples at a time, but we're just sticking with one for now.

Let's start out by testing the model to make sure it's working correctly. We can do that by just putting in an example from the data set. From the Wikipedia page, I'll type in the first example from the data set, which happens to be a setosa. So inside of the innermost square brackets, I'll type 5.1, 3.5, 1.4, and 0.2. And if we go back to the Wikipedia page to look at that again, you can see that this first example should be a setosa.

Now let's save it, and back in the terminal we'll run the code. I'll just hit the up arrow to get the previous command and hit Enter. The output should start with the names of the flowers, because we still have iris.target_names printing first. And then next we have this index, 0, which is exactly what we want. Remember, arrays start counting indices at 0, and in this case the index is referring to these labels: a setosa is 0, a versicolor is 1, and a virginica is 2. So 0 is indeed a setosa, and we know that the model is predicting this correctly.

Now let's mess with this data a little bit. In the data set, most of the setosas have a petal width of about 0.2, with examples ranging from 0.1 up to 0.6. However, versicolor petal widths range from 1.0 to 1.8, and virginicas have petal widths from 1.4 to 2.5. So in our example, let's change this last feature to something like 1.5 instead. That would be well above normal for a setosa, but it would fall within the versicolor range and just barely make the cut for virginicas.

Now save the code, and let's run it again. You should see either a 0 or a 1. If you hit the up arrow and hit Enter again and again to execute the same code, you should see both numbers appearing. That's because the decision tree classifier makes a random choice when deciding which feature gives the best comparison, and because we're always working with probabilistic behavior in machine learning, we won't necessarily get the same result on every run.
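For reference, the modified prediction we just described would look something like this (only the last feature value changes; the rest of the program stays the same):

# Same example as before, but with the petal width bumped up to 1.5.
print(classifier.predict([[5.1, 3.5, 1.4, 1.5]]))
# Depending on the run, this may print [0] (setosa) or [1] (versicolor).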
This can indicate a low level of confidence, which makes sense. The other values in our new example don't line up with any other example in the data set, and we've pushed the petal width enough that the classifier can't really draw a confident conclusion; it's somewhere between a setosa and a versicolor.

That's it for coding. In our next video, we'll review some of the big ideas we've learned.