You can create an array of booleans and then use that to index into your array. Let's use this to filter our values.
My Notes for Indexing
## Creation
* You can create a random but bounded grouping of values using the `np.random` package.
* `RandomState` lets you seed your randomness in a way that is repeatable.
* You can append a row in a couple of ways:
  * You can use the `np.append` function. Make sure the new row is the same shape.
  * You can create/reassign a new array by including the existing array as part of the iterable in creation.
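
A minimal sketch of the creation notes above. The seed, the value range, and the `round_*` names are made up for illustration; they are not the values used in the video.

```python
import numpy as np

# Seeding RandomState makes the "random" values repeatable.
# The seed (42) and the 60-180 range are arbitrary choices for this sketch.
rand = np.random.RandomState(42)
fake_log = rand.randint(60, 180, size=100)

# A second generator with the same seed produces the same values.
same_rand = np.random.RandomState(42)
same_log = same_rand.randint(60, 180, size=100)
print(np.array_equal(fake_log, same_log))  # True

# Build a 2-D array by including existing arrays in the iterable.
round_one = np.zeros(3, np.uint16)
round_two = np.array([90, 45, 120], np.uint16)
study_minutes = np.array([round_one, round_two])
print(study_minutes.shape)  # (2, 3)

# Append a row with np.append. Wrapping the row in a list makes it
# 2-D (1 x 3) so it lines up with the existing rows along axis 0.
round_three = np.array([30, 0, 75], np.uint16)
study_minutes = np.append(study_minutes, [round_three], axis=0)
print(study_minutes.shape)  # (3, 3)
```

Both approaches return a new array that gets reassigned to the same name, since an existing array cannot change size in place.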
## Indexing
* You can use an indexing shortcut by separating dimensions with a comma.
* You can index using a `list` or `np.array`. Values will be pulled out at that specific index. This is known as fancy indexing.
* Resulting array shape matches the index array layout. Be careful to distinguish between the tuple shortcut and fancy indexing.
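
A small sketch of the difference between the comma shortcut and fancy indexing, using a made-up 5 x 5 `grid` (not an array from the video).

```python
import numpy as np

# Made-up 5 x 5 array holding the values 0..24.
grid = np.arange(25).reshape(5, 5)

# Comma shortcut: a tuple index, one element at row 3, column 4.
print(grid[3, 4])   # 19
print(grid[3][4])   # 19, the same element the long way

# Fancy indexing: a list index pulls out whole rows 3 and 4.
rows = grid[[3, 4]]
print(rows.shape)   # (2, 5)

# The result takes on the layout of the index array.
flat = np.arange(10, 20)  # values 10..19
picks = flat[np.array([[0, 1], [8, 9]])]
print(picks.tolist())  # [[10, 11], [18, 19]]
```

Note how `grid[3, 4]` and `grid[[3, 4]]` differ only by a pair of brackets but return very different things.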
All right, before we get started here,
0:00
I thought I'd share my
notes as a quick refresher.
0:02
We talked about creation and we saw a new
way to build a random grouping of numbers.
0:05
And we used RandomState to let us seed the
randomness in a way that was repeatable.
0:10
I have the same random values as you do.
0:14
That is super handy.
0:16
And we also learned that you can
append a row in a couple of ways.
0:18
There's even more that
we haven't seen here.
0:21
You can use the np.append method.
0:23
You need to make sure
that it's the same shape.
0:26
Remember we did that little hack
where we wrapped it in a list.
0:28
And you can create and
0:31
reassign a new array by including existing
arrays as part of the iterable right?
0:33
So you can throw the new
array in there and
0:37
reassign it cuz you can't change the size.
0:39
And we looked at indexing.
0:42
And there's that nice indexing shortcut
for the multidimensional array,
0:43
remember where you can use the comma.
0:46
So you can say like 3,4 and
that's really row three, column four.
0:48
And instead of having to use the hard
brackets you can use the commas and
0:53
it creates a tuple automatically.
0:55
And you can also index using a list or
another np array and the values will be
0:57
pulled out in that specific index,
and that's known as fancy indexing.
1:02
The resulting array comes back and
it's the same shape as what you asked for.
1:06
But it's very important to
remember to look for lists or
1:10
arrays versus just using
a tuple with a comma.
1:13
All right, so now where were we?
1:18
All right,
we wanted to look at our study log and
1:20
find sessions that were just about an hour,
but not quite.
1:23
So let's get back down into
where we did all that work, so.
1:28
So here's the study minutes,
I'm gonna get rid of this last one here.
1:31
To delete a cell, Escape, and then D, D.
1:35
Here we go, so we have our study
minutes array is all written out.
1:38
And remember we were using this fake_log.
1:41
So let's start with that fake_log,
1:44
cuz that definitely has some
values that we know are under 60.
1:45
And remember,
that's what we're looking for
1:50
because they don't count
towards the challenge.
1:51
The concept being there, that is if we
saw those ones that almost made it,
1:53
maybe they'd provide a little inspiration
for us to stick with it for that next day.
1:58
So the np array object is pretty powerful.
2:02
Just about every comparison
operator has been overridden.
2:07
So let's take that fake_log object that
we're using, cuz it's one-dimensional.
2:11
So it's a one-dimensional array, and
it's filled with 100 random values.
2:15
So we'll do fake_log, and check this out.
2:19
I am looking for
values that are less than 60,
2:25
because you know there
are 60 minutes in an hour.
2:27
So I can just write that.
2:29
So fake_log < 60.
2:31
And what will happen is,
it's not defined,
2:33
this might happen to you sometimes.
2:38
So this is good,
I'm glad that this happened.
2:40
So what we can do is we can go ahead and
run Kernel, and say Restart & Run All.
2:41
And that popped up our help
cuz we left the help up there.
2:52
So I'm gonna go ahead and close this.
2:54
And so
what happened is we've got this fake_log.
2:55
And what will happen is you
remember that we have these values.
2:59
So the first one that's there is
this fourth, so the fourth value.
3:03
So if we go False, False, False, True and
then looks like again at the eighth there,
3:07
so there's some more false,
false, false, true.
3:12
So what's happening is that it's comparing
every single one of these values and
3:14
it's showing us true where it is.
3:19
Every value is represented.
3:21
And any place that we see a True,
3:23
it means that it is true
that it's less than 60.
3:25
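
Here is a minimal sketch of that comparison. The eight values below are a made-up stand-in for `fake_log`, which in the video holds 100 random values.

```python
import numpy as np

# Made-up stand-in for fake_log; the real one has 100 random values.
fake_log = np.array([75, 120, 90, 45, 60, 130, 110, 58])

# The comparison runs against every element and returns Booleans.
under_an_hour = fake_log < 60
print(under_an_hour.tolist())
# [False, False, False, True, False, False, False, True]

# Every value is represented, so the shapes match.
print(under_an_hour.shape == fake_log.shape)  # True
print(under_an_hour.dtype)                    # bool
```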
And that probably doesn't
seem all that handy.
3:29
Well, that is until you find out
that you can do fancy indexing with
3:32
a Boolean array.
3:36
The way that it works is that as long
as the Boolean array lines up with your
3:38
other array,
any value where True exists will be kept.
3:43
So, here check this out.
3:46
So this is what we want, right?
3:48
We want to say,
anything from the fake_log,
3:49
we will use that Boolean
array as a fancy index.
3:55
There we go, we pulled out
every value that was True.
4:00
That's exactly what we are looking for,
right, these are all not quite 60.
4:05
Pretty cool, right?
4:10
We did that filtering all without a loop.
4:11
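
A short sketch of using the Boolean array as an index, again with a made-up stand-in for `fake_log`.

```python
import numpy as np

# Made-up stand-in for fake_log.
fake_log = np.array([75, 120, 90, 45, 60, 130, 110, 58])

# Step by step: build the Boolean array, then index with it.
mask = fake_log < 60
print(fake_log[mask].tolist())  # [45, 58]

# Or all in one expression, as in the video.
not_quite = fake_log[fake_log < 60]
print(not_quite.tolist())       # [45, 58]
```

Only the positions where the mask is True are kept, and the original order is preserved.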
You could totally accomplish this same
thing by saying something like a list
4:14
comprehension, or even something similar
like this, this really simple loop.
4:19
So say results equals this,
let's iterate through each of the values.
4:22
So for value in fake_log, if, here we go.
4:26
If the value is less than 60, then
we're gonna say results.append(value).
4:31
And then just to get back exactly
the same thing we'll just use it.
4:38
We'll say np.array(results), right?
4:41
So there's a loop that we had to write,
and obviously we got back the same thing.
4:43
But using a Boolean array index is orders
of magnitude faster than this for loop.
4:49
And look at the code difference too.
4:55
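
To see the code difference side by side, here is a sketch of the loop, a list comprehension, and the Boolean index. The seed, value range, and function names are assumptions for illustration. The timings depend on your machine, so no expected numbers are shown.

```python
import timeit

import numpy as np

rand = np.random.RandomState(42)  # arbitrary seed for this sketch
fake_log = rand.randint(30, 180, size=100)


def with_loop(values):
    """Filter with an explicit for loop, as written out in the video."""
    results = []
    for value in values:
        if value < 60:
            results.append(value)
    return np.array(results)


def with_comprehension(values):
    """Same filter as a list comprehension."""
    return np.array([value for value in values if value < 60])


def with_mask(values):
    """Same filter using a Boolean array index."""
    return values[values < 60]


# All three give back the same values.
print(np.array_equal(with_loop(fake_log), with_mask(fake_log)))           # True
print(np.array_equal(with_comprehension(fake_log), with_mask(fake_log)))  # True

# Rough timing on a larger array; results vary by machine.
big_log = rand.randint(30, 180, size=100_000)
loop_time = timeit.timeit(lambda: with_loop(big_log), number=5)
mask_time = timeit.timeit(lambda: with_mask(big_log), number=5)
print(f"loop: {loop_time:.4f}s  mask: {mask_time:.4f}s")
```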
Something you might be wondering is what
happens with multidimensional arrays,
4:57
like our study_minutes array.
5:02
Well the good news is, it just works.
5:04
So if we say study minutes less than 60,
you'll see back
5:07
that we get an array, a Boolean array that
is of the exact same shape as our array.
5:12
So that's 3 by 100.
5:17
And of course,
we can use that array as an index.
5:21
So let's do that as well, so
we can say study_minutes,
5:26
where the study_minutes is less than 60.
5:30
Boom, now notice that we're
returned a one dimensional array.
5:35
Not our original two-dimensional array,
it's all of the values that match.
5:42
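
A sketch of the same idea in two dimensions. The 3 x 4 array below is a made-up stand-in for the 3 x 100 `study_minutes` array.

```python
import numpy as np

# Made-up 3 x 4 stand-in for the 3 x 100 study_minutes array.
study_minutes = np.array([
    [150,  0, 45,  60],
    [ 90, 58,  0, 120],
    [  0, 30, 75,  10],
], dtype=np.uint16)

# The Boolean array has exactly the same shape as the original.
mask = study_minutes < 60
print(mask.shape)  # (3, 4)

# Indexing with it returns a flat, one-dimensional array of matches.
short_days = study_minutes[mask]
print(short_days.ndim)      # 1
print(short_days.tolist())  # [0, 45, 58, 0, 0, 30, 10]
```

The matches come back row by row, in the order they appear in the original array.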
Now we could rewrite this as a nested for
5:46
loop of the same type
that we did before, right.
5:48
Like we could loop through each round and
then loop through each day and
5:51
add to our results.
5:55
But we don't need to do that because
this is done all for us without a loop.
5:56
That's kind of gross, that's
a bunch of zeros, right?
6:02
If we're looking to motivate ourselves,
we really don't wanna see these zeros.
6:06
What we really wanna see is anything
that's less than 60 minutes and
6:10
greater than 0.
6:16
That gets minutes from days where
we worked a little bit at least.
6:17
So we want to make two
Boolean index arrays.
6:21
Like we wanna make this
study_minutes array, this one.
6:24
We wanna make that array,
the study_minutes where it's less than 60.
6:28
And we also wanna have another
one where the index array is
6:32
study_minutes greater than 0.
6:36
And then we actually want to have the
results where it's a combination of those
6:39
anded together.
6:43
You could actually compare arrays
together element by element,
6:44
which is what we want to do.
6:48
So, I'm gonna come back here.
6:50
Let's just manually, we'll go ahead and
6:51
we'll manually create an array,
a Boolean array of False, True, True.
6:54
And to compare, we use the bitwise
operator for and, the ampersand.
7:00
Now this is not the and
keyword, it's an ampersand.
7:07
Now a common mistake is [LAUGH] to
forget and use the and keyword, and
7:11
we'll explore what happens
with that in a bit.
7:14
And then I'll create another Boolean array
that we can compare it to, so np.array,
7:17
and we'll put in True, False, True.
7:21
So what happens is we get a brand new
array with each element anded together.
7:27
So remember,
when you're checking Boolean logic,
7:33
both sides need to be true
to be considered true.
7:35
So, looking here we have False and
True, and that's False,
7:39
and then we have True and False.
7:44
And that of course is False as well
because they're not both True, and
7:46
then we have True and
True, definitely True.
7:50
So if we go ahead and we run this,
we'll see that we get back a single
7:54
array with the values anded together,
False, False, True, just like we saw.
7:59
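
The manual example above, written out as a sketch. The names `a` and `b` are just for illustration.

```python
import numpy as np

a = np.array([False, True, True])
b = np.array([True, False, True])

# & compares element by element: both sides must be True.
print((a & b).tolist())  # [False, False, True]

# | is the element-by-element "or".
print((a | b).tolist())  # [True, True, True]

# np.logical_and gives the same result as & for Boolean arrays.
print(np.logical_and(a, b).tolist())  # [False, False, True]
```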
So we could use this result as
a Boolean index array, right?
8:04
Do you see how we can just build
the Boolean index array together?
8:10
Values that we want to chain together with
all of our other conditions in a series of
8:13
ands and ors?
8:17
Before we use it, I do wanna show you what
happens if you forget to use the bitwise
8:18
and, as the resulting error is
a little confusing at first.
8:22
So depending on how many times you
have joined logical expressions,
8:25
your muscle memory might actually
accidentally type the and keyword here.
8:28
So let's do that,
let's put the and keyword here.
8:33
Yak, ValueError and
8:34
it's saying the truth value of an array
with more than one element is ambiguous.
8:39
So, what it's trying to do is it's trying
to figure out a truthiness of this, and
8:44
that's what and does.
8:49
It creates a truthiness, and it's
assuming that we wanna have a scalar
8:51
value, which is not what we want,
we wanna compare element by element.
8:55
So if you did wanna get a scalar value,
9:00
if you wanted to see that everything
was true, you would use a.all() and
9:02
that returns a single Boolean, or
a.any() if there's any True in there at all.
9:05
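
A sketch of what goes wrong with the and keyword, and what `all` and `any` return instead. The arrays `a` and `b` are the same made-up ones from the manual example.

```python
import numpy as np

a = np.array([False, True, True])
b = np.array([True, False, True])

# The and keyword asks for one truth value for the whole array,
# which NumPy refuses to guess for more than one element.
try:
    a and b
    message = None
except ValueError as error:
    message = str(error)
print(message)

# If a single scalar answer is what you want, ask for it explicitly.
print(a.all())  # False, not every element is True
print(a.any())  # True, at least one element is True

# For element-by-element logic, use the bitwise operator.
print((a & b).tolist())  # [False, False, True]
```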
All that to say,
just use the bitwise operation.
9:08
So just go ahead,
use the bitwise operation.
9:11
I just thought I'd preemptively warn
you about this, as it happens a lot,
9:14
more in the teacher's notes.
9:18
[LAUGH] So let's build up our index,
so we wanna have study_minutes,
9:20
Where the study_minutes,
9:26
Are < 60 & study_minutes > 0,
9:31
right, that's what we're looking for.
9:35
But we wanna take caution to make sure
that we're careful about the order of
9:41
operations.
9:44
This & here binds more tightly
than the less than.
9:46
So what gets evaluated first is 60 &
study_minutes.
9:49
And again, we're gonna run into
the truthy problem that we saw before.
9:52
So, we don't want that.
9:57
So let's put parentheses in place
just to make sure we've got the order
9:58
of operations correct.
10:02
And voila, there we have it.
10:08
A brand new array containing entries
that represent values from our
10:11
study_minutes array,
that are less than 60 and greater than 0.
10:16
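
A sketch of the order-of-operations gotcha and the fix, using a made-up 3 x 4 stand-in for `study_minutes`.

```python
import numpy as np

# Made-up 3 x 4 stand-in for the 3 x 100 study_minutes array.
study_minutes = np.array([
    [150,  0, 45,  60],
    [ 90, 58,  0, 120],
    [  0, 30, 75,  10],
], dtype=np.uint16)

# Without parentheses, & is evaluated before < and >, so Python sees
# study_minutes < (60 & study_minutes) > 0. That chained comparison
# needs a single truth value for a whole array, which raises.
try:
    study_minutes[study_minutes < 60 & study_minutes > 0]
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# With parentheses, each comparison builds its own Boolean array first.
almost = study_minutes[(study_minutes < 60) & (study_minutes > 0)]
print(almost.tolist())  # [45, 58, 30, 10]
```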
That's pretty cool, right?
10:21
And you can see, you can pretty
much read that more or less, right?
10:22
You'll get used to remembering
to use the parens and
10:25
ampersand, but
I guarantee you'll forget sometimes.
10:28
Now, one thing we really
should consider is this.
10:32
Even though we did those minutes, these
are minutes here that we spent some time.
10:36
They don't actually count for
completing the challenge.
10:40
The challenge is to do
at least an hour a day.
10:42
So in reality,
we really should set all of these to zero.
10:45
If deleting these minutes doesn't
motivate me, I don't know what will,
10:51
especially this 58 minutes.
10:55
Now this index statement,
this study_minutes
10:59
where study_minutes < 60,
11:06
even though that
creates a brand new array,
11:09
if you assign to it, you update the original array.
11:14
And if we look now,
we look at our third row there,
11:19
we'll see that we added some zeros
in where they were not before.
11:24
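
A sketch of that update, again on a made-up 3 x 4 stand-in for `study_minutes`. It also shows that changing the array you get back from indexing leaves the original alone.

```python
import numpy as np

# Made-up 3 x 4 stand-in for the 3 x 100 study_minutes array.
study_minutes = np.array([
    [150,  0, 45,  60],
    [ 90, 58,  0, 120],
    [  0, 30, 75,  10],
], dtype=np.uint16)

# Reading with a Boolean index gives back a brand new array.
# Changing it does not touch study_minutes.
pulled_out = study_minutes[study_minutes < 60]
pulled_out[:] = 999
print(study_minutes[0, 2])  # 45

# Assigning through the Boolean index updates the original in place.
study_minutes[study_minutes < 60] = 0
print(study_minutes.tolist())
# [[150, 0, 0, 60], [90, 0, 0, 120], [0, 0, 75, 0]]
```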
You guys look at those,
11:30
all that time didn't count
because I didn't reach that hour.
11:31
No, now of course that
time did actually count.
11:35
I was learning, but
it didn't count towards the challenge.
11:39
And I'll tell you what, this 100 days
of code challenge totally motivates me.
11:43
So losing that time definitely will
keep me focused in the future.
11:47
It reminds me that I just
need to stick with it,
11:50
I want to complete this challenge.
11:53
Speaking of challenges, I'd like to again
challenge you to capture your thoughts
11:56
on Boolean array indexing
in your notebook.
12:00
Remember to think through the possible
gotchas that we walked through.
12:03
Like accidentally using the and
keyword or forgetting to use parentheses.
12:06
If you've ever done SQL programming
before, that might have felt familiar.
12:10
Capture those thoughts a bit.
12:14
Also, now is a good time to take
a moment and review your notebook.
12:16
Is everything in there clear?
12:20
If not, please hit up the community and
ask your questions.
12:21
If you are looking to
solidify your knowledge,
12:25
I highly recommend attempting
to answer someone else's questions.
12:27
I can't recommend it enough. By taking
the time to explain a concept,
12:31
you will uncover new knowledge.
12:35
Give it a shot, it won't disappoint.
12:37
So far what we've been doing
is returning a new array.
12:39
But you can actually return a view
of the data that you can manipulate.
12:43
Let's take a look at data views and
some more powerful slicing features next.
12:46