You have completed Combining Data for Analysis!
Welcome! In this video, we'll introduce you to the concat() function and the arguments used to successfully concatenate data between two datasets.
Data files for download
Load 2019 data into pandas
billboard19 = pd.read_csv("billboard_100_2019.csv", index_col="ID")
spotify19 = pd.read_csv("spotify_200_2019.csv", index_col="ID")
Create DataFrames for 2019 Ariana Grande Billboard and Spotify song data
ariana_bill19 = billboard19[billboard19["Artists"].str.contains("Ariana Grande")]
ariana_spot19 = spotify19[spotify19["Artists"].str.contains("Ariana Grande")]
Concatenate 2017-2018 Ariana Grande song data with 2019 Ariana Grande song data in Billboard
ariana_bill_all = pd.concat([ariana_bill, ariana_bill19])
Concatenate 2017-2018 Ariana Grande song data with 2019 Ariana Grande song data in Spotify
ariana_spot_all = pd.concat([ariana_spot, ariana_spot19])
Additional Resources
- Pandas API: concat() function
Welcome, in this video,
we will combine two data frames into one,
0:00
using concatenation.
0:04
That's a big word.
0:06
But, if you're in this workshop
you're probably familiar with string
0:08
concatenation, which is the combination
of two or more strings.
0:11
In Python,
this is performed with the plus operator.
0:15
For example, print
0:19
("book" + "keeper")
0:23
output is bookkeeper.
0:28
We simply combine two small
words into a new, larger word.
0:34
It's like we added the words together.
0:38
You can even consider this a full
outer join of the two words.
0:41
Notice there is no merge
among the letters.
0:45
The end of "book" and the beginning
of "keeper" are the same letter, and
0:48
both ks are retained in the final output.
0:53
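The string example above can be run as-is. This is a minimal sketch; the variable names are only for illustration and do not come from the video.

```python
# String concatenation with the plus operator.
first = "book"
second = "keeper"
combined = first + second

print(combined)  # bookkeeper

# Nothing is merged: both k's survive, so the length is the sum of the parts.
print(len(combined) == len(first) + len(second))  # True
```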
Where am I going with this?
0:56
Well, I have found more data for
us to use.
0:57
Six additional months of Billboard
100 charts and Spotify 200 charts.
1:00
I'd like to add this to my existing lists.
1:05
billboard_100_2019.csv contains
the Billboard 100 chart data from 2019.
1:12
spotify_200_2019.csv contains
the Spotify 200 chart data from 2019.
1:20
This new data has the same
columns as the original data,
1:27
new dates, new songs, new artists.
1:32
So we just need to stitch these rows
to the end of the existing data frame.
1:36
Let's first load this new data
into their own data frames.
1:43
Make sure you download these files
from the teacher's notes and
1:46
save them to your working folder.
1:49
I'll add 19 to the end to differentiate
them from the original datasets.
1:51
billboard19 =
2:00
pd.read_csv("billboard_100_2019.csv",
2:03
index_col="ID").
2:16
spotify19 =
2:26
pd.read_csv("spotify_200_2019.csv",
2:29
index_col="ID").
2:40
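To see what index_col="ID" does without the course files, here is a small sketch that reads CSV text from memory instead of from disk. The rows and the "Name" column are hypothetical; only "ID" and "Artists" are column names taken from the video.

```python
import io
import pandas as pd

# Hypothetical stand-in for billboard_100_2019.csv (made-up rows).
csv_text = io.StringIO(
    "ID,Name,Artists\n"
    "1,Song A,Ariana Grande\n"
    "2,Song B,Another Artist\n"
)

# index_col="ID" uses the ID column as the row index
# instead of creating a fresh 0..n-1 index.
sample19 = pd.read_csv(csv_text, index_col="ID")

print(sample19.index.name)     # ID
print(list(sample19.columns))  # ['Name', 'Artists']
```

With the real files, the first argument would be the file name saved in your working folder.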
And let's isolate Ariana Grande songs, so
2:45
we can work on a smaller
portion of the data.
2:49
ariana_bill19 = billboard19[
2:55
billboard19["Artists"].str.contains("Ariana
3:05
Grande")].
3:16
ariana_spot19 =
3:25
spotify19[spotify19["Artists"].str.contains("Ariana
3:29
Grande")].
3:44
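The filtering step above is a boolean mask. Here is a minimal sketch of how it works, using a hypothetical three-row DataFrame; only the "Artists" column name comes from the video.

```python
import pandas as pd

# Hypothetical rows for illustration.
songs = pd.DataFrame(
    {
        "Name": ["Song A", "Song B", "Song C"],
        "Artists": ["Ariana Grande", "Another Artist", "Ariana Grande, Guest Artist"],
    }
)

# str.contains returns a boolean Series: True where the substring appears.
mask = songs["Artists"].str.contains("Ariana Grande")
print(mask.tolist())  # [True, False, True]

# Indexing with the mask keeps only the True rows, including collaborations.
ariana_only = songs[mask]
print(ariana_only.shape)  # (2, 2)
```

Note that str.contains treats its pattern as a regular expression by default. For a plain name like this one, that makes no difference.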
Let's take a peek at the top of
the original Billboard dataset.
3:50
ariana_bill.head(). Then the new dataset.
3:59
ariana_bill19.head().
4:13
Great, they have the same headings.
4:20
Let's see the shape of the original.
4:27
ariana_bill.shape, and the new.
4:31
ariana_bill19.shape.
4:38
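Comparing headings by eye works for small tables. As a sketch of a programmatic check, assuming two hypothetical frames that stand in for ariana_bill and ariana_bill19:

```python
import pandas as pd

# Hypothetical stand-ins for the original and the new data.
original = pd.DataFrame({"Name": ["Song A", "Song B"], "Artists": ["Ariana Grande"] * 2})
new = pd.DataFrame({"Name": ["Song C"], "Artists": ["Ariana Grande"]})

# Same column labels in the same order?
same_headings = original.columns.equals(new.columns)
print(same_headings)  # True

# shape is an attribute holding (rows, columns), so it takes no parentheses.
print(original.shape)  # (2, 2)
print(new.shape)       # (1, 2)
```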
What I want to do here is
add the 101 new rows
4:43
to the 102 existing rows
in my original dataset.
in my original dataset.
4:46
In this case, I want to concatenate
the two data frames into a new data frame.
4:51
Let's talk about the concat function.
4:57
I appreciate the abbreviation here.
5:00
It has one required argument,
5:02
a Python list of objects in
the order we wish to connect them.
5:04
By default, it performs an outer join
along the row axis, which is what we want.
5:08
Because the existing data and the new
data have the same column headings,
5:15
we don't need any optional
arguments in this case.
5:18
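To make the defaults concrete, this sketch writes them out explicitly and shows what the outer join does when the columns do not match. The frames and the "Weeks" column are hypothetical and exist only to show the behavior.

```python
import pandas as pd

# Hypothetical frames; "Weeks" exists only in the second one.
first = pd.DataFrame({"Name": ["Song A"], "Artists": ["Ariana Grande"]})
second = pd.DataFrame({"Name": ["Song B"], "Artists": ["Ariana Grande"], "Weeks": [3]})

# The defaults, spelled out: axis=0 stacks rows,
# join="outer" keeps the union of all columns.
stacked = pd.concat([first, second], axis=0, join="outer")

print(list(stacked.columns))             # ['Name', 'Artists', 'Weeks']
print(stacked["Weeks"].isna().tolist())  # [True, False]
```

Rows from the frame that lacks a column are filled with missing values. When both frames share the same headings, as in the video, nothing needs filling.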
So let's start the concatenation.
5:22
ariana_bill_all = pd.concat([ariana_bill,
5:28
ariana_bill19]).
5:38
Let's check the dimensions,
ariana_bill_all.shape.
5:46
So this was my expectation.
5:56
We added the first dataset
to the second dataset.
5:57
The new dataset has 102 plus 101 or
203 rows.
6:01
They both have the same 7 columns, so
the new dataset also has 7 columns.
6:08
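The row arithmetic can be checked in code. This is a sketch with hypothetical frames sized like the ones in the video (102 and 101 rows); they have 2 made-up columns rather than the real 7.

```python
import pandas as pd

# Hypothetical frames with matching columns.
original = pd.DataFrame({"Name": ["old"] * 102, "Artists": ["Ariana Grande"] * 102})
new = pd.DataFrame({"Name": ["new"] * 101, "Artists": ["Ariana Grande"] * 101})

combined = pd.concat([original, new])

# Rows add up; the column count is unchanged.
print(combined.shape)  # (203, 2)
print(combined.shape[0] == original.shape[0] + new.shape[0])  # True
```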
Now we'll do the same for
the Spotify data.
6:15
Let's make sure they
have the same headings.
6:17
The original, ariana_spot.head()
6:21
ariana_spot19.head(). Same headings,
6:35
let's check the shape, ariana_spot.shape.
6:45
And the new, ariana_spot19.shape.
6:54
Let's concatenate the Spotify data.
7:03
ariana_spot_all
7:05
= pd.concat([ariana_spot,
7:10
ariana_spot19]).
7:18
And let's check its dimensions.
7:28
ariana_spot_all.shape. The new
7:31
dataset has 186 plus 196.
7:37
That's 382 records.
7:44
They both have the same 5 columns, so
the new data frame also has 5 columns.
7:46
The concat function has optional arguments,
although we didn't need any for our data,
7:51
but make sure to check out
the teacher's notes for more info.
7:55
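As a taste of those optional arguments, here is a sketch of two of them, ignore_index and keys. The frames are hypothetical, and neither argument was needed for the data in this video.

```python
import pandas as pd

# Hypothetical frames whose default indexes overlap (both start at 0).
first = pd.DataFrame({"Name": ["Song A", "Song B"]})
second = pd.DataFrame({"Name": ["Song C"]})

# Default: the original index labels are kept, so 0 appears twice.
default_index = pd.concat([first, second]).index.tolist()
print(default_index)  # [0, 1, 0]

# ignore_index=True discards the old labels and renumbers from 0.
renumbered = pd.concat([first, second], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2]

# keys= adds an outer index level recording which frame each row came from.
labeled = pd.concat([first, second], keys=["old", "new"])
print(labeled.index.tolist())  # [('old', 0), ('old', 1), ('new', 0)]
```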
I have another challenge for you.
7:59
We concatenated the new Ariana Grande
Billboard data to her existing data.
8:02
We started with 24 months of data,
we now have 30 months of data.
8:07
We did the same for her Spotify data.
8:11
I would like you to use the same method
demonstrated in this video to concatenate
8:13
the full set of existing Billboard data
with the new Billboard dataset.
8:18
Do the same for
the full set of existing Spotify data,
8:23
with the new Spotify dataset.
8:26
Call your new data frames billboard_all,
and spotify_all.
8:29
In the next video,
I'll show you my solution.
8:35
See you there.
8:37