Richard Santana
5,305 Points
What is scraping?
He keeps referencing things we haven't learned yet and gives no pointers on where we can learn them.
3 Answers
Simon Coates
28,694 Points
Scraping refers to pulling information off a web page that isn't meant to be an API. For example, if I write software to get emails from a web page, I might describe this as 'scraping' the website for emails. See https://en.wikipedia.org/wiki/Web_scraping
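To make the email example concrete, here is a minimal sketch in Python. The page source, the `scrape_emails` name, and the deliberately simple pattern are all assumptions made up for illustration; a real scraper would download the HTML first and would need a more careful pattern.

```python
import re

# Hypothetical page source; in practice this would be HTML downloaded from a site.
PAGE_HTML = """
<html><body>
  <h1>Contact us</h1>
  <p>Sales: <a href="mailto:sales@example.com">sales@example.com</a></p>
  <p>Support: support@example.com</p>
</body></html>
"""

# Deliberately simple pattern (an assumption); real addresses can be more varied.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrape_emails(html):
    """Return the unique email addresses found in the raw HTML text, sorted."""
    return sorted(set(EMAIL_PATTERN.findall(html)))


print(scrape_emails(PAGE_HTML))
# → ['sales@example.com', 'support@example.com']
```

Nothing on that page was published for a program to read. The code just digs through markup that was written for people, which is what makes it scraping.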
Collin Berg
33,471 Points
Richard Santana, an example of an API is the JSON feed for the profile that we've been working with. The JSON feed is organized and structured in a way that makes it easy to walk through the data and pull out the information we want.
On the other hand, an HTML page has tons of extra information: CSS classes, HTML elements, and JavaScript that get in the way and make it harder to pull the information out. Since the information isn't organized like the JSON, we can't easily access it, so we have to "scrape" it, which is slower and sort of like crawling over the page, bit by bit, until we find what we're looking for.
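Here is a rough side-by-side sketch of that difference in Python, using only the standard library. The JSON text, the HTML snippet, the field names, and the class names are all hypothetical stand-ins, not the real profile feed or page.

```python
import json
from html.parser import HTMLParser

# Hypothetical JSON feed: the data is already structured for code.
PROFILE_JSON = '{"name": "Example Student", "points": {"total": 5305}}'

# Hypothetical HTML page: the same data, buried in markup meant for people.
PROFILE_HTML = """
<div class="profile">
  <h1 class="name">Example Student</h1>
  <span class="label">Total points</span>
  <span class="points">5,305</span>
</div>
"""


def points_from_json(text):
    """With an API, the value is a couple of key lookups away."""
    return json.loads(text)["points"]["total"]


class PointsScraper(HTMLParser):
    """Walk the page tag by tag until the element holding the points shows up."""

    def __init__(self):
        super().__init__()
        self.in_points = False
        self.points = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "span" and ("class", "points") in attrs:
            self.in_points = True

    def handle_data(self, data):
        if self.in_points:
            # The page formats the number for humans, so strip the comma.
            self.points = int(data.strip().replace(",", ""))

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_points = False


def points_from_html(text):
    scraper = PointsScraper()
    scraper.feed(text)
    return scraper.points


print(points_from_json(PROFILE_JSON))  # → 5305
print(points_from_html(PROFILE_HTML))  # → 5305
```

Both functions return the same number, but the scraper depends on the page's markup staying the same. If the site renames the `points` class, the scraper breaks, while a JSON feed is meant to stay stable for code.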
Richard Santana
5,305 Points
What do you mean by "isn't meant to be an API"? What is an API?
Simon Coates
28,694 Points
Sorry, that wasn't helpful. An API is an application programming interface. It's something that is meant to serve up information for consumption by code (when you DELIBERATELY expose functionality). A web scraper will typically extract information that wasn't intended to be accessed that way. Most websites are meant for human consumption.