Jeremy Hall
2,048 Points
How big of a list in Real World Applications would be considered too much?
I understand Python is not inherently a database application. However, if it's possible, how big could a list get before it starts to affect the application?
1 Answer
Jeff Muday
Treehouse Moderator 28,724 Points
That's a great question!
A list is a native in-memory structure, so it can grow until your program runs out of memory. On my desktop computer, with an IDE and 6 Chrome browser tabs open, I could store 47 million integers; on the Treehouse workspace I could store about 12 million integers in memory before my process was terminated (see the memtest.py code below).
Fortunately, when Python cannot allocate new memory it raises a MemoryError exception rather than crashing the computer! If nothing handles the exception, the program exits and its memory goes back to the OS. In some scenarios, like the Treehouse workspace, a supervisor process will kill the Python process when it exceeds its permitted memory limit.
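To see the exception path without slowly filling memory, here is a minimal sketch that asks for an absurdly large list in one step. It assumes 64-bit CPython, and the size `sys.maxsize // 16` is just an arbitrary value picked for the demo. On a real system the OS or a supervisor may still kill a process that grows gradually before Python ever gets the chance to raise.

```python
import sys


def try_huge_list():
    """Attempt one impossibly large allocation and report what happened."""
    try:
        # Assumption: far more slots than any current machine can back,
        # so the allocation is refused right away.
        huge = [None] * (sys.maxsize // 16)
    except MemoryError:
        return "MemoryError"
    return "allocated {} slots".format(len(huge))


print(try_huge_list())
```

Because the failure is an ordinary exception, a program can catch it and fall back to something smaller instead of dying.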
Some data structures, such as a numpy array, are better suited to numbers: they are faster and consume less memory than the Python list. The Python list is built for generic objects, so it has more overhead per element.
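A rough way to see the per-element overhead is to measure it. The sketch below uses the standard library `array` module as a stand-in for a numpy array (an assumption made so the example runs without installing anything); both store raw numbers in one packed buffer. The figures are approximate and vary by Python version and platform.

```python
import sys
from array import array

n = 100000
as_list = list(range(n))
as_array = array("q", range(n))  # "q" = signed 64-bit integers

# getsizeof(list) only counts the list's table of pointers,
# so the int objects it points to are added separately.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)
array_bytes = sys.getsizeof(as_array)

print("list :", list_bytes // n, "bytes per element (approx.)")
print("array:", array_bytes // n, "bytes per element (approx.)")
```

The list pays for a pointer plus a full Python int object for every value, while the packed array pays only for the number itself.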
It is not too difficult to design a data structure that could store even more by using disk as extra memory. Basically, you pick how many values you want in memory at one time, and when you go above that limit, you write the list out to disk. The class stores two values internally, the beginning and end indices of the block currently in memory. If the program asks for an item in that range, you already have it; if not, you read the block that contains it from disk. Deleting and adding elements in the middle is where you have to start thinking about using hash tables...
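The idea above could be sketched roughly like this. It is only an illustration under simplifying assumptions: the class name `DiskBackedList`, the pickle block files, and the temporary directory are all made up for the example, it supports only appending and reading by index, and it tracks the start index plus the block length rather than an explicit end index.

```python
import os
import pickle
import tempfile


class DiskBackedList:
    """Append-only list that keeps at most one block of items in memory."""

    def __init__(self, block_size=1000, directory=None):
        self.block_size = block_size
        self.directory = directory or tempfile.mkdtemp()
        self.length = 0     # total number of items stored
        self.start = 0      # index of the first item held in memory
        self.block = []     # the items currently in memory
        self.dirty = False  # True if self.block has unsaved changes

    def _path(self, block_number):
        return os.path.join(self.directory, "block_{}.pkl".format(block_number))

    def _save(self):
        """Write the in-memory block to disk if it has changed."""
        if self.dirty:
            with open(self._path(self.start // self.block_size), "wb") as f:
                pickle.dump(self.block, f)
            self.dirty = False

    def _load(self, block_number):
        """Swap the in-memory block for the one numbered block_number."""
        if block_number == self.start // self.block_size:
            return  # already in memory
        self._save()
        self.start = block_number * self.block_size
        path = self._path(block_number)
        if os.path.exists(path):
            with open(path, "rb") as f:
                self.block = pickle.load(f)
        else:
            self.block = []  # a brand-new block past the end of the data

    def append(self, value):
        self._load(self.length // self.block_size)
        self.block.append(value)
        self.dirty = True
        self.length += 1

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError("index out of range")
        self._load(index // self.block_size)
        return self.block[index - self.start]


numbers = DiskBackedList(block_size=100)
for i in range(1000):
    numbers.append(i * 2)
print(len(numbers), numbers[0], numbers[555], numbers[999])
```

Only one block of 100 values is ever in memory here; every access outside the current block costs a disk read, which is the trade-off for the larger capacity.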
But rather than re-invent the wheel, you would want to use a good database program!
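As one example of letting a database do the work, here is a minimal sketch using the `sqlite3` module that ships with Python. The table name, column names, and file name are hypothetical, chosen only for the demo; any other database would serve the same purpose.

```python
import os
import sqlite3
import tempfile

# Hypothetical file name inside a throwaway temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "numbers.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE numbers (position INTEGER PRIMARY KEY, value INTEGER)")

with conn:  # commits the inserts if no exception is raised
    conn.executemany(
        "INSERT INTO numbers (position, value) VALUES (?, ?)",
        ((i, i * 2) for i in range(100000)),
    )

# Look up one item by position without loading the rest into memory.
count = conn.execute("SELECT COUNT(*) FROM numbers").fetchone()[0]
value = conn.execute(
    "SELECT value FROM numbers WHERE position = ?", (555,)
).fetchone()[0]
print(count, value)
conn.close()
```

The data lives in the file on disk, and the database handles lookups, inserts, and deletes without the program holding everything in a list.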
memtest.py
# a very simple memory test
# this program will test the approximate number of integers that
# can be stored in a list using a Treehouse workspace
my_array = []
i = 0
while True:
    i = i + 1
    if i % 10000 == 0:
        # print once for every 10000 iterations
        print(i)
    my_array.append(i)
treehouse:~/workspace $ python memtest.py
10000
20000
30000
40000
50000
... stuff omitted ...
12260000
12270000
12280000
12290000
12300000
Killed
treehouse:~/workspace $