Noah Fields
13,985 Points
Why divide by 60 and then 24 instead of 1440?
Dividing a number by 60 and then by 24 is mathematically equivalent to dividing that same number by a single value: 60 times 24, or 1440. Not only is the product easily found with a calculator or even a bit of pencil-and-paper work, it also reduces the number of calculations the system has to do for each section by one. This is, to my mind, objectively superior to dividing twice as shown in the video, unless perhaps 60, 24, or both were variables (and they aren't). So, for what reasons would you, or I in the future, choose to complete the calculation the way it is shown in the video as opposed to my method?
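A quick sanity check of the equivalence claimed above, as a minimal sketch. Exact rational arithmetic with `fractions.Fraction` confirms that dividing by 60 and then by 24 matches dividing by 1440. The sample range and the names `MINUTES_PER_DAY`, `days_two_steps`, and `days_one_step` are illustrative assumptions, not from the video. With floats the two forms can differ by a rounding error in the last digit, so the float comparison uses `math.isclose` rather than `==`.

```python
from fractions import Fraction
import math

MINUTES_PER_DAY = 60 * 24  # 1440; illustrative name, not from the course


def days_two_steps(minutes):
    """Divide by 60 (minutes -> hours), then by 24 (hours -> days)."""
    return minutes / 60 / 24


def days_one_step(minutes):
    """Divide once by the combined factor."""
    return minutes / MINUTES_PER_DAY


# Exact check with rationals: no rounding is involved.
exact_match = all(
    days_two_steps(Fraction(m)) == days_one_step(Fraction(m))
    for m in range(10_000)
)

# Float check: tolerate a possible last-bit rounding difference.
float_match = all(
    math.isclose(days_two_steps(float(m)), days_one_step(float(m)))
    for m in range(10_000)
)

print(exact_match, float_match)
```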
9 Answers
Steven Parker
232,176 Points
If you're converting minutes into days, dividing separately by 60 and then 24 might make your intentions clearer to someone reading the formula.
Also, it's quite likely that internal optimizations will combine the values and actually do the math as you're suggesting for efficiency no matter how it is written.
micram1001
33,833 Points
Writing out the formulas can help. To convert minutes to days: 1 hour = 60 minutes and 1 day = 24 hours.
To get days, divide x by 60 to get hours, then divide that result by 24: (x / 60) / 24.
givens
7,484 Points
It’s didactic. He’s instructing us in a way that is easy to understand.
Steven Parker
232,176 Points
Being easy to understand is also a good coding practice.
Mohammed Ismail
7,190 Points
Can someone confirm why we are converting minutes into days here?
Dave McFarland
Treehouse Teacher
Mohammed Ismail -- you have to convert to days because that's what Google Sheets uses when doing time comparisons.
givens
7,484 Points
That’s true. However, if it were in a (really) frequently used loop, I would use 1440 to minimize operations. You could make it clear in comments what you are doing.
Steven Parker
232,176 Points
As I said in my original answer, the language itself will most likely do that for you (optimizing the value, not writing a comment!). You could run a benchmark test to be certain.
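One way such a benchmark could be run is with the standard-library `timeit` module, which repeats a statement many times and lets you take the best of several runs to reduce noise from other system activity. This is only a sketch: the helper name `best_time` and the loop counts are assumptions chosen to keep it quick, and results will vary by machine and Python version.

```python
import timeit


def best_time(stmt, setup="pass", repeat=5, number=200_000):
    """Return the fastest total time (seconds) over `repeat` runs of `number` loops."""
    return min(timeit.repeat(stmt, setup=setup, repeat=repeat, number=number))


# Literal operands: the interpreter may precompute these before run time.
literal_two = best_time("10000 / 24 / 60")
literal_one = best_time("10000 / 1440")

# Variable operand: the division(s) must happen at run time.
variable_two = best_time("x / 24 / 60", setup="x = 10000")
variable_one = best_time("x / 1440", setup="x = 10000")

print(f"literals : two divides {literal_two:.6f}s, one divide {literal_one:.6f}s")
print(f"variables: two divides {variable_two:.6f}s, one divide {variable_one:.6f}s")
```

Taking the minimum of the repeats, rather than the mean, is a common choice because slower runs usually reflect interference from other processes rather than the code itself.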
Hambone F
3,581 Points
This thread is ancient, but there's an easy middle ground for this.
Define a constant equal to the divisor before the actual operation, for example:
MINUTES_PER_DAY = 60 * 24
Then inside your loop:
value_in_days = value_in_minutes / MINUTES_PER_DAY
I suppose it depends a bit on the language, but in general that should be just about as fast as the single literal division, and still retain the clarity of the code itself.
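Put together, the suggestion above might look like the following minimal sketch. The names `MINUTES_PER_HOUR`, `HOURS_PER_DAY`, and `minutes_to_days` are illustrative additions; only `MINUTES_PER_DAY`, `value_in_minutes`, and `value_in_days` come from the post.

```python
# Named factors document where 1440 comes from; the product is computed once.
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24
MINUTES_PER_DAY = MINUTES_PER_HOUR * HOURS_PER_DAY


def minutes_to_days(value_in_minutes):
    """Convert a duration in minutes to days with a single division."""
    return value_in_minutes / MINUTES_PER_DAY


# Usage inside a loop: one division per item, with the intent still readable.
durations_in_minutes = [720, 1440, 4320]
durations_in_days = [minutes_to_days(m) for m in durations_in_minutes]
print(durations_in_days)
```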
givens
7,484 Points
I ran a benchmark test. I compared one division and two divisions with 100,000,000 loops. The one-division case ran in 4.34279s. The two-division case ran in 4.43506s. This is a 2% improvement. Google would want to save the 2%, as it saves them from buying thousands more computers for their search engine.
Steven Parker
232,176 Points
But were you dividing two literal values in the benchmark?
givens
7,484 Points
This is my quick code. I used int literals. Let me know if I should do something differently. There's something called big O notation. The order for the first case is O(p+2n). The order for the faster case is O(p+n), where n represents a division and p represents other factors like the increment and checking in the loops (which should be the same). The faster case has fewer operations. It could be that the compiler optimizes a number of things, but I am seeing a difference here. I haven't tested statistical significance.
import time

# Case 1: two divisions per iteration
start1 = time.time()
N = 100000000
rng = range(N)
for k in rng:
    10000/24/60
stop1 = time.time()
duration1 = stop1 - start1
time_per_iter1 = duration1/N
print(duration1)

# Case 2: one division per iteration
start2 = time.time()
for k in rng:
    10000/1440
stop2 = time.time()
duration2 = stop2 - start2
time_per_iter2 = duration2/N
print(duration2)
Steven Parker
232,176 Points
First, move the assignment of "start1" after the assignment of "rng" so that the two code segments have the same number of steps.
Then, run the test several times to account for variances in system performance. I tried it myself and found more than 10% variation between successive runs. And on some runs, the first time was shorter than the second time.
I also ran the test using variables instead of constants. The times were both significantly longer, and the variance between them was consistently over 40%.
My conclusion is that literal math is done prior to run time (as expected), but variable math is done as the statement is executed.
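One way to look into this conclusion is to inspect the compiled bytecode with the standard-library `dis` module. The sketch below counts the binary-operation instructions left in an expression after compilation; the helper name `count_binary_ops` is an assumption. On CPython, literal-only expressions are typically folded into a single constant at compile time, but that is an implementation detail that may differ between versions and interpreters. Note that `x / 24 / 60` is parsed as `(x / 24) / 60`, so the literals 24 and 60 are never adjacent operands that could be combined.

```python
import dis


def count_binary_ops(expression):
    """Count binary-operation instructions in the compiled expression."""
    code = compile(expression, "<expr>", "eval")
    return sum(
        1
        for instruction in dis.get_instructions(code)
        if instruction.opname.startswith("BINARY_")
    )


for expr in ("10000 / 24 / 60", "10000 / 1440", "x / 24 / 60", "x / 1440"):
    print(f"{expr!r}: {count_binary_ops(expr)} binary op(s) left for run time")
```

If the literal expressions report zero remaining operations while the variable versions report two and one, that would be consistent with the timings described above.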
givens
7,484 Points
I modified the code and found that one division is faster than two divisions with 100% certainty on my computer with nothing else running. Does anything need to be modified? If not, please try on your computer so that we can generalize slightly.
import time
from scipy import stats
import numpy as np
"""
This function compares the timing for one division and two divisions with int literals.
Int literals may be handled by the interpreter in which case the timing may be the same
or very similar. According to the python 3.7 test results of this computer, the one
division case is faster than the two division case with a p-value of 100%. This
provides some evidence that python handles the two divisions separately.
$ python optim_test.py
P-VALUE:
1.0
ONE DIV:
Mean
7.01418685913086e-08
Standard Dev
8.734919694020465e-10
TWO DIVS:
Mean
9.183040618896484e-08
Standard Dev
8.187212683963087e-10
"""
def compute_mean(time_list):
    "Compute mean with numpy"
    return np.array(time_list).mean()


def compute_std_dev(time_list):
    "Compute sample standard deviation with numpy"
    return np.sqrt(np.array(time_list).var(ddof=1))


def compute_p_value(a, b):
    """
    Towards Data Science
    https://towardsdatascience.com/inferential-statistics-series-t-test-using-numpy-2718f8f9bf2f
    """
    M = len(a)
    a = np.array(a)
    b = np.array(b)
    var_a = a.var(ddof=1)
    var_b = b.var(ddof=1)
    # Pooled standard deviation (assumes equal sample sizes)
    s = np.sqrt((var_a + var_b) / 2)
    t = (a.mean() - b.mean()) / (s * np.sqrt(2 / M))
    df = 2 * M - 2
    # One-sided: probability of a t value above the observed one
    return 1 - stats.t.cdf(t, df=df)
x = 10000
M = 25
N = 1000000
rngm = range(M)
rngn = range(N)
time_per_iter_2divs = []
time_per_iter_1div = []
for _ in rngm:
    start = time.time()
    for _ in rngn:
        x / 24 / 60
    stop = time.time()
    time_per_iter_2divs.append((stop - start) / N)
    start = time.time()
    for _ in rngn:
        x / 1440
    stop = time.time()
    time_per_iter_1div.append((stop - start) / N)
print("P-VALUE:")
print(compute_p_value(time_per_iter_1div, time_per_iter_2divs))
print("\nONE DIV:\n")
print("Mean")
print(compute_mean(time_per_iter_1div))
# print(math.mean(time_per_iter_1div))
print("Standard Dev")
print(compute_std_dev(time_per_iter_1div))
print("\nTWO DIVS:\n")
print("Mean")
print(compute_mean(time_per_iter_2divs))
# print(math.mean(time_per_iter_2divs))
print("Standard Dev")
print(compute_std_dev(time_per_iter_2divs))
Steven Parker
232,176 Points
I had trouble installing scipy. Well, it appeared to install OK, but when I ran the program I got errors (from scipy.special._ufuncs). However, here are the per-iteration times from two consecutive runs of the previous program:
Attempt | Two divides | One divide |
---|---|---|
First run | 6.893455743789673e-08 | 6.89247965812683e-08 |
Second run | 6.30212664604187e-08 | 6.354840755462647e-08 |
I don't think the two divides are actually faster, but I do think the optimization is actually converting it into a single divide. If you get different results, it's certainly possible that your Python version doesn't perform the same optimizations. I first encountered this kind of optimization in a different language, and it was dependent on the compiler there.
I guess the bottom line is if you're not sure your system optimizes literal math, and your program will be doing rapid calculations, your approach of hand-combining the literals (with appropriate comments) is probably a good idea.
But if you know your optimization handles it, or if the program will only use the calculation infrequently, the clarity of showing the complete calculation might make it the best choice.
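For anyone who hits the same SciPy installation problem, the summary statistics can also be computed with only the standard library. This is a sketch rather than a replacement for the posted program: the helper names `summarize` and `welch_t` are assumptions, the sample lists are toy numbers and not measurements, and the code produces a Welch t statistic only, not a p-value.

```python
import math
import statistics


def summarize(samples):
    """Return (mean, sample standard deviation) for a list of timings."""
    return statistics.mean(samples), statistics.stdev(samples)


def welch_t(a, b):
    """Welch's t statistic for the difference between the means of a and b."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


# Toy numbers for illustration only; substitute your own per-iteration timings.
one_div = [1.0, 2.0, 3.0]
two_divs = [4.0, 5.0, 6.0]

print("one divide :", summarize(one_div))
print("two divides:", summarize(two_divs))
print("Welch t    :", welch_t(one_div, two_divs))
```

A negative statistic here means the first sample has the smaller mean; how far from zero it needs to be before the difference is convincing depends on the sample sizes.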
givens
7,484 Points
I reposted the code after changing the compute_p_value function. It may run for you now. It's interesting that it doesn't appear to be generalizing. I didn't find a case where the first run was slower than the second run like you did. It sounds like a fine summary. I wonder if the instructor Kenneth Love would know more.
Steven Parker
232,176 Points
Unfortunately for us, Kenneth has moved on to other opportunities. But perhaps one of the current instructors may comment.
Dave StSomeWhere
19,870 Points
Maybe it is just easier for someone else (or you at a later date) to understand the purpose of the calculation.