Skip to content

Latest commit

 

History

History
133 lines (85 loc) · 3.27 KB

15.md

File metadata and controls

133 lines (85 loc) · 3.27 KB

pycobytes[15] := Why is range() so fast?

Computers are fast. Developers keep them slow.

Hey pips!

The range() function is one of the most useful built-in Python functions. There are actually 3 ways you can use it!

range(stop)

The above creates a sequence of integers from 0 up to stop, but not including stop:

>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Tip

This is an exclusive range, which is why way back in Python 2 this function was called xrange()!

We can also specify the number to start from, instead of 0:

range(start, stop)

This creates a sequence starting at start, and up to (but not including) stop:

>>> list(range(3, 10))
[3, 4, 5, 6, 7, 8, 9]

One of the reasons it’s quite useful for the range to be exclusive is because it’s immediately clear how many items are in the sequence: just do stop - start. No need for off-by-1 errors.

And in its final form, we can jump over numbers by a particular step:

range(start, stop, step)

This will increment by step between each number:

>>> list(range(10, 100, 15)
[10, 25, 40, 55, 70, 85]

With range(), we can build whatever arithmetic progressions (sequences with a constant difference) we like.

Ok, so what happens if we throw really massive numbers at it?

>>> timeit.timeit("range(9999)")
0.21089370001573116

>>> timeit.timeit("range(99999999)")
0.19048499999917112

Crazily, it seems to still handle it perfectly fine – even though the range has 99999999 elements! How is it doing that?

The secret is, range() does not return a list like you might expect:

>>> range(2000)
range(0, 2000)

>>> type(range(2000))
<class 'range'>

It’s actually a special range object which has start, stop and step attributes. Instead of storing every individual number, it computes them on the fly using those 3 attributes.

For instance, when we try to access the 61st element:

>>> r = range(0, 2000, 2)

>>> r[60]  # remember that sequences are 0-indexed
120

Python can easily determine it should be 120 – we’ve started at 0, and taken 60 steps of 2. So 0 + 60*2 == 120.

More generally, the element at index is given by

range[index] == start + index*step

By computing elements dynamically instead of storing them, a LOT of storage space is saved. This concept of using an ‘accessor’ object, which provides dynamic access to values instead of physically storing them, is a concept we’ll encounter all throughout Python.

Because of this, you don’t need to worry about the numbers you put into a range. It's also why we can use it for long iterations with no performance impact:

for i in range(10 ** 10):
    print("sup")

Watch out for more objects like this in future!


Further Reading

  • Checking if a range contains a particular value is super-fast too! – RealPython