In this article, we will delve into how Python’s star operator actually works. In doing so, you will understand some of the core inner workings of the language and, in the process, become a better programmer and Pythonista.
The star or asterisk operator (*) can be used for more than just multiplication in Python. Using it appropriately can make your code cleaner and more idiomatic.
For the sake of completeness, I’ll get multiplication out of the way. The simplest example is multiplying two numbers:
>>> 5 * 5
25
Beyond arithmetic, we can use the star operator to repeat characters in a string:
>>> 'a' * 3
'aaa'
>>> 'abc' * 2
'abcabc'
Or, for repeating elements in lists or tuples:
>>> [1] * 4
[1, 1, 1, 1]
>>> [1, 2] * 2
[1, 2, 1, 2]
>>> (1,) * 3
(1, 1, 1)
>>> [(1, 2)] * 3
[(1, 2), (1, 2), (1, 2)]
However, we should be careful with (or even avoid) repeating mutable elements (like lists). To illustrate:
>>> x = [[3, 4]] * 2
>>> print(x)
[[3, 4], [3, 4]]
So far, so good. But let’s try popping an element from the second list.
>>> x[1].pop()
4
>>> print(x)
[[3], [3]]
What?
When we repeat elements with the star operator, the repeated elements all refer to the same underlying object. This is fine when the element is immutable because, by definition, we cannot change it. But as we saw above, it can lead to problems for mutable elements. A better way to repeat mutable elements is a list comprehension, which creates a new object on every iteration:
>>> x = [[3, 4] for _ in range(2)]
>>> x[1].pop()
4
>>> print(x)
[[3, 4], [3]]
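One way to see the aliasing directly is to compare object identities with is. This is a minimal sketch; the names shared and independent are mine and are only there for illustration:

```python
# Repetition copies references, not objects.
shared = [[3, 4]] * 2
# Both slots point at the very same inner list.
same_object = shared[0] is shared[1]

# A comprehension evaluates [3, 4] once per iteration, so each slot gets a new list.
independent = [[3, 4] for _ in range(2)]
distinct_objects = independent[0] is not independent[1]

shared[1].pop()
independent[1].pop()

print(same_object, shared)            # True [[3], [3]]
print(distinct_objects, independent)  # True [[3, 4], [3]]
```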
Unpacking with the star operator is intuitive if you understand containers and iterables. Let's quickly go over those first:
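A container is an object that holds other objects, such as a list, tuple, set, or dictionary.
An iterable is an object that can be looped over one element at a time. Anything that you can use in a for loop falls into this category. Thus, lists, tuples, dictionaries, strings, and range are all examples of iterables.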
Unpacking, in simple terms, is extracting elements from an iterable into a container. Based on this definition, try to guess the output of the following snippet:
>>> x = [*[3, 5], 7]
>>> print(x)
Here, the inner iterable is a list with 3 and 5, which is inside an outer list (container). Extracting the elements of the inner list into the outer list gives us:
>>> print(x)
[3, 5, 7]
There is nothing special about a list as an iterable. Some other examples:
>>> [1, 2, *range(4, 9), 10]
[1, 2, 4, 5, 6, 7, 8, 10]
>>> (1, *(2, *(3, *(4, 5, 6))))
(1, 2, 3, 4, 5, 6)
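The same idea carries over to the other iterables mentioned earlier. A short sketch with variable names of my own choosing, showing a string, a dictionary, a range, and a set display as the enclosing container:

```python
# Any iterable can be unpacked into a list, tuple, or set display.
letters = [*"abc"]               # a string yields its characters
keys = [*{"x": 1, "y": 2}]       # a dict yields its keys, in insertion order
merged = (*range(3), *"hi")      # several unpackings in one tuple
unique = {*[1, 1, 2], *(2, 3)}   # unpacking into a set drops duplicates

print(letters)         # ['a', 'b', 'c']
print(keys)            # ['x', 'y']
print(merged)          # (0, 1, 2, 'h', 'i')
print(sorted(unique))  # [1, 2, 3]
```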
Note that an enclosing container must exist. For example, the following doesn't work:
>>> *[1, 2]
File "<stdin>", line 1
SyntaxError: can't use starred expression here
“Extended iterable unpacking” sounds complicated but is straightforward in practice. Suppose you wanted to write a function for extracting all but the first element of an iterable and then return the output as a list. Without using extended iterable unpacking (we'll get to that in a minute), you might write something like this:
def all_but_first(seq):
    it = iter(seq)
    next(it)
    return [*it]
Let's test this:
>>> all_but_first(range(1, 5))
[2, 3, 4]
Perfect. Now let's use extended iterable unpacking.
def all_but_first(seq):
    first, *rest = seq
    return rest
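Very clean! And if you test this, you'll see that this function returns the same result as the previous one for any non-empty iterable. (On an empty iterable both fail, but the first raises StopIteration and the second raises ValueError.)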
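The starred name does not have to come last. As a small sketch (split_ends is a hypothetical helper, not something from the standard library), it can sit in the middle or at the front of the targets, and it always collects its share into a list:

```python
def split_ends(seq):
    """Return (first, middle, last); `middle` is always a list."""
    first, *middle, last = seq
    return first, middle, last

print(split_ends(range(1, 6)))  # (1, [2, 3, 4], 5)
print(split_ends("ab"))         # ('a', [], 'b')

# The starred target can also come first.
*init, tail = [10, 20, 30]
print(init, tail)               # [10, 20] 30
```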
There are even more things that * is used for in Python, like accepting a variable number of arguments in functions (e.g., def f(*args):). But I didn’t want to make the article overly long.
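For reference only, here is a brief sketch of that use. The function total is a made-up example; the point is that *args in a definition gathers positional arguments into a tuple, while a star at the call site spreads an iterable back out into arguments:

```python
def total(*args):
    """Sum any number of positional arguments; `args` arrives as a tuple."""
    return sum(args)

values = [1, 2, 3]
print(total(1, 2, 3))  # 6
print(total(*values))  # 6, the list is spread into three arguments
print(total())         # 0, `args` is an empty tuple
```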
How does the same operator (*) perform so many different functions? To understand this, we need to dig deeper into Python. Remember, everything in Python is an object.
If you’re not familiar with the object-oriented programming paradigm, then you can think of objects as entities that have properties (called attributes) and can perform actions (called methods), much like real-world objects.
Objects are created using blueprints or recipes called classes. A class also has attributes and methods. But, just as the map is not the territory, the class is not the object—a class merely describes the attributes and methods of its objects; the objects actually have attributes and can execute methods.
Given that everything is an object, we're ready to understand how the star operator works for multiplication and repeating elements.
In Python, classes have special pre-defined “double underscore” methods, also called dunder or magic methods. The most familiar one is probably the __init__ method used to initialize objects. They are called magic methods because Python calls them behind the scenes, and you almost never call them directly. For example, consider the following class:
class Doggo:
    def __init__(self, name):
        self.name = name
    def __call__(self):
        print(f"I am {self.name}.")
>>> oreo = Doggo("Oreo")
>>> kitkat = Doggo("Kit Kat")
Instantiating a Doggo object calls the __new__ method (for creating the object) and the __init__ method (for initializing the object) behind the scenes. And __call__ is a magic method which allows me to do the following:
>>> oreo()
I am Oreo.
>>> kitkat()
I am Kit Kat.
Which is the same as
>>> oreo.__call__()
I am Oreo.
>>> kitkat.__call__()
I am Kit Kat.
Cool! And as you might have guessed, the star operator also has an underlying magic method: __mul__. The following two are identical:
>>> 25 * 4
100
>>> (25).__mul__(4)
100
Thus, different objects behave differently when the star operator is used on them because the underlying magic method __mul__ is defined differently in each class. For strings and lists:
>>> 'bana'.__mul__(3)
'banabanabana'
>>> [2].__mul__(4)
[2, 2, 2, 2]
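To see that nothing here is reserved for built-in types, you can define __mul__ on a class of your own. A minimal sketch follows; Money is a hypothetical class invented for this example. It also defines __rmul__, which Python falls back to when the left operand does not know how to handle the right one:

```python
class Money:
    """Hypothetical class used only to illustrate __mul__ and __rmul__."""

    def __init__(self, cents):
        self.cents = cents

    def __mul__(self, factor):
        # Called for `money * factor`.
        if not isinstance(factor, int):
            return NotImplemented  # lets Python try the other operand
        return Money(self.cents * factor)

    # Called for `factor * money`, after int.__mul__ returns NotImplemented.
    __rmul__ = __mul__

    def __repr__(self):
        return f"Money({self.cents})"


price = Money(250)
print(price * 3)  # Money(750)
print(3 * price)  # Money(750)
```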
While __mul__ explains the magic behind multiplication and repeating elements, it does not explain unpacking or extended iterable unpacking.
This should not be surprising because multiplication and repetition use * as a binary operator, between two operands, while unpacking and extended iterable unpacking use it as a prefix on a single expression, much like a unary operator. The underlying mechanics are likely different.
Let's use Python's dis module to break things down. It stands for "disassembler" and is used to get Python bytecode from code. The Python Glossary defines Python bytecode as "the internal representation of a Python program in the CPython interpreter." A good analogy is what assembly code is to C. You'll see what I mean.
>>> import dis
>>> dis.dis('[1, *(2, 3)]')
  1           0 LOAD_CONST               0 (1)
              2 BUILD_LIST               1
              4 LOAD_CONST               1 ((2, 3))
              6 LIST_EXTEND              1
              8 RETURN_VALUE
This shows that the list [1] is built first and is then extended with (2, 3). This is similar to:
>>> l = [1]
>>> l.extend((2, 3))
>>> print(l)
[1, 2, 3]
This explains why we can do unpacking only inside containers—outside containers, there wouldn't be anything to extend.
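If you would rather inspect the instructions from code than read the printed listing, dis.get_instructions yields them as objects. A rough sketch follows, with opnames as a helper name of my own. Opcode names are a CPython implementation detail and vary between versions, so the example prints them instead of depending on a particular name:

```python
import dis

def opnames(source):
    """Return the bytecode instruction names CPython compiles `source` to."""
    return [ins.opname for ins in dis.get_instructions(source)]

# The exact names depend on the CPython version in use.
print(opnames("[1, *(2, 3)]"))

# Whatever the bytecode looks like, the result matches the extend() version.
built = [1]
built.extend((2, 3))
print(eval("[1, *(2, 3)]") == built)  # True
```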
As for extended iterable unpacking, there's a special bytecode instruction called UNPACK_EX to do just that. To illustrate:
>>> dis.dis('a, *b = [1, 2, 3]')
  1           0 BUILD_LIST               0
              2 LOAD_CONST               0 ((1, 2, 3))
              4 LIST_EXTEND              1
              6 UNPACK_EX                1
              8 STORE_NAME               0 (a)
             10 STORE_NAME               1 (b)
             12 LOAD_CONST               1 (None)
             14 RETURN_VALUE
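The observable behavior can be checked without reading bytecode. In this sketch (numbers is a throwaway generator I made up), the starred target receives a list whatever the source iterable is, and unpacking fails with ValueError when there are not enough items for the non-starred targets:

```python
def numbers():
    """A generator, to show the source need not be a list or tuple."""
    yield from (1, 2, 3)

a, *b = numbers()
print(a, b)  # 1 [2, 3]

c, *d = "x"
print(c, d)  # x []

try:
    e, *f = []  # nothing available for the mandatory target `e`
    error_name = None
except ValueError as exc:
    error_name = type(exc).__name__

print(error_name)  # ValueError
```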
The star operator offers a doorway into the inner workings of Python. In trying to understand how it works, we learned that everything is an object in Python. We learned how these objects have special “magic” methods like __call__ and __mul__ that add behavior such as calling the object (as if it were a function) or using * for multiplication or repetition. Finally, we also touched on the dis module and Python bytecode.
If there’s one thing you take away from this article, let it be this: diving into the mechanics of a programming language construct you already use is a great way to get better at that language.