Oleksandr Kaleniuk

@okaleniuk

Going beyond the idiomatic Python

People don’t speak entirely in idioms unless they are totally off their rockers. Overusing idioms makes you seem more than self-confident, full of hot air, and frankly not playing with a full deck. It is fair to middling to spice your language with idioms a little bit, but building the whole speech entirely out of them is beside the point.

What I’m trying to say is that stuffing your text with idioms you happened to read somewhere doesn’t automatically make it better. And it doesn’t work for your code either.

There is a book called “Writing Idiomatic Python” written by Jeff Knupp. It is a decent collection of Python idioms, mixed together with some of the lesser-known language features and best practices. While it is rather helpful as a dictionary or a bestiary, it promotes the dangerous fallacy that writing code idiomatically automatically makes it better.

You might think I’m exaggerating, so here is a direct quote:

Each idiom that includes a sample shows both the idiomatic way of implementing the code as well as the “harmful” way. In many cases, the code listed as “harmful” is not harmful in the sense that writing code in that manner will cause problems. Rather, it is simply an example of how one might write the same code non-idiomatically. You may use the “harmful” examples as templates to search for in your own code. When you find code like that, replace it with the idiomatic version.
— Idiom, sir? — Idiom!

Even in that very book, however, there are examples that simply don’t work all that well. Like this one:

def contains_zero(iterable):
    # 0 is "Falsy," so this works
    return not all(iterable)

This is a trivial function, but due to the unwanted idiom it requires translation. If not all elements are iterable then they contain zero? That’s just nonsense! Using a standard language facility makes the whole thing trivial again.

def contains_zero(container):
    return 0 in container

The container contains zero if there is a zero in the container. This is so trivial; it doesn’t even deserve to be a function, not to mention having a dedicated comment.
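
The two versions are not even equivalent, which is one more reason to prefer the direct one. Here is a minimal sketch of the difference; the two function names are made up here only to tell the variants apart:

```python
def contains_zero_idiomatic(iterable):
    # Truthiness test: reports True for ANY falsy element, not only 0.
    return not all(iterable)


def contains_zero_direct(container):
    # Membership test: compares each element with 0 using ==.
    return 0 in container


values = [1, "", 2]  # no zero here, but "" is falsy
print(contains_zero_idiomatic(values))  # True
print(contains_zero_direct(values))     # False
```

Membership is still based on equality, so `0 in [False]` and `0 in [0.0]` are both true. If that matters, the check needs to be stricter than either version.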

There is another example:

def should_raise_shields():
    # "We only raise Shields when one or more giant robots attack,
    # so I can just return that value..."
    return number_of_evil_robots_attacking()

Raise shields when one or more giant robots attack. Ok, that makes sense. But if the logic is clear, why not wire it to the code directly?

def should_raise_shields():
    return number_of_evil_robots_attacking() >= 1

It’s 3 more symbols, but they make the function absolutely transparent for a reader. There is no need for a comment anymore.
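
The explicit comparison also changes what the caller gets back: a real boolean instead of a raw count. A small sketch, with `number_of_evil_robots_attacking` stubbed out as an assumption, since the excerpt doesn’t show it:

```python
def number_of_evil_robots_attacking():
    # Hypothetical stub; the real function is not shown in the excerpt.
    return 3


def should_raise_shields_implicit():
    # Relies on the caller treating a non-zero count as "yes".
    return number_of_evil_robots_attacking()


def should_raise_shields():
    # States the condition from the comment directly.
    return number_of_evil_robots_attacking() >= 1


print(should_raise_shields_implicit())  # 3
print(should_raise_shields())           # True
```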

It’s tempting to think that following some rules will automatically make you a better programmer. But it doesn’t work this way. It’s not about the rules, it’s about when to follow them and when not.

Here, let me show you.

Rule 1. Use long descriptive names when necessary

Before:

def inv(a):
    return adj(a) / det(a)

After:

def inverse_of(matrix):
    return adjugate_of(matrix) / determinant_of(matrix)

Rule 2. Use short mnemonic names when possible

Before:

linsolve([matrix_value_b + resulting_point_x0 - resulting_point_x1*matrix_value_h - resulting_point_x1,
          matrix_value_e + resulting_point_y0 - resulting_point_y1*matrix_value_h - resulting_point_y1,
          matrix_value_a + resulting_point_x0 - resulting_point_x2*matrix_value_g - resulting_point_x2,
          matrix_value_d + resulting_point_y0 - resulting_point_y2*matrix_value_g - resulting_point_y2],
         (matrix_value_a, matrix_value_b, matrix_value_d, matrix_value_e))

After:

linsolve([b + x0 - x1*h - x1,
          e + y0 - y1*h - y1,
          a + x0 - x2*g - x2,
          d + y0 - y2*g - y2],
         (a, b, d, e))

Rule 3. Prefer decomposition to comments for clarity

Before:

def are_all_numbers_in(list_of_everything):
    for element in list_of_everything:
        try:
            # throws ValueError if element is not a number
            float(element)
        except ValueError:
            return False
    return True

After:

def fails_on_cast_to_float(s):
    try:
        float(s)
        return False
    except ValueError:
        return True

def are_all_numbers_in(list_of_everything):
    for element in list_of_everything:
        if fails_on_cast_to_float(element):
            return False
    return True
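
A side benefit of the decomposition is that the helper can be exercised on its own. Here is a brief usage sketch; the sample inputs are made up for illustration, and the two functions are repeated only so the snippet runs by itself:

```python
def fails_on_cast_to_float(s):
    try:
        float(s)
        return False
    except ValueError:
        return True


def are_all_numbers_in(list_of_everything):
    for element in list_of_everything:
        if fails_on_cast_to_float(element):
            return False
    return True


print(fails_on_cast_to_float("3.14"))       # False
print(fails_on_cast_to_float("pi"))         # True
print(are_all_numbers_in(["1", "2.5", 3]))  # True
print(are_all_numbers_in(["1", "two", 3]))  # False
```

Note that `float(None)` raises `TypeError` rather than `ValueError`, so neither version handles a list that may contain `None`.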

Rule 4. Don’t clarify what is clear enough already

Before:

import os

def hash_of_file(name):
    with open(name, 'r') as text:
        return str(hash(text.read()))

def file_is_xml(file_name):
    return file_name.endswith('.xml')

def name_belongs_to_file(file_name):
    return os.path.isfile(file_name)

def name_belongs_to_xml(name):
    return name_belongs_to_file(name) and file_is_xml(name)

def print_name_and_hash(name, xml_hash):
    print(name + ': ' + xml_hash)

def traverse_current_directory(do_for_each):
    for file_name in os.listdir('.'):
        do_for_each(file_name)

def print_or_not_hash_for(file_name):
    if name_belongs_to_xml(file_name):
        xml_hash = hash_of_file(file_name)
        print_name_and_hash(file_name, xml_hash)

traverse_current_directory(print_or_not_hash_for)

After:

import os

for file_name in os.listdir('.'):
    if os.path.isfile(file_name) and file_name.endswith('.xml'):
        with open(file_name, 'r') as xml:
            hash_of_xml = str(hash(xml.read()))
        print(file_name + ': ' + hash_of_xml)
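
One caveat if you run this on Python 3: the built-in `hash()` of a string is randomized per process by default, so the printed values change between runs. If stable values are wanted, a content digest from `hashlib` is one option. The following is only a sketch under that assumption; `xml_digests` is a hypothetical helper, wrapped in a function solely so the directory can be passed in:

```python
import hashlib
import os


def xml_digests(directory="."):
    # Hypothetical helper: maps each .xml file name in `directory`
    # to the SHA-256 hex digest of the file's bytes.
    digests = {}
    for file_name in os.listdir(directory):
        path = os.path.join(directory, file_name)
        if os.path.isfile(path) and file_name.endswith(".xml"):
            with open(path, "rb") as xml:
                digests[file_name] = hashlib.sha256(xml.read()).hexdigest()
    return digests


for file_name, digest in sorted(xml_digests().items()):
    print(file_name + ": " + digest)
```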

Rule 5. Use list comprehension to transform lists

Before:

def matrix_of_floats(matrix_of_anything):
    n = len(matrix_of_anything)
    n_i = len(matrix_of_anything[0])
    new_matrix_of_floats = []
    for i in range(0, n):
        row = []
        for j in range(0, n_i):
            row.append(float(matrix_of_anything[i][j]))
        new_matrix_of_floats.append(row)
    return new_matrix_of_floats

After:

def matrix_of_floats(matrix_of_anything):
    return [[float(a_ij) for a_ij in a_i]
            for a_i in matrix_of_anything]
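
The two versions differ slightly at the edges: the loop sizes every row by the first one and fails on an empty matrix, while the comprehension converts each row as it finds it. A quick usage sketch with made-up data:

```python
def matrix_of_floats(matrix_of_anything):
    return [[float(a_ij) for a_ij in a_i]
            for a_i in matrix_of_anything]


print(matrix_of_floats([["1", 2], [3.5, "4e2"]]))  # [[1.0, 2.0], [3.5, 400.0]]
print(matrix_of_floats([[1, 2, 3], [4]]))          # [[1.0, 2.0, 3.0], [4.0]]
print(matrix_of_floats([]))                        # []
```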

Rule 6. Just because you can do anything with list comprehensions, doesn’t mean you should

Before:

def contains_duplicate(array):
    return sum([sum([1 if a_i-rot_ij else 0
                     for a_i, rot_ij in zip(array, rot_i)])
                for rot_i in [array[i:] + array[:i]
                              for i in range(1, len(array))]]
               ) != len(array) * (len(array) - 1)

After:

def contains_duplicate(array):
    for i in range(len(array)):
        for j in range(i):
            if array[i] == array[j]:
                return True
    return False
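
One way to gain confidence that a rewrite like this preserves behavior is to run both versions on the same inputs. Here is a rough sketch; the `_before` and `_after` suffixes and the sample lists are introduced here for illustration only:

```python
def contains_duplicate_before(array):
    return sum([sum([1 if a_i - rot_ij else 0
                     for a_i, rot_ij in zip(array, rot_i)])
                for rot_i in [array[i:] + array[:i]
                              for i in range(1, len(array))]]
               ) != len(array) * (len(array) - 1)


def contains_duplicate_after(array):
    for i in range(len(array)):
        for j in range(i):
            if array[i] == array[j]:
                return True
    return False


samples = [[], [7], [1, 2, 3], [1, 2, 1], [4, 4], [0, 5, 0, 5]]
for sample in samples:
    # Both versions must agree on every sample.
    assert contains_duplicate_before(sample) == contains_duplicate_after(sample)

print([contains_duplicate_after(s) for s in samples])
# [False, False, False, True, True, True]
```

The comparison only works for lists of numbers, because the first version needs subtraction and slicing. The loop version needs nothing but `==`, so it also handles strings or tuples.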

Conclusion

I hope you can see the pattern. Each even rule seemingly contradicts the odd one before it. But it only looks this way because they all depend on the context. That’s how it is. Writing good code is not about the rules; if it were, we would have automated it long ago. It’s all about the context.

Unfortunately, this means that you can’t learn to write good code only by following the rules. It’s tempting to think so, but it just doesn’t work this way. It is of course good to know idioms and best practices, but it never ends there. You have to go beyond the idioms. Beyond the rules.

As far as I’m concerned, there is only one sure way to improve your coding skills, but it’s so straightforward and unappealing that no one would write a book on it. No one would promote it at a conference. There are occasional blog posts on the topic, but they largely go unnoticed.

So here it is. The one working way.

To learn to write good code you have to write a shit-metric-ton of bad code.

And that’s it. Do wrong. Make mistakes. Learn from them. You are a programmer, not a surgeon or a race car driver. You can afford to practice on your own mistakes without killing anyone!

This is a privilege, enjoy it.
