log(x) vs ln(x) — The curse of scientific computing

When mathematics and programing go apart

Source: https://www.youtube.com/watch?v=884dnOYOb68

In the world that routinely shifts towards complete dependence on scientific computing, even a trivial mistake can have fatal consequences (check the yesterday’s headlines about Cloudbleed). Although I cannot give you precise statistics, quite likely >90% of errors are caused by us, humans. The issue, I want to write about, is far more frightening than a spontaneous or overlooked typo.

Scientific computing and high impact factor journals are plagued with misuse and confusion between log(x) with ln(x)!

Why should anybody bother about some obscure logarithmic operations?

The established, mathematical definition:

log(10) = 2.302585093 (…)

The widespread definition used in physics, chemistry and derivative life-sciences:

log(10) = 1

ln(10) = 2.302585093 (…)

Take these numbers, multiply them by some ridiculously large integer (let’s say 10000000000000) and compare. The numerical difference between these two could easily translate into an incorrect trajectory for Space-X Falcon 9 rocket, abnormal termination of Instrument Landing System for any commercial airliner attempting CAT-3 autolanding, Tesla Model-X mistakenly speeding up and performing obstacle avoidance maneuver leading to accident, and countless other tasks, which rely on numerical optimization and intricate workings of embedded logarithmic operations.

The computation of logarithm is at the very heart of potentially every in silico problem that concerns computational physics, our dearly beloved machine learning, and all the possible permutations of numerical optimization.

The observed misuse of log(x) and ln(x) is simply INSTITUTIONALIZED and AMPLIFIED by code developers and scientific writers.

Just to be clear here! I am not willing to play a “blame game”. The problem exists and it becomes perpetuated by new generations of scientific programmers. On my side:

I have routinely observed in my students (still at University of Groningen), snippets of Python code with confused use of base-10 and natural logarithms.
I observe the same problem in the code by my scientific collaborators.
A quick review of research papers (from high impact peer reviewed journals, including PNAS) on protein electrostatics from past 20 years, clearly demonstrates issues with proper use of ln(x). This coincides perfectly with the emergence of scientific computing, and a dominant place log(x) has as a representation of natural logarithm in all standard math libraries.
I have made countless log(x) vs ln(x) mistakes myself.

Looking for discrepancies in protein electrostatics calculation code

How is the problem perpetuated?

Simply by symbolical use of log(x), which performs ln(x) according to its mathematical definition. That’s how the “obvious” thing becomes “not so obvious” mistake, which in turn may have quite bizarre numerical consequences. I have attached below the snippets of documentation from different programing languages, which are widely used in scientific computing.

NumPy v1.12 documentation:

numpy.log(x[, out]) = <ufunc ‘log’>

Natural logarithm, element-wise.

Matlab documentation:

Syntax

Y = log(X)

Description

example

[Y](https://it.mathworks.com/help/matlab/ref/log.html?requestedDomain=www.mathworks.com#outputarg_Y) = log([X](https://it.mathworks.com/help/matlab/ref/log.html?requestedDomain=www.mathworks.com#inputarg_X)) returns the natural logarithm ln(x) of each element in array X.

Mathematica documentation:

Log[z] gives the natural logarithm of z (logarithm to base)

C documentation (Tutorials point):

Description

The C library function double log(double x) returns the natural logarithm (base-e logarithm) of x.

The ISO80000–2 way of dealing with it:

ln(x) for natural logarithm

lg(x) for base-10 logarithm

What can we do about the log(x) vs ln(x) confusion?

Enforce thorough reading of documentation on our students, collaborators and peers (wishful thinking).
Simply introduce ln(x) and log(x) so mathematics,programing and life sciences development go in concert together!
Follow the guidelines disclosed in ISO8000–2 standard http://www4.ncsu.edu/~jwilson/files/mathsigns.pdf

Disclaimer:

This post has been edited to address the comments and correct the issues pointed out in https://news.ycombinator.com/item?id=13731301#13732928

Useful post? If so, make my day by clicking the heart below!

About the Author

Kamil Tamiola is the founder and the architect of Peptone — The Protein Intelligence Company. You can connect with him on AngelList, LinkedIn, Twitter, and Researchgate.