The Meaning of Functions in Julia

Date: 2020-08-14

Tom Kwong (@tk3369), Software Engineer, Architect, Author

When I first learned about the Julia programming language, a few things gave me "wat" moments. One of those surprises involves both the naming and the meaning of functions.

Interestingly, my naive question triggered over 200 follow-up posts in the Julia Discourse forum. 200! That's one of my best records for motivating fellow developers! 😄

What is the issue?

Let's first take a look at a very simple example.

Suppose that I have a CalendarApp module that contains the following code:

struct Meeting
    subject::String
    start_time::DateTime
    end_time::DateTime
end

Then, I want to create a function that calculates the length of a meeting. Super simple, right? Let's go for it:

length(m::Meeting) = Hour(m.end_time - m.start_time)

When I code, I like a REPL-based development workflow so I can test new code quickly:

julia> covid_meeting = Meeting("COVID Response Committee", DateTime(2020, 6, 14, 8, 0, 0), DateTime(2020, 6, 14, 10, 0, 0))
Meeting("COVID Response Committee", 2020-06-14T08:00:00, 2020-06-14T10:00:00)

julia> println(length(covid_meeting))
2 hours

So far so good! Now, try to use the length function to determine the length of an array.

julia> length([1, 2, 3])
ERROR: MethodError: no method matching length(::Array{Int64,1})
You may have intended to import Base.length
Closest candidates are:
  length(::Meeting) at REPL[3]:1

Wat! That's right. Here we get the exact "wat" moment. What happened to the regular length function?

😵 There are two length functions!

The answer is quite simple. There are actually two length functions around. One of them is defined in the Base module, which everyone is familiar with, and the other is the one I just defined above.

Here's my own length function:

julia> length
length (generic function with 1 method)

Now, restart the REPL to clear things up and try again:

julia> length([1, 2, 3])
3

julia> length
length (generic function with 81 methods)

Now, I am able to access the original length function again. You may also notice that this length function is attached to 81 methods.

So, how did that happen? It seems that I might have hidden the original length function by defining my own length function earlier. Out of curiosity, I can try to define my own function again:

julia> struct Meeting
           subject::String
           start_time::DateTime
           end_time::DateTime
       end

julia> length(m::Meeting) = Hour(m.end_time - m.start_time)
ERROR: error in method definition: function Base.length must be explicitly imported to be extended

Man, now it's doing the exact opposite! It doesn't even let me define the length function anymore!

This is the second "wat" moment for the same problem.

🤔 Did I do anything wrong?

It might be worth a quick discussion here about why I did what I did, and why I thought I was right.

First of all, I came from an object-oriented programming background. To be more precise, I had many years of experience developing in the Java language.

How would the same problem look in OOP? Well, in the object-oriented world, there is probably some kind of Array class that defines a length method. In this case, I would also define a Meeting class with a length method. For instance:

my_array.length();   // invokes the length method defined in Array class
my_meeting.length(); // invokes the length method defined in Meeting class

When I call the method, there is no ambiguity. These are just two different methods from two different classes.

But wait... Didn't I just do the same thing in Julia? If I look at the signature of my length function, it accepts an argument of the Meeting data type. So, why couldn't Julia just call my function when I pass a Meeting object, and call the regular length function when I pass an array?

Here is my primary misconception.

Multiple dispatch only works for a single function. What I have done above actually introduced a second length function, and that function is attached to a single method.
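A minimal sketch of that situation, assuming the struct and the function live in a CalendarApp module as described earlier. The checks at the bottom are only illustrative, and the method count of Base.length is left out because it varies between Julia versions.

```julia
module CalendarApp

using Dates

struct Meeting
    subject::String
    start_time::DateTime
    end_time::DateTime
end

# No `import` and no `Base.` prefix, so this creates a new function
# owned by CalendarApp instead of adding a method to Base.length.
length(m::Meeting) = Hour(m.end_time - m.start_time)

end # module

# The two functions are distinct objects that merely share a name.
CalendarApp.length === Base.length     # false
parentmodule(CalendarApp.length)       # CalendarApp
length(methods(CalendarApp.length))    # 1
```

Dispatch never crosses from one of these functions to the other, which is why the array call failed.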

More precisely, the two length functions are defined in their own modules. Let me prefix them with their respective namespaces and the number of methods:

Base.length          # 81 methods
CalendarApp.length   # 1 method

🐛 Here's the easy fix...

As I want multiple dispatch to kick in, I just need to make sure that I define a new method for the Base.length function rather than defining my own function. This is also called extending the function. There are two ways to achieve that.

Option #1 (preferred): prefix the function name with the module name.

Base.length(m::Meeting) = Hour(m.end_time - m.start_time)

Option #2: import the length function before defining it.

import Base: length
length(m::Meeting) = Hour(m.end_time - m.start_time)

Now, let's start a new REPL and try again:

julia> struct Meeting
           subject::String
           start_time::DateTime
           end_time::DateTime
       end

julia> Base.length(m::Meeting) = Hour(m.end_time - m.start_time)

julia> length
length (generic function with 82 methods)

Alright, the length function now has 82 methods attached.

Let's confirm its functionality.

julia> covid_meeting = Meeting("COVID Response Committee", DateTime(2020, 6, 14, 8, 0, 0), DateTime(2020, 6, 14, 10, 0, 0))
Meeting("COVID Response Committee", 2020-06-14T08:00:00, 2020-06-14T10:00:00)

julia> length(covid_meeting)
2 hours

julia> length([1, 2, 3])
3

Voila! Problem solved!
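The REPL session above used Option #1. For completeness, here is a hedged sketch of Option #2 written as a module file, assuming the same CalendarApp layout as before. The "Sync" meeting is made-up sample data.

```julia
using Dates

module CalendarApp

using Dates
import Base: length   # Option #2: import first, then extend

struct Meeting
    subject::String
    start_time::DateTime
    end_time::DateTime
end

# Adds a method to Base.length; no new function is created.
length(m::Meeting) = Hour(m.end_time - m.start_time)

end # module

m = CalendarApp.Meeting("Sync", DateTime(2020, 6, 14, 8), DateTime(2020, 6, 14, 10))

CalendarApp.length === Base.length   # true: one function, many methods
length(m)                            # 2 hours
length([1, 2, 3])                    # 3
```

Either option ends with the same result: a single length function that dispatches on the argument type.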

📌 Wait, why do I have to do that?

The solution is simple once I understand how multiple dispatch works in Julia.

So, how did I trigger 200+ follow-up posts in Discourse?

The main controversy is why I have to be explicit about extending Base.length. Since Base.length has a name of length, and CalendarApp.length has a name of length, why wouldn't Julia just automatically merge them?

The whole discussion thread in Discourse is about how it could be more convenient and less confusing for new Julia users if the functions were merged automatically. I will now argue (against my original opinion in the Discourse thread) that it is a bad idea to do so.

Here is the main reason: just because two functions have the same name doesn't imply that they mean the same thing.

Every function is designed to have a specific meaning. In English, the meaning of the length function is pretty much aligned with what one commonly understands a length to be.

To be clear, I will just show the first definition from Dictionary.com:

Length (Noun): the longest extent of anything as measured from end to end.

So, the length concept refers to a measurement. As with any kind of measurement, it means that I should expect it to return a numerical value.

Hence, when anyone calls the length function, a number is expected to be returned.

This is literally an implicit contract.

Enforcing the same meaning for all length methods turns out to be a very useful thing. Right off the bat, I can display a graphical user interface that shows a bar that represents a measurement. The same component works regardless of whether the object is an array, a String, or a Meeting.

This is also the main reason why Julia packages interoperate so well with each other!

As long as there are consistent names and meanings, we can build very powerful abstractions and interfaces. Then, everything just works with each other in harmony.

You don't buy it yet? Just take a look at the various types of Julia array implementations. These arrays can be used anywhere a regular array is accepted.
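The value of the implicit contract can be sketched with a text-only stand-in for the GUI bar. Everything here is an assumption made for illustration: Agenda is a made-up type, and text_bar is a hypothetical component that relies only on length returning an integer.

```julia
# Hypothetical container type used only for this illustration.
struct Agenda
    items::Vector{String}
end

# Honor the usual contract: length returns a count (an Int).
Base.length(a::Agenda) = length(a.items)

# Generic "component": draws one mark per unit of length.
# It knows nothing about Agenda, arrays, or strings specifically.
text_bar(x) = repeat("#", length(x))

text_bar([10, 20, 30])                   # "###"
text_bar("hello")                        # "#####"
text_bar(Agenda(["intro", "budget"]))    # "##"
```

The component was written once and works for a type it has never seen, purely because that type's length method keeps the expected meaning.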

😈 Playing devil's advocate...

Now, what happens if I ignore the implicit contract and define the length of a meeting to be a string? For instance:

function Base.length(m::Meeting)
    if m.end_time - m.start_time > Hour(1)
        return "Long"
    else
        return "Short"
    end
end

Well, it's probably fine because Meeting is my own data type.

However, it also means that I should not let anyone else use Meeting. Why? Because another developer will probably be very confused when my length function returns a string rather than a number, and that could cause serious problems.

Remember the GUI component I talked about earlier? It's going to be so broken.

Not keeping a consistent meaning (implicit contract) for a function is a recipe for failure. It severely limits the reusability of functions.
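A small sketch of how that failure might surface in practice. The helper total_length and the two sample meetings are invented for this example; the length method is a condensed form of the string-returning definition above.

```julia
using Dates

struct Meeting
    subject::String
    start_time::DateTime
    end_time::DateTime
end

# The contract-breaking definition: length returns a String.
Base.length(m::Meeting) =
    m.end_time - m.start_time > Hour(1) ? "Long" : "Short"

meetings = [
    Meeting("Standup", DateTime(2020, 6, 14, 8), DateTime(2020, 6, 14, 9)),
    Meeting("Planning", DateTime(2020, 6, 14, 10), DateTime(2020, 6, 14, 12)),
]

# Generic code written against the usual meaning of length.
total_length(items) = sum(length, items)

total_length([[1, 2], [3, 4, 5]])   # 5, as expected for arrays

broke = try
    total_length(meetings)          # attempts "Short" + "Long"
    false
catch err
    err isa MethodError             # strings have no + method
end
```

Here broke ends up true: the generic helper is fine, and the caller's data is fine, yet the combination fails because one method changed what length means.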

🤓 What if I really want to use the same function name for a different purpose?

If I insist that my length function should return a string, then I really have two options.

First, I can define my own function and not extend Base.length. Second, I could choose a different name for the function.

In the first scenario, I would be able to access both length functions. The caveat is that I will have to use Base.length and CalendarApp.length instead of the short form.

This is needed to remove the ambiguity about which function I'm referring to.
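As a rough sketch of the first option, assuming the same CalendarApp module as before: describe is a hypothetical helper added only to show both functions being called side by side.

```julia
using Dates

module CalendarApp

using Dates

struct Meeting
    subject::String
    start_time::DateTime
    end_time::DateTime
end

# A separate function that deliberately does NOT extend Base.length.
length(m::Meeting) =
    m.end_time - m.start_time > Hour(1) ? "Long" : "Short"

# Inside this module the short name now means CalendarApp.length,
# so the Base version has to be spelled out in full.
describe(m::Meeting) =
    string(length(m), " meeting, subject has ", Base.length(m.subject), " characters")

end # module

m = CalendarApp.Meeting("Sync", DateTime(2020, 6, 14, 8), DateTime(2020, 6, 14, 10))

CalendarApp.length(m)     # "Long"
Base.length([1, 2, 3])    # 3
CalendarApp.describe(m)   # "Long meeting, subject has 4 characters"
```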

The best practice, however, is to avoid giving a function a name that is already used in Base. Why?

All of the exported Base functions are automatically imported into every module with the exception of bare modules. So, you will have a conflict just like how it was described at the beginning of this post. If you develop packages, then you don't want your users to be confused about your function versus the one in Base.

Because the Base module is the standard library that everyone uses, it's probably not a good idea to define a function with the same name but a different meaning.

🛰️ What if the dependent module isn't Base?

Now, suppose that I am using a different module rather than Base. As an example, I'm going to pick on one of my favorite packages Distributions.jl.

A typical Julia user would do the following:

using Distributions

I do that, too, when I need to use it interactively. However, if I need to use it in my app, then I would want to import only the functions that I need into my namespace. For example, if I want to calculate the mean and mode of some randomly generated data, I would do this:

using Distributions: mean, mode
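The same pattern can be tried without installing anything. The following sketch uses the Statistics standard library in place of Distributions.jl; the module name StatsDemo and the helpers spread and summarize are invented for illustration.

```julia
module StatsDemo

using Statistics: mean   # only `mean` enters this namespace

# A local helper. No other Statistics name was brought in,
# so nothing from that package can collide with it.
spread(xs) = maximum(xs) - minimum(xs)

summarize(xs) = (mean = mean(xs), spread = spread(xs))

end # module

StatsDemo.summarize([1, 2, 3, 6])   # (mean = 3.0, spread = 5)
```

Names such as median, which Statistics also exports, are simply not visible inside StatsDemo.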

This is actually quite important!

First, by bringing only known functions into my namespace, it reduces the chance of function name collision. Just take a look at the huge number of exported names by Distributions.jl.

Second, I'm making my code future-proof. Let's say I have already defined a function named dist in my module. My code will still work even if Distributions.jl happens to define and export its own dist in a future version. So, I don't need to worry about naming conflicts because I have only imported mean and mode into my namespace.

Final thoughts...

Naming things properly is super important. Besides choosing the right word, it is also important to mean what you mean.

Over the years, I have developed a habit that ensures I write code that means what I mean. And, it's actually super simple.

Just write documentation.

In Julia, I would write a doc string for every function at the same time that I code that function. Sometimes I change the function name to match my doc string. At other times, I change the doc string to match my function name.
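As one possible illustration, here is a doc string for the meeting example from this post. The function name duration is my own assumption for the sketch, chosen because writing the doc string makes it obvious that the function describes a span of time rather than a count.

```julia
using Dates

struct Meeting
    subject::String
    start_time::DateTime
    end_time::DateTime
end

"""
    duration(m::Meeting)

Return the time between the start and the end of meeting `m` as an `Hour`
period. This sketch assumes the meeting spans a whole number of hours.
"""
duration(m::Meeting) = Hour(m.end_time - m.start_time)

m = Meeting("Sync", DateTime(2020, 6, 14, 8), DateTime(2020, 6, 14, 10))
duration(m)   # 2 hours
```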

It is quite amazing how effective this can be. I encourage you to give that a try today!

Thank you for reading.

P.S. For more tips on writing good code in Julia, consider picking up my book Hands-on Design Patterns and Best Practices with Julia.

Lead image by Romain Vignes on Unsplash



Previously published on: https://ahsmart.com/pub/the-meaning-of-functions/

