A few weeks back, I was invited to talk about concurrency at a local university. This write-up summarizes what I presented there.
We all learn about parallelism and concurrency at university or in other ways. Anyone who is learning how to program will inevitably read about the fundamental concepts.
Even though concurrency is mainstream, we still find it difficult to deal with. Why?
I started digging into this issue because of my past experiences, and I'm obviously not the only one who has struggled with it. There is plenty of literature on the internet about how this can be solved. Here's a summarized list that I think is helpful to keep in mind:
Rule #1: Global Mutable State is Bad! (With or Without Concurrency)
Note: anything is mutable if its state changes over time, and immutable otherwise.
E.g. global static variables, static classes with public instance variables, singleton objects, environment variables, configuration objects, and the amount of data/state shared within a clustered application.
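To make the mutable/immutable distinction concrete, here is a minimal sketch in Java; the Point classes are made-up examples, not something from any particular framework:

```java
// Mutable: state can change after construction, so any code holding a
// reference can observe (or cause) a change at any time.
final class MutablePoint {
    int x, y;
    void moveTo(int x, int y) { this.x = x; this.y = y; }
}

// Immutable: state is fixed at construction; "changing" it means
// creating a new value, so existing references are never surprised.
final class ImmutablePoint {
    final int x, y;
    ImmutablePoint(int x, int y) { this.x = x; this.y = y; }
    ImmutablePoint moveTo(int x, int y) { return new ImmutablePoint(x, y); }
}
```

Immutable values are safe to share between threads precisely because no one can change them out from under you.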
As the application grows in size and complexity (think multiple moving parts within the program, running in clustered environments), having lots of global mutable state makes it impossible to predict the program's next state at a given time, especially when you are troubleshooting. To make things even worse, you have no control over when that state changes.
Now I'm not going to be an idealist and say we should have zero global state, because that isn't practical. But it's reasonable to say we need to keep it to a minimum.
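The unpredictability described above is easy to demonstrate. In this sketch (class names are invented for illustration), two threads bump a plain global counter and a thread-safe one; the plain counter's read-modify-write is not atomic, so updates can be lost:

```java
import java.util.concurrent.atomic.AtomicLong;

class RequestCounter {
    // Global mutable state: any thread can change this, and increments can race.
    static long unsafeCount = 0;

    // One mitigation: confine the mutation behind a thread-safe type.
    static final AtomicLong safeCount = new AtomicLong();

    static void recordUnsafe() { unsafeCount++; }            // not atomic
    static void recordSafe()   { safeCount.incrementAndGet(); }
}

class GlobalStateDemo {
    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                RequestCounter.recordUnsafe();
                RequestCounter.recordSafe();
            }
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join();  t2.join();
        // safeCount is always 200000; unsafeCount usually falls short.
        System.out.println("unsafe=" + RequestCounter.unsafeCount
                + " safe=" + RequestCounter.safeCount.get());
    }
}
```

The point is not the fix itself but the symptom: with shared mutable state, you cannot predict the value of `unsafeCount` at any given moment.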
Rule #2: Use an Application Framework — Adhere to Its Programming Model.
Most of the code we write (~90%?) is about moving data around: [Get Data → Do Something → Show in the GUI or put into storage]. This means we don't deal with, and in fact don't have to deal with, concurrency directly, unless we are writing special-purpose code such as a framework or server runtime of our own.
Every programming language and its surrounding ecosystem provide higher-level concurrency constructs, so most of us never need to create threads manually (or use synchronized blocks, locks, etc.). Most of these paradigms provide simple, component-based programming models (e.g. Beans, Servlets, and Controllers in J2EE/Spring), allowing application programmers to focus on business logic rather than boilerplate code.
Therefore, every time we feel like introducing a thread or a synchronized block, it's always a good idea to take a step back and rationalize the need.
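As one example of leaning on a higher-level construct instead of hand-rolled threads, the sketch below uses the standard `ExecutorService` from `java.util.concurrent` to fan work out to a pool and collect the results (the `squares` method and its numbers are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorDemo {
    // Submit each computation to a pool; no manual Thread or synchronized needed.
    static List<Integer> squares(List<Integer> inputs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Integer n : inputs) {
                futures.add(pool.submit(() -> n * n));
            }
            List<Integer> out = new ArrayList<>();
            for (Future<Integer> f : futures) {
                out.add(f.get());   // blocks until that task finishes
            }
            return out;             // results come back in submission order
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(squares(List.of(1, 2, 3, 4)));
    }
}
```

The executor owns thread creation, scheduling, and teardown; our code only states what to compute, which is exactly the division of labor Rule #2 argues for.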
Rule #3: Best Practices
Most of these are obvious and apply to software engineering in a broader sense. I don't mean to provide an exhaustive list, so please comment if I missed anything that should go here!
Data access patterns (e.g. compute-heavy vs. IO-heavy)
Deployment (e.g. VM environments, containers, industrial computers)
Hardware/resource constraints