Let’s Implement The Open Source Model! But… Which Open Source?

Written by fagnerbrack | Published 2017/10/29
Tech Story Tags: open-source | programming | linux | agile | open-source-model


Given enough mouths, all terms are dubious


It’s very common for organizations to decide to copy somebody else's “model” for software development. People watch a company’s unreasonable success shown off at conferences or in blog posts and start labeling it the “Company X model”.

Let’s implement the Spotify model!

Let’s implement the Netflix model!

Let’s implement the Facebook model!

Here’s a secret: There’s no “model” that you can simply copy and be successful. Each organization has a culture. Some aspects of that culture allow unique and successful behavioral patterns to emerge. Those patterns are strongly context sensitive: if they work for them, that doesn’t mean they will work for you.

There’s a name for when you try to implement somebody else’s pattern without any understanding of why it works: it’s called Cargo Cult.

Just because a model works for an organization, that doesn’t mean it’s gonna work for yours.

The same thing also happens for Open Source.

Many organizations depend on Open Source software to do a big part of their work, but almost none of them contribute to the projects they use. I’ve spoken with people who use Open Source and don’t even understand where the code comes from. It’s as if it were a by-product of an advanced Artificial Intelligence living somewhere in the darkest corners of the internet.

Even though some Open Source projects are created by a single for-profit organization, many of them are not. Take for example curl, Linux or jQuery. Those projects depend on a lot of people who are not paid to work. Yet the software gets built, sometimes with better quality and stability than its corporate counterparts.

How can software that is not created by an organization be successful?

It looks unreasonable, so…

Let’s implement the Open Source model!

However, if you ask three different people what “Open Source” means, you're very likely to receive three different responses.

The term is becoming greatly overloaded.

“Open Source” is becoming “Github”.

Github made the Git version control system accessible to everyone. You can create, edit and delete files and folders without having to type a single git command. It focused on adding a social and user-friendly aspect for Git Repositories, which is not a bad thing.

Github Repositories are located online under the namespace of the username or organization they belong to. A Fork is a copy of the original repository under somebody else’s namespace; it keeps a direct reference to the original repository. Collaboration is done through Issues and Pull Requests.

Before Github, Sourceforge was the most popular website to host openly accessible code. It had a small fraction of registered users (208K) compared to what Github has today (20M).

If you want to know more about the state of Github Open Source in 2017, Nadia Eghbal can explain better than me.

The term “Open Source” is essentially becoming the same as “Github”.

Before Github, Open Source was very different. Linux was a role model and people just started to create big software projects in the open.

Nobody understood Open Source.

Microsoft even tried to convince the public that Mozilla’s “Open Source” was not “American”.

A documentary called The Mozilla Story: Making the World You Want. Mozilla Chairwoman Mitchell Baker and Greylock’s John Lilly (a former Mozilla CEO) tell the story of the rise of Mozilla. The video starts at the part where they talk about the day Microsoft tried to convince the public that Open Source was not “American”.

Microsoft’s political attempt failed and Mozilla Firefox surpassed Internet Explorer as the most used browser at the time.

In 1999, Eric S. Raymond published an essay called “The Cathedral and the Bazaar”.

The essay states that a project implements the “Bazaar” model when the code is distributed, open to the public and contributed by many people. In contrast, a project implements the “Cathedral” model when development is centralized, closed to the public and only contributed by a selected few.

Most software development projects were managed as a “Cathedral”, something that still holds true today. In Eric Raymond’s observations, Linux was successful because its developers saw the need for something different. They did the opposite of what everyone else did for software development at that time.

Linux was managed as a “Bazaar”.

Linux became successful because it was managed as a “Bazaar” instead of a “Cathedral”.

What Eric Raymond probably didn’t realize was that his observations also witnessed the application of a principle that would later on, in 2001, be added to the Agile Manifesto: “Responding to change over following a plan”, which can also be translated to Continuous Integration and Continuous Delivery:

[…] Release early. Release often. And listen to your customers. […]

— The “Release Early, Release Often” chapter from “The Cathedral and The Bazaar” essay on Eric Raymond’s website.

In contrast to Github, Linux used (and still uses) mailing lists as the primary means for collaboration. The reason can be summarized in this post and I quote:

[…] kernel developers still use email because it is faster than any of the alternatives.

In fact, Github can’t host the Open Source Linux Kernel community for at least two reasons:

  1. Github uses Pull Requests for everything, including individual contributions, while the Linux Kernel uses Pull Requests (via mailing lists) to forward the changes of an entire subsystem, or to synchronize a code refactoring or similar cross-cutting change across different sub-projects.
  2. The Linux Kernel is a monotree with multiple repositories. It can’t scale using the Github model because Github doesn’t support a Pull Request submitted to multiple repositories simultaneously, with one single discussion stream shared among them all. On Github, each discussion lives inside a single Pull Request.

A diagram of the Linux Kernel software development tree of repositories, taken from the slides of Greg Kroah-Hartman’s presentation: “Patches carved into stone tablets”. In the diagram, the developers sit at the bottom of the tree, each of them with their own Kernel repository clone. Patches are sent to the driver/file maintainer upstream. The driver/file maintainer sends a Pull Request to the subsystem maintainers on a third level. The subsystem maintainer sends a Pull Request to the Linus Torvalds repository or to the “linux-next” tree.
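The escalation path described in the diagram can be sketched as a walk up a small tree. This is only an illustrative model, assuming each level has exactly one upstream; the `Node` class, the placeholder names and the `route_to_mainline` helper are hypothetical and are not part of any kernel tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One repository owner in the tree (hypothetical model)."""
    name: str
    upstream: Optional["Node"] = None  # who receives this owner's changes


def route_to_mainline(start: Node) -> List[str]:
    """Return the chain of owners a change passes through on its way up."""
    path: List[str] = []
    current: Optional[Node] = start
    while current is not None:
        path.append(current.name)
        current = current.upstream
    return path


# Levels taken from the diagram description; the names are placeholders.
linus = Node("Linus Torvalds")
subsystem = Node("subsystem maintainer", upstream=linus)
driver = Node("driver/file maintainer", upstream=subsystem)
developer = Node("developer", upstream=driver)

print(" -> ".join(route_to_mainline(developer)))
```

The point of the sketch is that no single shared repository sits in the middle: every owner only needs to know their own upstream.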

Linus Torvalds is at the top of the tree. He is what Eric Raymond calls the “Benevolent Dictator”:

[…] a benevolent-dictator organization evolves from an owner-maintainer organization as the founder attracts contributors […]

— The Cathedral & the Bazaar: Musings on Linux and Open Source by an accidental revolutionary, p. 101

The Linux Open Source model is also heavily based on meritocracy. If you code well and your patch solves an important problem or adds an important feature, somebody in the community will accept it, even if it doesn’t find its way to the mainline kernel.

Git is distributed. Github is not.

Linux uses a distributed model of development supported by Git, which is incompatible with Github.

The Open Source Initiative was formed in 1998. They are the ones who popularized the term “Open Source” and created something called The Open Source Definition. It states that:

  • Non-obfuscated source code should be openly available.
  • The license shall not require a royalty or other fee to be paid.
  • The license must allow modification and derived works by somebody else.
  • The software must not be blocked by anything else such as a non-disclosure agreement.
  • The project should not discriminate against persons, groups or fields of endeavor.
  • The software should not be restricted to be part of a particular distribution.
  • The license must not restrict other software distributed along with the Open Source software.

[…] Only software licensed under an OSI-approved Open Source license should be labeled “Open Source” software.

The Open Source Initiative website.

For more information, see the annotated version of The Open Source Definition.

According to The Open Source Initiative, for something to be called “Open Source”, it needs to comply with their definition.

Many organizations tend to use Github, create namespaces and use private repositories. However, if your code is not open and allowed to be copied and distributed, it violates The Open Source Definition. Therefore, it’s very far from “Open Source”.
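As a rough illustration of that reasoning, a few of the criteria listed earlier can be written as a checklist. This is a minimal sketch and not a legal test: the `Project` fields and the `osd_problems` helper are hypothetical names, only four of the criteria are modeled, and according to the OSI only an OSI-approved license actually qualifies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Project:
    """Hypothetical description of a project; all field names are assumptions."""
    source_openly_available: bool
    royalty_free: bool
    allows_derived_works: bool
    requires_nda: bool


def osd_problems(project: Project) -> List[str]:
    """List which of the modeled criteria the project fails."""
    problems: List[str] = []
    if not project.source_openly_available:
        problems.append("source code is not openly available")
    if not project.royalty_free:
        problems.append("license requires a royalty or fee")
    if not project.allows_derived_works:
        problems.append("license forbids modification or derived works")
    if project.requires_nda:
        problems.append("software is blocked by a non-disclosure agreement")
    return problems


# A private repository inside an organization's namespace.
private_repo = Project(
    source_openly_available=False,
    royalty_free=True,
    allows_derived_works=False,
    requires_nda=True,
)

for problem in osd_problems(private_repo):
    print(problem)
```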

Many organizations have teams that are responsible for maintaining a subset of projects, each of them with their own repository. However, an Open Source project tends to develop a single Benevolent Dictator who is charged with taking care of the most trusted version of the project.

Many organizations start using Github because of the possibility to store their repositories in a shared and centralized environment. However, an Open Source project is distributed by nature. Distribution helps to share source code knowledge among the community and to reduce the Bus Factor risk created by the existence of the Benevolent Dictator.

Many organizations use Github Pull Requests as the common way for an individual, pair or mob to contribute to their repositories. However, Git Pull Requests (not to be confused with Github Pull Requests) were built to support a distributed collaborative environment. Mailing lists are used to create a discussion stream that spans many repository owners.

In almost every organization, project managers need to report progress to business stakeholders. However, in Open Source, the developers who use the software are the stakeholders. They also have the technical expertise to understand how it works and what’s better for it.

In many organizations, software development roles are separated from Junior to Senior because the organization needs to justify different salary ranges and expectations. In Open Source, developers don’t need such distinctions. They don't intend to be paid to work. They do it out of passion. They want to make an impact, not an income.

Organizations tend to copy how Open Source works, but they don’t know if Open Source works for them.

Even though sometimes it’s not very clear what “Open Source” really means for different people, it has a clear and documented definition.

The Open Source Initiative defines Open Source by means of the license under which the software is distributed. Something should not be called “Open Source” unless it complies with their definition.

Before Github, Open Source was very different. Linux was the role model and over time it was forced to scale in order to manage its complex community. Today, they can’t use Github effectively because their model doesn’t fit.

Github gave birth to the popular notion of Open Source with a pretty UI. It lowered the barrier to entry and allowed anybody to make their code public without a lot of effort.

To implement the “Open Source model” in your organization, you need to understand what it means. This way, you’ll be more prepared to test some of its ideas and see which of them can work in your context.

The first step is to look at what’s out there and understand why "Open Source" works the way it does.

The question is: which Open Source?

Thanks for reading. If you have some feedback, reach out to me on Twitter, Facebook or Github.


Published by HackerNoon on 2017/10/29