How the Creators of Git Do Branching

by Raman Gupta, March 23rd, 2022

There are a few common branching and process models for distributed development with Git on larger teams and projects.

GitFlow, created by Vincent Driessen, is one such common branching model. GitFlow is relatively simple to understand, was excellently presented and illustrated by Vincent, and seems to properly leverage Git’s excellent branching and merging capabilities. Adam Ruka wrote a popular polemic called “GitFlow considered harmful”, and presented his own oft-used model.

I submit for your consideration another approach for larger teams and complex projects with periodic releases: the dogfood workflow used by the git.git project, the open source project that develops and maintains git itself. This workflow is not new: I contributed to the gitworkflows(7) man page in 2009, and that page was first created by Thomas Rast in 2008. But it hasn’t gained a lot of adoption outside of the git.git project itself. I think the main reasons are that it a) doesn’t have an easy-to-google and easy-to-remember name like “GitFlow”, b) creates a seriously crazy-looking history graph, and c) is a bit harder to grok, especially when one is coming from a background of tools that prefer mostly linear history, like Subversion. Let’s tackle each in turn.

To give it a name, let’s call it “gitworkflow”, in deference to its creators and the man page in which it is described. A quick disclaimer: I am not a regular git.git contributor — and this is not in any way endorsed by the git.git project (though a few nice people from the git.git mailing list did review a draft), and any mistakes, inaccuracies, or misunderstandings in this article are mine and mine alone.

. . .

On to the second point: in Adam’s post, he shows two screenshots, one showing a non-linear history with GitFlow as an example (the key point relating to the use of `--no-ff` merges to create empty merge commits):

And another showing a linear history of the same project that uses fast-forward merges as Adam recommends:

and asks:

Which history would you rather be faced with when investigating this problem?

The latter history graph is seductively easy to grok, but… under the covers a history with merge commits provides much value. It keeps related work together, while the linear history intersperses related work among several non-contiguous commits, with the only way to identify the relationships being the issue id in the commit log. The linear history is easy to view, but hard to use: in Rich Hickey’s terms, it is “easy” but not “simple” because it introduces accidental complexity.

However, the first history graph Adam showed really is hard to understand, and that needs to be addressed. Here, Git’s capabilities come to the rescue. Want to see all the merges done to a branch without the details of the individual commits? Use `--first-parent` (as pointed out by the most popular comment on Adam’s post). Here is git.git’s history of the master branch without that flag (yikes):

[Screenshot: git.git log master]

but with `--first-parent`:

[Screenshot: git.git log master with `--first-parent`]

Nice! Now compare the second screenshot to Adam’s linear history, and ask his question again. With this approach, each individual topic worked on is listed consistently, the details of each are elided, and the history is presented linearly. But at the same time, the detailed history of each topic branch is still there behind the scenes. For example, let’s look at the history of merged branch `jk/snprintf-cleanups`, using an alias called `mergedtopiclg`:

And now we see all of the commits related to that specific merge, again linearly, and without unrelated commits interspersed.
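The two views described above can be tried on a scratch repository. This is a minimal sketch, not the author's `mergedtopiclg` alias: the branch names, the empty commits, and the `master^1..master^2` range used to list one merged topic are all assumptions made for illustration.

```shell
set -eu
# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "initial commit"

# Two topics, each with two commits, merged to master with --no-ff.
for topic in topic-a topic-b; do
    git checkout -q -b "$topic" master
    git commit -q --allow-empty -m "$topic: step 1"
    git commit -q --allow-empty -m "$topic: step 2"
    git checkout -q master
    git merge -q --no-ff -m "Merge branch '$topic'" "$topic"
done

# View 1: one line per merged topic; the individual commits are elided.
mainline=$(git log --first-parent --format=%s master)
echo "$mainline"

# View 2: only the commits of the most recent merge, i.e. those reachable
# from the merged topic (second parent) but not from the mainline as it
# was before the merge (first parent).
topic_commits=$(git log --format=%s master^1..master^2)
echo "$topic_commits"
```

The same range works for any merge commit `M` on the mainline: `M^1..M^2` lists exactly the commits that the merge brought in.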

. . .

Now, the last, and most important, reason why gitworkflow isn’t widely adopted: it’s hard to wrap your mind around, or at least it was for me. Adam states:

GitFlow advocates having two eternal branches — master and develop. Why two, when one is the conventional standard?

In contrast, gitworkflow actually has four such “important” branches, though of course only one of the four is “eternal”: maint, master, next, and pu. Exactly why do I need all of these branches and what the f*k is pu? The “Managing Branches” section of the gitworkflows(7) man page, and the related sections of MaintNotes and maintain-git.txt, are all worth reading again. And again (they may make more sense after finishing this article).

Consider that, if we are maintaining releases of a project, we probably have different “versions” of it at varying levels of stability: a stable “release” (or “production”) version, a beta (or “testing”) version for ambitious users and testers which is somewhat stable, and an alpha (or “development”) version that contains all the latest goodies, but in which there is probably a bunch of broken stuff as well.

Not coincidentally, our gitworkflow branches master, next, and pu (proposed updates) correspond to these release/beta/alpha levels of stability respectively. The remaining branch maint is branched off of master at release points, and simply contains hot fixes to already released code.

Now that we understand the purpose of each “integration branch” in terms of stability, we combine this with the concept of “topic branches”. Topic branches themselves are non-controversial — both GitFlow and Adam recommend using them. Topic branches are where all the current work is being done — one branch per issue, bug, or feature, and there can be many topic branches undergoing development at once. In gitworkflow, however, topic branches are actually fundamental to the flow as opposed to just being a temporary place to develop code in isolation.

Topic branches always start from either maint or master. Branch from maint if the topic eventually needs merging to the current release represented by maint, or master if the topic is intended for a future release (not necessarily the next one). Note one caveat: if a new topic topic-b depends on code written on an earlier topic topic-a, then topic-b should still branch from maint or master, but topic-a should be merged into topic-b. This keeps topic-b logically separate from topic-a, but makes the dependency explicit.
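The branching rules above can be sketched as follows. The repository, the branch names (`fix-crash`, `topic-a`, `topic-b`), and the commit messages are hypothetical, and the direct commit on master is only a stand-in for topics that have graduated since the release.

```shell
set -eu
# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "release 1.0"

# maint forks from master at the release point.
git branch maint master
# Stand-in for work that reached master after the release.
git commit -q --allow-empty -m "work toward the next release"

# A fix needed in the current release starts from maint...
git checkout -q -b fix-crash maint
git commit -q --allow-empty -m "fix-crash: handle empty input"

# ...while a feature for a future release starts from master.
git checkout -q -b topic-a master
git commit -q --allow-empty -m "topic-a: add feature"

# topic-b depends on topic-a: it still forks from master, and the
# dependency is recorded by merging topic-a into it.
git checkout -q -b topic-b master
git commit -q --allow-empty -m "topic-b: scaffolding"
git merge -q -m "Merge branch 'topic-a' into topic-b" topic-a
git commit -q --allow-empty -m "topic-b: build on topic-a"

# Where each topic forked from, and what topic-b depends on.
git merge-base fix-crash master
git merge-base topic-b master
git log --merges --format=%s master..topic-b
```

Because `fix-crash` forks from maint, it can later be merged to maint, master, next, or pu without dragging unrelated post-release work into maint.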

Since topics branch from maint or master, they can also be merged into next and pu without any hassle at all, since both of those branches share a common ancestor with the topic in the form of a commit on master (or maint, which master always includes). Most branch strategies wrongly assume that one can only merge into the same branch one forked from, but Git is smarter and more flexible than that.

Topic branches start out in an unstable state, and probably contain errors and unfinished work. However, this shouldn’t stop us from integrating our work with other work in progress to see what happens, both at merge time and at run time. This process is represented by our alpha-state code in the pu, or proposed updates, branch. Periodically pu is reset to the current master, and then all the topic branches are merged into it to identify merge issues, run tests, and to produce alpha-state build artifacts. Note the order of the merges does not really matter, but can be defined on a project or case-by-case basis. Performing the merges in a consistent order will best allow useful tools like `git rerere` to work correctly, and can be semi-automated via scripts.
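A rebuild of pu along these lines might be scripted as below. This is only a sketch under stated assumptions: the `rebuild_pu` function, the `topic-*` naming convention used to find topics, and the alphabetical merge order are my own choices for illustration, not the tooling git.git uses.

```shell
set -eu
# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "initial commit"

# Three in-progress topics, each forked from master.
for topic in topic-a topic-b topic-c; do
    git checkout -q -b "$topic" master
    git commit -q --allow-empty -m "$topic: work in progress"
done

rebuild_pu() {
    # -B creates pu, or resets it to master if it already exists.
    git checkout -q -B pu master
    # for-each-ref sorts by refname, so the merge order is the same on
    # every rebuild, which helps git rerere (if enabled) replay recorded
    # conflict resolutions.
    for topic in $(git for-each-ref --format='%(refname:short)' 'refs/heads/topic-*'); do
        git merge -q --no-ff -m "Merge branch '$topic' into pu" "$topic"
    done
}

rebuild_pu

# A topic moves on; pu is simply thrown away and rebuilt.
git checkout -q topic-b
git commit -q --allow-empty -m "topic-b: more work"
rebuild_pu

git log --first-parent --format=%s master..pu
```

A real project would run its test suite and produce alpha-state build artifacts from pu after each rebuild, and would stop for manual conflict resolution when a merge fails.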

Merging to pu is extremely useful for initial feedback on development work from product owners, functional testers, and keen users. GitFlow does not offer any inherent way to do this — CI has become shorthand for running tests on individual topic branches automatically, but at its core, it’s the capability to merge and test all current work in progress that is true CI and that gitworkflow enables.

Finally, note another useful attribute of gitworkflow: pu is well understood to be a throw-away branch that may be rebased at any time, and therefore no commits should generally be based on it. Therefore, despite past merges to pu, topics can still be interactively rebased as much as is necessary to produce a great, easily reviewable, and understandable series of commits, without worrying about other people having to “recover from upstream rebase” (an easy process but not one that should often be needed).
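To illustrate why rewriting a topic is harmless once pu is treated as disposable, here is a small sketch. As an assumption made for brevity, a `git commit --amend` stands in for a full interactive rebase, and all branch names and messages are hypothetical.

```shell
set -eu
# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "initial commit"

git checkout -q -b topic-a master
git commit -q --allow-empty -m "topic-a: frist draft"

# The topic is merged to pu for early feedback.
git checkout -q -B pu master
git merge -q --no-ff -m "Merge branch 'topic-a' into pu" topic-a
old_tip=$(git rev-parse topic-a)

# The topic is rewritten (here: the commit message typo is fixed).
git checkout -q topic-a
git commit -q --amend --allow-empty -m "topic-a: first draft"

# pu is rebuilt from master, so the old version of the topic disappears
# from it and nobody has to recover from the rewrite.
git checkout -q -B pu master
git merge -q --no-ff -m "Merge branch 'topic-a' into pu" topic-a

git log --format=%s master..pu
```

Nothing was ever based on pu, so the only branch that has to cope with the rewritten history is the topic itself.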

After a topic branch has undergone several cycles of refinement, code review, and testing, it may reach the point (in someone’s judgement) that it is good enough to be released to a beta (or user acceptance test) environment. At this point, in gitworkflow, the topic “graduates” to the next branch. Graduation simply consists of merging the topic to next with `--no-ff`.
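Graduation to next might look like the following sketch, with a hypothetical `topic-a` in a scratch repository. Since next starts at the same commit as master here, a plain merge would fast-forward; `--no-ff` forces a merge commit so the graduation stays visible in history.

```shell
set -eu
# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "initial commit"
git branch next master

git checkout -q -b topic-a master
git commit -q --allow-empty -m "topic-a: step 1"
git commit -q --allow-empty -m "topic-a: step 2"

# Graduate the topic to next.
git checkout -q next
git merge -q --no-ff -m "Merge branch 'topic-a' into next" topic-a

# The merge commit has two parents: the previous next and the topic tip.
git rev-list --parents -n 1 next
# master is untouched: nothing has been committed to the release yet.
git log --format=%s master
```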

Now that the topic has graduated to next, it can be part of a beta, or acceptance release. So every topic on next can now undergo a second round of stabilization, which is exactly the purpose of a beta release / acceptance testing environment. However, note that with gitworkflow, we still have not committed (no pun intended!) to having this topic as part of our next release to production — it still has not been merged to master. This is similar in concept to GitFlow’s release branch, but far more flexible and powerful, since master has no dependencies on next whatsoever, nor is next ever merged wholesale into master (unlike the corresponding GitFlow branches develop and release).

What if topic branches continue to evolve with new commits after merging and testing on next? The branch is simply merged to next again, as often as necessary. Optionally, these further changes can also undergo an extra stabilization period by merging them to pu first. Since next is an integration branch with a finite lifetime, repeated merges of topics to it do complicate its history, but only until the next rebuild of next (which may happen at the next release, but gitworkflow is flexible here). Irrelevant and distracting history here, and on pu, is eventually dropped and forgotten.

And lastly, once a topic is judged stable enough to release, the topic graduates again and is merged to master (or perhaps maint), again with `--no-ff` to preserve the complete history of the topic branch.
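The whole path of one topic, from next to master, can be sketched as below. The branch name `topic-a` and the commit messages are hypothetical. The point to notice is that the topic itself is merged to master, never next, so the repeated merges on next do not leak into the release history.

```shell
set -eu
# Scratch repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "initial commit"
git branch next master

git checkout -q -b topic-a master
git commit -q --allow-empty -m "topic-a: first version"

# First graduation: the topic goes to next for beta testing.
git checkout -q next
git merge -q --no-ff -m "Merge branch 'topic-a' into next" topic-a

# Beta testing finds a problem: fix it on the topic, merge to next again.
git checkout -q topic-a
git commit -q --allow-empty -m "topic-a: fix found in beta"
git checkout -q next
git merge -q --no-ff -m "Merge branch 'topic-a' into next" topic-a

# Second graduation: the topic (not next) is merged to master.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'topic-a'" topic-a

# master sees a single merge that carries the whole topic.
git log --first-parent --format=%s master
git log --format=%s master^1..master^2
```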

A couple of topic branches merged to next, tested there as beta-releases, and then finally graduating to master for release could look something like this:

Source: http://jsfiddle.net/jtooun5q/5/

Note that in gitworkflow, unstable and stable development work is never mixed together on the same branch. In contrast, with GitFlow I have two choices: 1) I can test my topic in isolation on its own branch, or 2) I can merge it to develop for testing. Neither choice is appealing. The former doesn’t offer a true test of a topic’s stability when deployed together with other ongoing work, and the latter commits the topic to develop perhaps before it is stable.

For the same reason, GitFlow has no process-assisted way to identify in advance which topics may cause conflicts with other work in progress, so that the developers in question can coordinate efforts before that big final merge of their topic to develop. In short, in GitFlow there is always an unsolvable tension between the desire to keep development work clean and isolated on a topic branch, and integrating topic branches with other work by merging them to develop to make them visible and testable and to check for conflicts. Gitworkflow allows both goals to be achieved without sacrificing one for the other.

. . .

There are additional capabilities and more to learn about gitworkflow than I have covered here. Perhaps some follow-up posts by me or others can cover this ground. In addition, an open source toolset around gitworkflow to help with things like rebuilding pu and next, and tracking the graduation status of topics, would be useful. git.git has its “cooking” tools for this purpose, but tooling that is less git.git specific would be nice. As a start, I have created a few useful aliases [1]. UPDATE Apr 8/2018: I’ve also created a repository on GitHub with additional documentation, including a task-oriented primer.

But in short, gitworkflow is an excellent Git branching model / workflow for many use cases, and deserves to be more popular than it is. It does require a strong understanding of advanced Git features and concepts, but the effort is well worth it.

[1] See especially the aliases `lgp`, `topics`, `topiclg`, `mergedtopiclg`, `branchnote`, and `where`.

. . .

Raman Gupta is the VP Engineering at reDock Inc., and loves teaching the wonders of Git. BTW, reDock is hiring! Follow us on Twitter @reDockAI.