Developers around the world commit changes every day. Does that mean this small part of every development process might have a big impact? For a reasonable effort, can we get a cleaner history, automatic versioning of artifacts, and simpler code reviews of complex features?
In this article, I will reveal the principles behind The Ultimate Commit to help you improve your development process and drastically reduce the complexity of collaborative development.
There are different version control systems, but I’d like to emphasize that these thoughts are based on my experience with git-based systems and have not been validated against other VCSs.
Rationally speaking, making something “better” is only worth the effort if it objectively benefits us. If we improve our approach to commits, we must be able to use the results; otherwise, it doesn’t make sense. What benefits might we achieve?
Program comprehension takes
We could replace the enormous effort spent on code review and code comprehension overall with a relatively small effort on every change. As a result, engineers are not worn down by this routine activity and stay motivated.
Some well-known research papers say that bug detection density drops dramatically once a review exceeds 200-400 LOC. At the same time, keeping features within such a hard limit is impossible. There should be something more atomic. Reviewing files in random order with no limited scope, and with refactorings and feature code mixed together, might make your code review useless.
A more structured approach makes the reviewer more attentive, which helps to find more problems early and saves the team’s time in total.
Code must be self-documented with clear variable names that explain what is inside and accurate method names that do not force you to look inside because it is obvious from the method name.
It still makes sense to write comments for non-obvious fixes that should not be accidentally reverted, for example, an explicitly overridden transitive dependency version.
But sometimes we do not understand why some code was written in a particular way rather than an alternative one, so we open the history and see… nothing.
Only a “code review fix” or a squashed “XX-1234 My amazing feature” ticket name in the summary. Such a history doesn’t really help developers maintain existing code, and this can be improved.
The Ultimate Commit should give you more context for every particular change, simplify long-term work, and help you return later and introduce changes with minimal side effects from details you have forgotten.
The meaning of every change is important; this is a consistent log of code mutations that should help you to automate versioning. Because a set of commits is a set of changes, we might leverage it.
I don’t call it “Release Notes” but rather “Change Notes” because developers help themselves rather than end users when they write commit messages. Developers are the audience who will benefit most from reading notes derived from commits.
We clarified what might be improved with “the Ultimate Commit”, so let’s try to spell out its properties.
Keeping complex structures in mind during code review and history analysis is challenging. Small limited changes are much simpler to understand. Just imagine that you review refactoring results and business code separately.
Eventually, it is the same code, but review pieces are logically separated.
You can quickly look through refactorings or 100+ files of style changes and review the business changes deeply. This lets you, as a reviewer, focus on things that matter and keep bug detection density high.
This is obvious nowadays, but I must mention that the readability of commit messages directly impacts history analysis and program comprehension activities.
If you save one minute today by not formulating a commit message properly, it will haunt you for the entire project lifetime or until you leave the company.
I have often seen commits with a summary like “fix code review comments”, sometimes even repeated many times. These commits have meaningless summaries because they do not explain why (or even what) changes were introduced. You might object that the changes were introduced because the code reviewer added a comment. This is one of the most popular mistakes I’ve seen.
Code review is a tool for feedback, not a cause. The key task it solves is directing attention to particular issues early, so they can be fixed before the root cause is merged.
Expressive commit messages help you to formulate and check that you properly understand the motivation behind the review comment. Is this just a style issue or a performance bug?
There is no need to duplicate information. A summary of the Ultimate Commit should give you more information than you read from the code inside. For example, “add if statement” repeats the content, but “handle a corner case” explains why these changes are needed.
There might be a few types of logical changes, so if we define the taxonomy, we could build automation for semantic versioning, for example:
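As a rough illustration of such automation, here is a minimal Python sketch. The change types (`fix`, `feature`, `refactoring`, `docs`), the mapping table, and the function names are all assumptions made for this example, not part of any particular tool. The idea is only that each logical change type carries a version impact and the largest impact in a set of commits decides the next version.

```python
from enum import IntEnum


class Bump(IntEnum):
    """Version impact of a change, ordered from weakest to strongest."""
    NONE = 0
    PATCH = 1
    MINOR = 2
    MAJOR = 3


# Hypothetical taxonomy: which change type causes which bump.
# Types that are not listed (refactoring, docs, ...) do not change the version.
BUMP_BY_TYPE = {"fix": Bump.PATCH, "feature": Bump.MINOR}


def required_bump(changes):
    """Return the strongest bump for an iterable of (change_type, is_breaking) pairs."""
    bump = Bump.NONE
    for change_type, is_breaking in changes:
        if is_breaking:
            candidate = Bump.MAJOR
        else:
            candidate = BUMP_BY_TYPE.get(change_type, Bump.NONE)
        bump = max(bump, candidate)
    return bump


def next_version(current, bump):
    """Apply a bump to a 'MAJOR.MINOR.PATCH' version string."""
    major, minor, patch = (int(part) for part in current.split("."))
    if bump is Bump.MAJOR:
        return f"{major + 1}.0.0"
    if bump is Bump.MINOR:
        return f"{major}.{minor + 1}.0"
    if bump is Bump.PATCH:
        return f"{major}.{minor}.{patch + 1}"
    return current
```

For example, a history containing one fix and one non-breaking feature would move a hypothetical version `1.4.2` to `1.5.0`, while a history of documentation changes alone would leave it untouched.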
Efficient committing does not correlate with time; there is no point in committing every hour or at the end of the day. At the same time, if a change takes more time than your development session or consists of a few changes, you could likely decompose it. For example, “introduce REST API with stubs under the hood”, and “improve resilience with retries for network calls.”
It is inefficient to scan a history by eye when some changes are formulated in the passive voice and others in the active one. Some have information about the initial branch/task, others do not, or this information is added differently.
There should be one standard for every particular commit.
Small, Readable, Expressive, Normalized, Structured, Completed, and Unified. WOW! A lot of properties to satisfy, some of them fully on the developer’s shoulders, but others might be controlled independently.
Let’s see what practices can help us to satisfy the properties above.
I hope that some of you (who are already aware of Conventional Commits) will recognize that this specification covers the major part of the Ultimate Commit properties. For others, let me explain.
According to the site:
The Conventional Commits specification is a lightweight convention on top of commit messages. It provides an easy set of rules for creating an explicit commit history; which makes it easier to write automated tools on top of. This convention dovetails with SemVer, by describing the features, fixes, and breaking changes made in commit messages.
If a team follows the convention, their commits get the following properties of the Ultimate Commit: small (atomic), structured, completed, and partially unified.
Conventional Commits propose many standard types of commits, such as feat, fix, chore, refactor, docs, style, test, perf (performance), ci, build, and revert. I highly encourage you to read more about that practice on their site.
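To make the structure concrete, here is a minimal Python sketch that parses the first line of a commit message of the form `<type>(<scope>)!: <description>`, where the scope and the `!` breaking-change marker are optional. The regular expression and the names `Header` and `parse_header` are my own illustration. The sketch looks only at the header and ignores the body and footers (including a `BREAKING CHANGE:` footer), so treat it as a simplification rather than a full implementation of the specification.

```python
import re
from typing import NamedTuple, Optional

HEADER = re.compile(
    r"^(?P<type>[a-z]+)"           # type, e.g. feat or fix
    r"(?:\((?P<scope>[^()]+)\))?"  # optional scope in parentheses
    r"(?P<breaking>!)?"            # optional breaking-change marker
    r": (?P<description>.+)$"      # colon, space, and a non-empty description
)


class Header(NamedTuple):
    type: str
    scope: Optional[str]
    breaking: bool
    description: str


def parse_header(line: str) -> Optional[Header]:
    """Parse a commit summary line; return None if it does not follow the pattern."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    return Header(
        type=match["type"],
        scope=match["scope"],
        breaking=match["breaking"] is not None,
        description=match["description"],
    )
```

With this sketch, `fix: handle a corner case` yields a `fix` header without a scope, while `fix code review comments` is rejected because it has no type prefix.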
Usually, there is a natural tendency to write a commit summary in the passive voice, for example, “user API was introduced.” The problem with this approach is that the passive voice is harder to read because verbs might change drastically from their infinitive form, whereas the imperative mood (the bare infinitive in the present tense), as in “introduce user API”, keeps the summary simple.
Another motivation is that commit messages will be consistent with merge commits, which start with “Merge …”
This practice helps to unify and improve the readability of commits.
If you start with conventional commits, you must stop squashing your commits. Your history is now filled with relevant information you might use later, so do not lose it.
Commits in history become readable and expressive. They are small, unified, and structured so you can easily understand why some changes were introduced.
Identifying branches via ticket code makes everything simpler. First of all, you don’t need to think about the branch name at all. Secondly, you might add the branch name to the footer of a conventional commit, and if you later want to find all commits related to ticket XYZ-1234, you just search for this code in the git history. If the branch is named after the ticket, you save a lot of time.
Of course, you can sometimes make exceptions to this rule, but keep in mind that they break your development traceability.
The best part? Adding the branch name is easy to automate, although amended commits might be tricky. So, click “star” on this gist so you don’t lose the advanced git hook that adds a branch to the commit message.
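The core of such a hook can be sketched in a few lines of Python. This is not the gist mentioned above, only a minimal illustration under my own assumptions: ticket codes look like `XYZ-1234`, the footer token is `Refs`, and the function name `with_ticket_footer` is hypothetical. The check for an existing footer is what keeps amended commits from collecting duplicate lines.

```python
import re

# Assumption: ticket codes look like "XYZ-1234" and appear somewhere in the branch name.
TICKET = re.compile(r"[A-Z][A-Z0-9]*-\d+")


def with_ticket_footer(message: str, branch: str, token: str = "Refs") -> str:
    """Append '<token>: <ticket>' to the message if the branch name contains a ticket code."""
    found = TICKET.search(branch)
    if found is None:
        return message  # e.g. "main": nothing to add
    footer = f"{token}: {found.group(0)}"
    if footer in message.splitlines():
        return message  # already present, e.g. when amending a commit
    return f"{message.rstrip()}\n\n{footer}\n"
```

A hook script could read the message from the file whose path git passes as its first argument, take the branch name from `git rev-parse --abbrev-ref HEAD`, and write the result back. Searching the history for a ticket is then a matter of `git log --grep=XYZ-1234`.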
Your commit diff already explains what was changed. You could add more information with your commit message to explain “why.” Use commit history as a chat for communication with new developers (or yourself in a year) who do not understand why a particular line was added so they will read it and not bother you repeatedly. I gave a few examples above.
IDE plugins and command line tools might help you to structure your messages according to specifications.
Git Hooks might add a branch name to the end of every commit. There are several ways to implement git hooks, from manual git config update to build system plugin usage (
IDE plugins also might help you to validate your commit messages.
Git Hooks can help to test commit messages against specifications and other requirements.
Git Server Hooks help to protect against accidentally violated rules.
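A local check of this kind might look like the Python sketch below. The list of types is the one given earlier in this article; the 72-character limit, the blank-second-line rule, and the name `check_message` are assumptions for illustration that a team would replace with its own requirements.

```python
import re
from typing import List

ALLOWED_TYPES = ("feat", "fix", "chore", "refactor", "docs", "style",
                 "test", "perf", "ci", "build", "revert")
HEADER = re.compile(r"^(?P<type>[a-z]+)(?:\([^()]+\))?!?: \S.*$")
MAX_SUMMARY_LENGTH = 72  # assumption: a team-specific limit


def check_message(message: str) -> List[str]:
    """Return a list of problems found in a commit message (empty list means OK)."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["commit message is empty"]
    summary = lines[0]
    if summary.startswith("Merge "):
        return []  # leave git-generated merge commits alone
    problems = []
    match = HEADER.match(summary)
    if match is None:
        problems.append("summary does not look like '<type>(<scope>): <description>'")
    elif match["type"] not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match['type']}'")
    if len(summary) > MAX_SUMMARY_LENGTH:
        problems.append(f"summary is longer than {MAX_SUMMARY_LENGTH} characters")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line must be blank")
    return problems
```

Git runs the `commit-msg` hook with the path to the message file as its only argument and aborts the commit if the hook exits with a non-zero status, so a hook script only needs to print the problems and exit with 1 when the list is not empty. The same function could be reused in a server-side hook.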
Semantic Release is a great tool to compute the next semantic version of your artifact and publish it to the git server. Commitizen is also a popular one.
Most conventional commit-based tools are mentioned
The Ultimate Commits are not perfect because they require some effort, but eventually, you improve your code review, code comprehension, and, I’m not afraid to say, the culture of your development. Discuss the practices within your team, and pick the ones that fit your case best.