
Software Version Control for N00BS

by Cleuton Sampaio, February 14th, 2021



Configuration management

Configuration management (CM) is a software engineering process for establishing and maintaining consistency of a product's performance, functional, and physical attributes with its requirements, design, and operational information throughout its life.

What many are unaware of is that software version management also falls under configuration management.

Software configuration management is complex and difficult to implement, especially in large teams where programmers do not document anything and the entire process is done "ad hoc".

ad hoc

adjective /ˌæd ˈhɑk/, /ˌæd ˈhoʊk/ (from Latin): arranged or happening when necessary and not planned in advance. Examples: "an ad hoc meeting to deal with the problem"; "The meetings will be held on an ad hoc basis." (Oxford)

Objectives (or Why Should I Care?)

OK, imagine a scenario in which it is necessary to generate a new deliverable version of the software. How is this done in an "ad hoc" manner? There may even be a source code repository, but there is often a programmer who has the latest tested and approved version on their workstation. They then build the package and send it to the production environment.

Now, imagine that two or more programmers have worked on this same version of the software, which will be deployed at this time. What should be done? First, you need to make sure that the work of everyone involved is reflected in the source code that will be compiled to generate the executable package. How is it possible to be sure of this? And which developer has the most complete version? And have all the changes been tested separately and together?

The use of a software configuration management tool does not guarantee that the configuration management process is being performed.

One of the most common uses of SCM (Software Configuration Management) tools, such as Git, is backup! Yes, the source code is "stored" in the SCM repository and, if something happens to the developer's workstation, at least the code is saved.

Configuration management involves more than meets the eye.

The main objective of an SCM process is to maintain the traceability of the source code, according to the management of changes made to the software. In the end, it provides a single source of reliable versions of the software.

Through SCM it is possible to have several versions of the software being worked on by different people and still be able to build an executable with exactly the changes we want.

Branch

I do not want to attach myself to a specific SCM tool, because the concepts explained here are valid for all. But I will give you some examples about Git, which is open source and widely used today.

Every SCM tool has the ability to duplicate objects, allowing different versions of them to coexist. A specific set of objects, with a given version of code, can exist as a "branch".

Branching, in version control and software configuration management, is the duplication of an object under version control (such as a source code file or a directory tree). Each object can thereafter be modified separately and in parallel so that the objects become different. In this context the objects are called branches. The users of the version control system can branch any branch. (Wikipedia)

We can work on a branch of the source code and then push it to the central repository. Objects on one branch are never mixed with the same objects on other branches unless we explicitly merge them.

A branch contains a specific version of the objects, allowing us to work on them independently of other versions of the same objects.
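To make this independence concrete, here is a minimal sketch using Git in a throwaway directory. The branch name "feature", the file name "app.txt" and the identity values are just illustrative assumptions, not something the concepts require.

```shell
set -e
# Throwaway repository (hypothetical sandbox, safe to delete afterwards).
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Version 1 of the object lives on master.
echo "version 1" > app.txt
git add app.txt
git commit -q -m "Version 1 on master"

# Create a branch and change the same object there.
git checkout -q -b feature
echo "version 2 (feature work)" > app.txt
git commit -q -am "Change app.txt on feature"

# Each branch keeps its own version of app.txt.
git checkout -q master
on_master=$(cat app.txt)
git checkout -q feature
on_feature=$(cat app.txt)
echo "master:  $on_master"
echo "feature: $on_feature"
```

Switching branches swaps the content of the working directory, so the same file shows a different version depending on the branch you are positioned on.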

Master

Although there are problems with the terms "master" and "slave", we use the term "master" here only for its context in SCM.

Every branch has a parent, from which it was copied. But there is a special branch that has no parent, and is known as MASTER (or Trunk in other tools).

Branches are also known as trees, streams or codelines. The originating branch is sometimes called the parent branch, the upstream branch (or simply upstream, especially if the branches are maintained by different organizations or individuals), or the backing stream. Child branches are branches that have a parent; a branch without a parent is referred to as the trunk or the mainline. (Wikipedia)

Development / Stable Branches

There is some confusion between what would be a development branch and what would be a stable branch. Not everyone understands these terms the same way, so it is necessary to establish this policy in advance.

Development is the branch where the new version of the software is being built and maintained, and could become the major version soon.
Stable is the main and stable version of the software, which can only be bug-fixed. Improvements are always introduced in the development branch.

Some people prefer the MASTER branch to be used for development, always containing the latest version (cutting edge), which may still be under test. Other people prefer MASTER to be the stable branch, containing software that is the current and mature version of the application.

Many teams automatically build the code on MASTER every night, a process known as a nightly build. In this case, MASTER is the development branch, and there must be a specific branch for each stable version.

If you need to make a bugfix, which branch will you do it on? If the answer is MASTER, will you have to replicate this fix in branches from previous versions?

As you can see, the decision to use MASTER as the development branch has consequences...
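One way to replicate a fix made on MASTER into the branch of an earlier version is "git cherry-pick", which copies a single commit onto another branch. The sketch below is only an illustration of that option: the branch name "release-1.0" and the file names are assumptions of mine, and other teams may prefer a different technique.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Version 1.0 is released and gets its own stable branch.
echo "feature A" > app.txt
git add app.txt
git commit -q -m "Release 1.0 content"
git branch release-1.0

# Development continues on master (used here as the development branch).
echo "feature B" >> app.txt
git commit -q -am "Add feature B (unreleased)"

# A bug is fixed on master.
echo "fixed" > bugfix.txt
git add bugfix.txt
git commit -q -m "Fix bug"
fix=$(git rev-parse HEAD)

# Replicate only the fix into the stable branch of version 1.0.
git checkout -q release-1.0
git cherry-pick "$fix" > /dev/null
git log --oneline
```

After this, release-1.0 contains the fix but not the unreleased "feature B". The more stable versions you maintain, the more often this replication has to be repeated.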

Now, if you use MASTER as Stable, then a version is only incorporated into it when it passes the tests. You know that MASTER is always the current version. If it is necessary to make a bugfix, you can do it in MASTER and replicate it in the development branches if necessary.
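A minimal sketch of this flow, assuming a development branch called "develop" (a hypothetical name) and using "git merge" to carry the bugfix from MASTER into it:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

# master holds the current stable version.
echo "stable code" > app.txt
git add app.txt
git commit -q -m "Stable 1.0"

# New work happens on a development branch.
git checkout -q -b develop
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Work on the next version"

# The bugfix is made directly on master.
git checkout -q master
echo "stable code, bug fixed" > app.txt
git commit -q -am "Fix bug in stable version"

# Replicate the fix into the development branch.
git checkout -q develop
git merge -q --no-edit master
cat app.txt
```

The development branch now has both its new feature and the fix, while MASTER still contains only tested code.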

If you have a small team and a simple development process, then the alternative of using MASTER as Development is appropriate.

However, if you have a large team and many version developments in parallel, the best alternative would be to leave MASTER as Stable and create branches for each version in development.

In the case of Git we always have a remote repository and a local one. If you are going to work on a project, you will usually run "git clone", and you will be automatically positioned on the MASTER branch.

Let's say you want to create a new branch, such as "testing". You would type "git branch testing", creating a new branch that starts from the current state of MASTER, and then "git checkout testing" to work on that new version.

After committing your changes, you need to push the branch to the remote repository with the command "git push origin testing".

These examples can be seen in the Git documentation.
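The whole sequence can be tried without a real server. In the sketch below, a local bare repository stands in for the remote; the directory names ("seed", "remote.git", "work") and the file names are assumptions made only for this demonstration.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Prepare a stand-in "remote": seed one commit on master, publish it as a bare repo.
git -c init.defaultBranch=master init -q seed
cd seed
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed remote.git

# 1. Clone the remote: you are positioned on master.
git clone -q remote.git work
cd work
git config user.name "Example Dev"
git config user.email "dev@example.com"

# 2. Create the branch and switch to it.
git branch testing
git checkout -q testing

# 3. Commit a change on the new branch.
echo "experiment" > notes.txt
git add notes.txt
git commit -q -m "Add notes on testing branch"

# 4. Publish the branch to the remote repository.
git push -q origin testing

# The remote now lists both branches.
remote_heads=$(git ls-remote --heads origin)
echo "$remote_heads"
```

With a real project, only steps 1 to 4 are needed, replacing "remote.git" with the URL of your repository.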

Tags

A tag is not a branch. It is a specific version of the objects. The difference is that you cannot commit changes to Tags, but you can retrieve objects from a particular Tag.

Tags are used to mark software versions, for example, after a bugfix.

In Git we can create tags with the "git tag" command: 'git tag -a v1.4 -m "my version 1.4"'. Now this version of the objects has been marked as "v1.4" and we can push it to the remote repository with "git push origin v1.4".
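Here is a small sketch showing that a tag keeps pointing at the marked version even after new commits are made. It runs in a throwaway local repository, so the push step is left out; the file name "app.txt" is an assumption for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

# The version we want to mark.
echo "release 1.4" > app.txt
git add app.txt
git commit -q -m "Release 1.4"
git tag -a v1.4 -m "my version 1.4"

# Work continues after the tag.
echo "work toward 1.5" > app.txt
git commit -q -am "Start 1.5"

# List the tags and retrieve the file exactly as it was at v1.4.
git tag -l
tagged=$(git show v1.4:app.txt)
current=$(cat app.txt)
echo "at v1.4: $tagged"
echo "now:     $current"
```

The branch moved forward with the new commit, but the tag did not, which is exactly why tags are useful for marking released versions.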

Conflicts and merge

When more than one person is working on the same version of the software, simultaneous changes to the same version of an object can occur.

It is good practice, before starting your work, to check for changes made by others that may conflict with yours. The "git fetch" command downloads the new commits from the remote repository without touching your working files, so you can inspect them, and the "git pull" command fetches them and merges them into your current branch.
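The difference between the two commands can be seen in a small simulation with two clones of the same repository. The names "alice" and "bob", the local bare repository standing in for the remote, and the file name are all assumptions made for this sketch.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in remote with one commit on master.
git -c init.defaultBranch=master init -q seed
cd seed
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed remote.git

# Two developers clone the same remote.
git clone -q remote.git alice
git clone -q remote.git bob

# Bob commits and publishes a change.
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
echo "two" >> file.txt
git commit -q -am "Bob's change"
git push -q origin master
cd ..

# Alice fetches: the new commit is downloaded, her files are untouched.
cd alice
git fetch -q origin
incoming=$(git rev-list --count HEAD..origin/master)
echo "commits waiting on the remote: $incoming"
git log --oneline HEAD..origin/master

# Alice pulls: the new commit is merged into her branch.
git pull -q --ff-only origin master
after=$(git rev-list --count HEAD..origin/master)
echo "commits still waiting: $after"
```

Between the fetch and the pull, Alice can review what is coming with "git log" or "git diff" before anything is changed in her working directory.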

If there is a conflict, manual intervention is needed to decide what to do: keep the local version, replace it with the new version, or combine the two. Note that this occurs with each difference found in one or more objects.

Another example from the Git manual:

$ git merge iss53
Auto-merging index.html
CONFLICT (content): Merge conflict in index.html
Automatic merge failed; fix conflicts and then commit the result.

The "git merge" command tries to merge another branch (here, iss53) into your current branch, marking the conflicting regions in each affected file, for example:

<<<<<<< HEAD:index.html
<div id="footer">contact : email.support@github.com</div>
=======
<div id="footer">
 please contact us at support@github.com
</div>
>>>>>>> iss53:index.html

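Everything above the "=======" line is the version on your current branch (HEAD), and what is below it is the version from the branch being merged (iss53). It is your responsibility to edit the file to resolve the conflict, remove the markers, and run "git add" on each affected file to mark it as resolved, then commit the result.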
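The sketch below reproduces a conflict like this one from scratch and then resolves it. The footer texts and the choice of which wording to keep are assumptions for the demonstration; in real work that choice is a human decision.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo '<div id="footer">contact us</div>' > index.html
git add index.html
git commit -q -m "Initial footer"

# The same line is changed on a branch...
git checkout -q -b iss53
echo '<div id="footer">please contact us at support@example.com</div>' > index.html
git commit -q -am "Footer wording for iss53"

# ...and, differently, on master.
git checkout -q master
echo '<div id="footer">contact : support@example.com</div>' > index.html
git commit -q -am "Footer wording on master"

# The merge stops with a conflict; "|| true" keeps this script running.
git merge iss53 > /dev/null 2>&1 || true
conflicted=$(git diff --name-only --diff-filter=U | sort -u)
markers=$(grep -c '^<<<<<<< ' index.html)
echo "conflicted files: $conflicted (conflict blocks: $markers)"

# Resolve by writing the final content, then mark it resolved and commit.
echo '<div id="footer">please contact us at support@example.com</div>' > index.html
git add index.html
git commit -q -m "Merge iss53, resolving footer conflict"
git log --oneline -n 1
```

Until "git add" and "git commit" are run, the merge stays unfinished and Git keeps reporting the file as unmerged.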

Ideally, there should be no simultaneous changes to the same object, since they denote poor project management. But they can happen in the case of bugfixes or optimizations made without consulting teammates.

If conflicts happen frequently, it is better to start managing each individual task, improving the impact analysis and assigning responsibility for changes to each object to only one person. And NEVER EVER make changes without discussing them with the team.

Continuous Integration

Another advantage of an SCM process is that it enables Continuous Integration, a process that integrates the components of a software product, compiling it and running tests with each change made to the remote repository.

There are several tools for this, such as Jenkins and GitLab. This process is ideal because the software is compiled in a "clean room" environment, not on a developer's workstation.
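The "clean room" idea can be illustrated without any CI tool: the build works from a fresh clone, so only what was committed takes part in it. The script name "run_tests.sh" and the directory names below are hypothetical, and a real CI server does much more than this sketch.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# A developer's workstation repository with a trivial, hypothetical test script.
git -c init.defaultBranch=master init -q workstation
cd workstation
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "app" > app.txt
printf '#!/bin/sh\ntest -f app.txt\n' > run_tests.sh
git add app.txt run_tests.sh
git commit -q -m "App with a test script"

# A file that exists only on the workstation and was never committed.
echo "not committed" > local_only.txt
cd ..

# The "clean room": a fresh clone that contains only committed content.
git clone -q workstation build
cd build
if sh ./run_tests.sh; then result="passed"; else result="failed"; fi
echo "tests $result"
ls
```

The uncommitted file never reaches the build directory, which is how this kind of process exposes code that only works on one developer's machine.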

Cleuton Sampaio, M.Sc.