Why we need version control (featuring git)

03.04.2014 Jordan West

Version control is no longer a nice-to-have; it’s a necessity for any software development, regardless of team or project size. Yet it seems to have been neglected in the automation industry, with few tools providing any integration with a version control system (VCS). Some vendors, including Siemens, have provided proprietary VCSs with their tools, which is certainly a step in the right direction, but there is still a long way to go to catch up with the rest of the software world.

In this post I’ll go through the advantages of version control, using the highly popular VCS git as an example. Git was created by Linus Torvalds, the creator of Linux, and is today used as version control for millions[1] of projects, including the Linux kernel itself.

Travel back in time
In git, a ‘commit’ is considered the smallest unit of change to functionality. Developers should commit early and commit often. Changing the startup sequence and fixing an unrelated bug in the engine control? These should be two separate commits.
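As a rough sketch of what "two separate commits" looks like in practice, the commands below run in a throwaway repository. The file names and their contents are made up for illustration; only the git commands matter.

```shell
# Throwaway repository so the example is safe to run anywhere.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical project files, committed once as a starting point.
echo "step 1" > startup_sequence.txt
echo "rpm limit 5000" > engine_control.txt
git add .
git commit -q -m "Initial project state"

# Two unrelated edits made in the same working session...
echo "step 2" >> startup_sequence.txt
echo "rpm limit 4500" > engine_control.txt

# ...staged and committed separately, one commit per logical change.
git add startup_sequence.txt
git commit -q -m "Add second step to startup sequence"
git add engine_control.txt
git commit -q -m "Fix engine control rpm limit"

# One line per commit, newest first.
git log --oneline
```

Because each file is staged on its own with git add, each commit contains exactly one of the two changes.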

Every commit has a date and time associated with it, and hopefully a message from the author describing the changes it contains. This means that when something goes wrong, it’s possible to return the entire project to the state it was in at an earlier commit.
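Here is a minimal sketch of looking at the history and stepping back to an older state. The repository and the config.txt file are invented for the example.

```shell
# Throwaway repository with two versions of a hypothetical file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "version 1" > config.txt
git add config.txt
git commit -q -m "First version of config"
echo "version 2" > config.txt
git commit -q -a -m "Second version of config"

# Each entry shows the commit hash, author, date and message.
git log

# Check out the project exactly as it was one commit ago.
git checkout -q HEAD~1
old=$(cat config.txt)
echo "one commit ago: $old"

# Return to where we were before the checkout.
git checkout -q -
```

Nothing is lost by looking at an old commit this way; the newer commits are still in the repository and come back with the final checkout.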

When each commit is a unit of functionality change, it becomes far easier to trace a bug that was introduced. You can step back through the commits until you find the one that introduced the bug, which narrows your search to the changes made in that single commit.
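Git can also automate this search with its bisect command, which the post doesn't otherwise cover. The sketch below assumes a made-up repository of five commits where a marker line ("BROKEN") stands in for the bug, and a one-line check stands in for a real test.

```shell
# Throwaway repository with five commits; the pretend bug arrives in commit 4.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

for i in 1 2 3 4 5; do
  if [ "$i" -ge 4 ]; then
    echo "BROKEN build $i" > status.txt
  else
    echo "ok build $i" > status.txt
  fi
  git add status.txt
  git commit -q -m "Commit $i"
done

# Tell git the newest commit is bad and the very first commit is good.
first=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$first"

# git runs the check on each candidate: exit 0 means good, non-zero means bad.
git bisect run sh -c '! grep -q BROKEN status.txt'

# After the run, refs/bisect/bad points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "Bug introduced by: $culprit"

# Leave bisect mode and go back to the original branch.
git bisect reset
```

In a real project the check would be a build or test command rather than a grep.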

Git also provides a ‘blame’ feature, which lets you identify who last edited a particular line of code and in which commit. This isn’t only useful when blaming someone for writing dodgy code; it can also be helpful when you want to know who to ask about how a line of code works.
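A small sketch of blame in action follows. The two authors (Alice and Bob) and the control.txt file are hypothetical.

```shell
# Throwaway repository; Alice is the default author here.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice Example"
git config user.email "alice@example.com"

printf 'open valve\nstart pump\n' > control.txt
git add control.txt
git commit -q -m "Add control logic"

# A second author changes only the second line.
printf 'open valve\nstart pump slowly\n' > control.txt
git commit -q -a --author="Bob Example <bob@example.com>" -m "Slow down pump start"

# Every line, annotated with the commit, author and date that last touched it.
git blame control.txt

# Just line 2.
git blame -L 2,2 control.txt
```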

When you have multiple developers working on a single project at the same time, it can become a challenge to merge each team member’s changes into a single coherent project. Git handles most of this automatically: it works out exactly what each developer has changed and can even merge changes to separate parts of the same file. If two developers have modified the same part of the project, git provides a procedure for conflict resolution (that’s conflicts in the source code; conflicts between developers are a matter for HR).
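To make the "separate parts of the same file" case concrete, here is a sketch where two branches edit different lines of one file and git combines them. The branch name steve and the file sequence.txt are invented for the example.

```shell
# Throwaway repository with one seven-line file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf '1 start pump\n2 open valve\n3 wait\n4 check pressure\n5 close valve\n6 stop pump\n7 log result\n' > sequence.txt
git add sequence.txt
git commit -q -m "Initial sequence"

# Steve's branch changes the first line.
git checkout -q -b steve
sed '1s/.*/1 start pump slowly/' sequence.txt > tmp.txt && mv tmp.txt sequence.txt
git commit -q -a -m "Steve: soften pump start"

# Meanwhile, master changes the last line.
git checkout -q master
sed '7s/.*/7 log result to file/' sequence.txt > tmp.txt && mv tmp.txt sequence.txt
git commit -q -a -m "Me: log result to file"

# Both edits end up in the merged file, with no manual work.
git merge -q -m "Merge Steve's changes" steve
cat sequence.txt
```

If both branches had changed the same line, the merge would stop instead, mark the clashing section in the file with <<<<<<< and >>>>>>> markers, and wait for you to edit the file, git add it and commit.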

Distributed VCSs, including git and Mercurial, are designed with high availability in mind. No central server is required (although one is usually used) for teams to share their code changes. If you do have a central server, it’s no big deal when it goes down. Your team can share changes with each other over the LAN, via email, with USB keys, or using smoke signals.

Want to pull Steve’s updates? Once you’ve added Steve’s repository as a remote named ‘steve’, you can simply:

> git pull steve master
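The one-off setup behind that command can be sketched as follows. Steve's repository is simulated here with a local directory; in practice the path given to git remote add would be whatever URL or network path Steve shares, which is an assumption on my part.

```shell
work=$(mktemp -d)

# My repository, with one commit.
git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "shared file" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Steve's copy (a local directory standing in for his machine), with one new commit.
git clone -q "$work/mine" "$work/steve"
cd "$work/steve"
git config user.name "Steve Example"
git config user.email "steve@example.com"
echo "steve's update" >> notes.txt
git commit -q -a -m "Steve's update"

# Back in my repository: register Steve's repository under the name 'steve' (done once).
cd "$work/mine"
git remote add steve "$work/steve"
git remote -v

# From now on, this fetches and merges Steve's master branch.
git pull -q steve master
cat notes.txt
```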

Need to push your changes back up to the central server?

> git push origin master

In git, ‘origin’ is the default name given to the repository you originally cloned your code from.
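A quick sketch of where ‘origin’ comes from: the bare repository below stands in for the central server, and its path is made up for the example.

```shell
work=$(mktemp -d)

# A bare repository standing in for the central server.
git init -q --bare "$work/central.git"

# Cloning records where the code came from under the name 'origin'.
# (git warns that the repository is empty, which is expected here.)
git clone -q "$work/central.git" "$work/mine"
cd "$work/mine"
git remote -v

# Make a first commit on master and push it back up.
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "First commit"
git push -q origin master
```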

A version control system lets your project grow like a tree. Starting from a single point on the ground, branches grow out in all directions. However, version control also gives your tree a supernatural ability: you can merge its branches together.

Since everybody has their own repository, each developer can have their own set of private branches that they choose to share or not share with others. This can be useful, for example, if a developer wants to experiment with a few different ways of coding something without polluting the master repository.

It is common practice to have a branch for production code, and a branch for development. Changes are then merged from the development branch into production after being thoroughly tested and considered ‘stable’.
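A minimal sketch of that workflow follows. It assumes master plays the role of the production branch and that the other branch is called development; both names are my choice for the example.

```shell
# Throwaway repository; master is treated as the production branch here.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "stable code" > main.txt
git add main.txt
git commit -q -m "Production baseline"

# Create a development branch and do the new work there.
git checkout -q -b development
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add new feature"

# Production is untouched until the work is considered stable.
git checkout -q master
ls

# Once tested, bring the development work into production.
git merge -q development
git branch
ls
```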

Stash your changes away
Say you’re in the middle of developing a new feature and your project isn’t building, when suddenly you’re tasked with an urgent bug fix. No problem – just stash away your changes for later.

> git stash

Once you’ve finished the bug fix, committed your code and pushed it to the server, get your half-finished work back with:

> git stash pop
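The whole round trip, sketched in a scratch repository. The file names are invented, and the push step is left out because there is no server in this example.

```shell
# Throwaway repository with one committed file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "stable" > feature.txt
git add feature.txt
git commit -q -m "Stable feature"

# Half-finished work that is not ready to commit.
echo "half-finished" >> feature.txt

# Put it aside; the working directory goes back to the last commit.
git stash
git status --short

# Do the urgent fix and commit it.
echo "fix" > bugfix.txt
git add bugfix.txt
git commit -q -m "Urgent bug fix"

# Bring the half-finished work back on top of the fix.
git stash pop
cat feature.txt
```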

Automating with hooks
When somebody pushes code to the central server, you might want the code to be sent through your automated test suite to make sure they didn’t break the build. You can do this with git’s hooks. On the central server, you can edit the post-receive script in the repository’s hooks folder (project/.git/hooks for a normal repository, or the hooks folder at the top level of a bare repository) to execute commands whenever somebody does a git push. For more info on hooks, see the git docs (http://git-scm.com/docs/githooks.html).
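As a rough sketch, the script below sets up a bare repository as a stand-in for the central server and installs a post-receive hook that only writes a log line. The log file name push.log is my invention; a real hook would start your test suite at the marked spot.

```shell
work=$(mktemp -d)

# A bare repository standing in for the central server.
git init -q --bare "$work/central.git"
mkdir -p "$work/central.git/hooks"

# Install the hook. On stdin it gets one "old-rev new-rev ref-name" line per updated ref.
cat > "$work/central.git/hooks/post-receive" <<'EOF'
#!/bin/sh
while read oldrev newrev refname; do
  # A real setup would trigger the test suite here; this sketch only logs the push.
  echo "received $refname at $newrev" >> push.log
done
EOF
chmod +x "$work/central.git/hooks/post-receive"

# A developer clones, commits and pushes.
git clone -q "$work/central.git" "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "code" > main.txt
git add main.txt
git commit -q -m "Add code"
git push -q origin master

# The hook ran inside the bare repository, so its log is written there.
cat "$work/central.git/push.log"
```

The hook must be executable, which is why the chmod step matters; git silently skips hook files that aren't.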

The Problem
All of these benefits are great, but there remains a hurdle to taking advantage of them in the automation industry. The majority of tools available for developing SCADA and PLC software are not friendly to VCSs. In this industry, “source code” is often stored in proprietary binary formats, usually in a single file.

Few VCSs are particularly good at managing large binary files. Git is far better suited to text files, where it can determine the differences between versions and recognise new and modified lines of code. There are a few ways to work with large files in git, at the cost of some extra work to install plugins. However, you’ll still miss out on the merging, diffs and many other useful features of git that make it worth using in the first place.
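The difference is easy to see in a scratch repository. Below, logic.txt stands in for a text-based source file and project.bin for a proprietary binary project file; both are made up for the example.

```shell
# Throwaway repository with one text file and one binary file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'open valve\nstart pump\n' > logic.txt
printf 'a\000b' > project.bin          # the NUL byte makes git treat this file as binary
git add .
git commit -q -m "Initial files"

# Change both files.
printf 'open valve\nstart pump slowly\n' > logic.txt
printf 'a\000c' > project.bin

# For the text file git shows exactly which line changed...
git diff logic.txt

# ...for the binary file it can only say that the file differs.
git diff project.bin
```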

Unfortunately, until the automation industry catches up, we’re stuck using the proprietary built-in version control in the tools that support it. For everything else, manually managing copies of the code is the only option.

[1] https://github.com/blog/841-those-are-some-big-numbers