Home, Bangkok, Thailand, 2019-10-23

#software_engineering #devsecops

Here’s the versioning scheme I’ve used in my CI builds for many years:

Major.Minor.Revision.Build

All elements are numeric and are determined as follows:

  • Major - Chosen by author
  • Minor - Chosen by author
  • Revision - Derived from the VCS to indicate which commit the build was made from
  • Build - Incrementing number maintained by the CI tool to indicate which build run this was

Not that it’s rocket science or even particularly unique, but for ease of reference from here on I’m going to call this scheme modver - for module versioning.

modver conveys some pretty simple information: which revision (i.e. commit) the build is from (as a commit count, so effectively an age) and which build run produced it. Version numbers will always be unique, even if a build fails due to a build system fault, because the retry gets an incremented build number on top of the same revision number.
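To make the uniqueness point concrete, here is a minimal sketch of composing a modver string. The `modver` function name, the dot separator and the example numbers are my assumptions for illustration; in a real pipeline the revision would come from the VCS and the build number from your CI tool.

```shell
#!/bin/sh
# Compose a version string from the four numeric modver elements.
# Usage: modver MAJOR MINOR REVISION BUILD
# (function name and dot separator are assumptions for this sketch)
modver() {
    if [ "$#" -ne 4 ]; then
        echo "modver: expected 4 elements, got $#" >&2
        return 1
    fi
    for element in "$@"; do
        case "$element" in
            # Reject empty values and anything containing a non-digit.
            ''|*[!0-9]*)
                echo "modver: non-numeric element: '$element'" >&2
                return 1
                ;;
        esac
    done
    printf '%s.%s.%s.%s\n' "$1" "$2" "$3" "$4"
}

# Made-up values: build 87 failed on a build system fault, build 88 is the retry.
modver 2 5 412 87   # prints 2.5.412.87
modver 2 5 412 88   # prints 2.5.412.88
```

Both runs share revision 412, yet the two version strings differ because the CI tool incremented the build number.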

What about semver?

Although you can increment the major and minor elements according to the rules defined by semver, modver is not compatible with semver because the third component (revision) is automatically derived and is never reset to zero as required by semver:

Patch version MUST be reset to 0 when minor version is incremented.

semver also defines only major, minor and patch while I like to have the fourth element of “build”:

A normal version number MUST take the form X.Y.Z where X, Y, and Z are non-negative integers, and MUST NOT contain leading zeroes. X is the major version, Y is the minor version, and Z is the patch version.

I don’t care much for semver. In my view it attempts to encode so much information and nuance into three numbers as to be rarely useful at the component level. Just because minor has bumped doesn’t mean the part of the API that I use has changed, so I may have no issue. And just because you didn’t bump minor doesn’t mean I can trust you - maybe you made a mistake when updating your version number. I still need to do a full regression test when I integrate your new version. So in the end semver might be indicative, but it doesn’t really solve the problem of safely integrating third-party APIs.

Also often overlooked by those with whom I’ve discussed semver - it only applies to components with an API (broadly defined):

Software using Semantic Versioning MUST declare a public API. This API could be declared in the code itself or exist strictly in documentation. However it is done, it SHOULD be precise and comprehensive.

There are cases that fall outside this rule but where you still need versioning. For example, how about a batch processing job that has no API surface area of any type - semver is not applicable here. I also find its precedence rule for pre-release versions naïve - it sort-of-works in their example because “alpha” lexically sorts before “beta”, which sorts before “rc” - but that’s just a coincidence. What if you have “nightly” and “beta” where beta is definitely newer than nightly - semver will still rank nightly higher, simply because “n” sorts after “b”.
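A quick way to see how semver treats purely alphabetic pre-release identifiers is to sort them in ASCII order, which is the comparison its precedence rule prescribes for identifiers containing letters. This is only a sketch of that one comparison (numeric identifiers are compared numerically and are not covered here), and `prerelease_order` is a name I made up.

```shell
#!/bin/sh
# Print alphabetic pre-release identifiers from lowest to highest
# semver precedence, i.e. in plain ASCII order.
# (prerelease_order is a hypothetical helper name)
prerelease_order() {
    printf '%s\n' "$@" | LC_ALL=C sort
}

prerelease_order rc nightly beta alpha
# prints, one per line: alpha beta nightly rc
```

The resulting order depends only on how the labels are spelled, not on where each one sits in your release process.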

Deriving the modver Revision with Git

As I come across specific instances where I apply modver (e.g. I’ll be doing some CodeBuild builds using it soon) I’ll document those.

For now let’s see how you can derive revision numbers when using Git, which is a distributed VCS and does not naturally have a linear, numerically identified commit sequence like Subversion does.

Git identifies each commit with a SHA-1 hash, which is immutable. History can still be changed, however, e.g. by doing a force-push, which could add or remove past commits. Also, if you rebase then your commits get replayed, creating new hashes on the target branch, meaning the count from your feature branch can differ from the count on your target branch.

We can derive a commit number for a branch by simply counting the commits that lead up to HEAD of that branch. However, given the potential for Git history to change, that number could be unreliable.

We can rely on a count-derived revision number if:

  • We only ever release, and therefore derive versions, from one branch - ideally master. If you’re practicing trunk-based development then this is easy. If you’re using GitFlow (my condolences) then this would be your develop branch.
  • We don’t allow force-pushes to master. You shouldn’t allow this anyway, because a force-push that rewrites history changes the hashes of the rewritten commits, meaning your entire team’s clones are no longer valid and will require surgery.
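The first constraint can be checked in the build script itself. The sketch below is one possible guard, not something prescribed by modver: `is_release_branch` and the `RELEASE_BRANCH` variable are hypothetical names of mine. Note that many CI tools check out a detached HEAD, in which case `git rev-parse --abbrev-ref HEAD` reports `HEAD` and you would need to take the branch name from the CI tool instead.

```shell
#!/bin/sh
# Succeeds only when the given branch name is the release branch.
# RELEASE_BRANCH is a hypothetical override; it defaults to master.
is_release_branch() {
    [ "$1" = "${RELEASE_BRANCH:-master}" ]
}

# Ask Git for the current branch name; fall back to "unknown" if that fails.
current_branch=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo unknown)

if is_release_branch "$current_branch"; then
    echo "on release branch: safe to derive a revision"
else
    echo "on '$current_branch': not deriving a revision" >&2
fi
```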

Okay, so with those easy-to-meet constraints identified, we can now derive our numeric revision number as follows:

revision=$(git rev-list --count HEAD)
echo "$revision"

Or on older versions of Git without the --count switch:

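# No spaces around "=" in a shell assignment; tr strips the padding some wc implementations add
revision=$(git rev-list HEAD | wc -l | tr -d ' ')
echo "$revision"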
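If you don’t know which Git version your build agents run, the two variants can be folded into one helper. This is a sketch under my own assumptions: `derive_revision` is a made-up name, and it refuses to answer when there is no commit to count rather than reporting a misleading 0.

```shell
#!/bin/sh
# Print the number of commits reachable from HEAD.
# (derive_revision is a hypothetical helper name)
derive_revision() {
    # Bail out early outside a repository or before the first commit.
    if ! git rev-parse --verify --quiet HEAD >/dev/null 2>&1; then
        echo "derive_revision: no commits found (not a Git repository?)" >&2
        return 1
    fi
    # Prefer --count; fall back to counting lines on older Git versions.
    git rev-list --count HEAD 2>/dev/null ||
        git rev-list HEAD | wc -l | tr -d ' '
}

if revision=$(derive_revision); then
    echo "revision: $revision"
fi
```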

CI Implementations

As mentioned, as I work on future projects that use different CI tools, I’ll provide drop-in code snippets you can use to version artifacts using modver. In the coming days or weeks I’ll post the first of these, on implementing modver with AWS CodeBuild.

Update 2019-11-17: see this new post for implementing modver in AWS CodeBuild.