Semantic Versioning — A Heuristic for Expertise

January 24, 2018

I'm going to go over what Semantic versioning is, why its useful for software, but mostly, I'll be describing how it can be used as a heuristic and one of the many signals to key in on when interviewing other software engineers.

Why should I care? Because...

I find it hard to accurately determine the expertise of another engineer and I believe it is hard for you too. To do so, to really know how good of an engineer someone is, personally takes me multiple interactions, collaborating and communicating on multiple projects with them. It takes a very long time for me to honestly know how deep their expertise is. Not to mention, to even understand another’s expertise, I would have to be a decent judge of my own skill, which is known to be hard . Basically, to determine the skill of another engineer, I have to exert a non-trivial amount of effort.

Therefore, I find it cheaper and less resource intensive to instead rely on heuristics. Where heuristics, in this context of estimating people's skill, is really just another way of saying I'm biased and judgmental.

Now I know your thinking, "that sounds bad, do you even understand the implications of what your saying!". Well of course I do. I know it sounds insensitive. And I also know that when someone settles on the wrong kinds of biases, they become a poor judge of rank and at the very worst, a racist. And I wouldn't want that kind of thought directed at me either. However, what I'm actually implying, is something a lot less awful than it sounds. So let me define the types of heuristics I use, before we go on.

N is strictly greater then 1

First up, I always use multiple heuristics, emphasis on the plural. I can't, and will not, rely on just one heuristic. I find that the usefulness of multiple heuristics, arrises in every domain. For example, every parent with multiple children will probably say, “My kids are nothing like each other. I had to learn how to raise each of them differently”. And what are engineers but just grown up kids with an accumulated knowledge of the world? Engineers tend to take different approaches, use varing mental models, and approach tasks in their own, unique ways.

every model is wrong but some are useful — George Box

Again, the usefulness in being able to leverage multiple heuristics also parallels the common advice of: “Use the right tool for the right job”. For example, during development, we might use a shell script for simple task automation, but to implement a backend service in the same approach would be insane, Instead we should probably use a better tool, i.e. a language like Go.

Therefore when one finds that a particular heuristic works for one group of engineers, they should not be surprised to find out it doesn’t work so well on another group. Hence, I use multiple heuristics with varying levels of effectiveness depending on the engineers and the environments we find ourselves in.

In the interest of brevity and clarity i'll talk about just one of my heuristics. Eventually, I might find time to write others down, but for now, I think this heuristic is important enough to focus on it alone. For I find it works well for a rather large subset of software engineers which I frequently encounter. Namely, engineers who write libraries and packages.

Here I am using library or package to mean software used within other software. For concrete examples, think of the all the engineers who write React.js, Protobuf clients, SDK's, internal tools, etc. These engineers are sometimes given the title of Platform Engineer, or report into Developer Experience. In general I'm referring to engineers who write reusable code which can then be further leveraged in software written by yet more engineers.

Now with that setup in place I can explain the heuristic:

An engineer's skill level can be determined by how explicit they are in describing not only the benefits of Semantic Versioning but also how to achieve it in practice.

If your not familiar with Semantic Versioning or SemVer for short, than it your probably like “huh, your gonna have to explain.” However if you already know about SemVer and you don't understand it, then your probably thinking, "Well...hey, I'm a great engineer, I know about SemVer but I don't find it all that useful”. Well in that case you're either not in the subset of engineer's which I am referring to, or hopefully I can show you SemVer’s intrinsic value .

To get us all on the same page, SemVer simply provides a consistent language to talk about changes in software. At the simplest level, SemVer makes your version numbers indicate more than just commits, and assigns semantics to the changes in your software. It does this by having different increases in the version number correspond to semantically different changes in the contract, or API, of the code. The result is software which can change without affecting people unexpectedly.

As a simple example, here is a version number, v1.4.8 which follows the standard conventions. It has three parts each split by a period, MAJOR.MINOR.PATCH :

The MAJOR part is increased when you make incompatible API changes.
The Minor is increased when you add functionality which is backwards-compatible.
And the Patch part is increased when you make backwards-compatible updates.

This way of describing SemVer, is how it's listed in the official looking document , but I have seen other slight variations. For example, you might see a label or release appended on to either side, or the changes and version number might be pegged to time. For example, Ubuntu uses versioning, like so. Given, v2.17.04,

v2 would be the major version.
17 is the release year.
And 04 is the release month.

In writing, you also sometimes see the version written with a "v" prefix, as in the previous examples, however in practice, it's often omitted if the context is clear.

The previous description of SemVer is rather brief. If we were to completely cover SemVer and the various ways it is used, this section would get dense quickly. It is best to just know that SemVer is a tacit convention, and not always strictly followed but is a good ideal to strive for given that most people tend to dislike it when you ignore semantic versioning.

What I personally find valuable is in how if I use packages or libraries following the conventions in SemVer. I can assume the following:

For stable software if a specific version works, anything newer within the current major version will also work. For example, if 3.1 works, everything in the 3.X range, where X ≥ 1, will also work.
For unstable software , if a specific version works, anything newer within the current minor version range will also work. For example, If 0.3.1 works, everything in the 0.3.X range, where X ≥ 1, will also work.

With these guarantees in place, I can consume libraries safely, I have a shared language between other developers, and I can better estimate the expected level of effort if I am wanting to upgrade.

So you're probably thinking, “So what? If I have an engineer describe their understanding of SemVer, it is just one tiny aspect of software engineering . Outside of their direct answer what other details of their skill and work can I possibly infer?”

Well here, it's important to remember that SemVer doesn't just provide guarantees. It also helps shape the way we as engineers think and develop software. See, there's this rule of thumb called Hyrum's Law, named after the Google engineer, Hyrum Wright. The shortened rule states:

“With a sufficient number of users, ...all observable behaviors of your system will be depended on by somebody.”
Hyrum Wright

So for example, let’s say you ship a product. You claim it does one thing, when in reality your users discover it can do another. This quirk they discover is definitely not something you intended or even documented. The consequence is now users are going to use this quirk and they will depend on it so much, it might as well be considered a feature.

So how does Hyrum's law relate to SemVer? Well because following SemVer forces you to think in ways to avoid backwards incompatibility. And one way to avoid backwards incompatibility is to have a small changeable surface area to begin with. So if you structure your codebase with less public things to change, a nice by-product will be a codebase with a smaller area for undefined behavior and a smaller chance it will exhibit Hyrum's law in action.

Now going back to why as a heuristic for expertise, I ask engineers to describe SemVer. For if they can, I perceive it as strong signal the engineer can understand so much more:

How backwards compatibility correlates with dependable and reliable software.
How and why undefined behavior and unnecessarily large API's are to be avoided.
And how their software will evolve and live in various contexts over time, such as a partially upgraded ecosystem.

If the engineer can recall examples or describe how SemVer is put into practice for any particular language, I also receive the following signals:

They are more likely to leverage ancillary tools for engineering, e.g. changelog templates, dependency management tools, etc.
They are more likely to have written or supported software that provided tangible value. As an aside, if you write software that is valued and depended on by many others, this alone is a strong signal.
They build api's with minimal surface area, and can reason and articulate the impact on software when given demands to change functionality.
They have dived deep into a specific language paradigm’s which would allow them to write robust and flexible software.
And they will have an awareness of how software will be interacted with, outside of the initially thought of use cases.

Trying to find out this kind of information in other ways, say from an experience working on a project with that engineer, requires way more time. And flat out asking about this kind of information is hard for me to phrase just right.

So instead, I use SemVer and the related ideas I just mentioned as a jumping off point. Using their answers as heuristic to determine far more about the engineer than I could otherwise. Personally, using SemVer in this way during conversation, short cuts a lot of the time and energy required for me to learn about another engineer’s expertise.

Hopefully now you can appreciate what SemVer potentially provides: A structure for your software as it changes; A method to fight off Hyrum's law; And an approach to improve your API’s. And most importantly, you should now be able to trust your judgement of another engineer’s expertise based on how well they understand and describe SemVer in their work.