Semantic Versioning and Continuous Integration

About a year ago, at the Bndtools Hackathon, we implemented support for semantic versioning. In a previous article I already explained the basics of how this works, but during that year, I’ve learned a thing or two that I want to share.

Update: I mixed up the consumer and provider type annotations in my story. That has been corrected now.

Consumer and Provider types

An API usually consists of interfaces, to abstract away from any concrete implementation. The API is what’s exported and versioned. Semantic versioning defines what each part of the version means. In the context of an API, a change is either “major”, meaning backward incompatible (like changing the signature of a method), or “minor”, meaning backward compatible (like adding a new method). However, in the real world things are never that simple.

Let’s take EventAdmin as an example. If you want to use that API to send an event, you consume the EventAdmin service and invoke a method on it. That is actually a pretty common use case. If we add a new method to EventAdmin, this is a minor change that is backward compatible for existing users: the interface is “provided” to them, so adding a method does not change anything for them.

However, when you want to listen to events, you need to implement EventHandler and register it as a service in the service registry. If we added a method to EventHandler, it would suddenly not be a backward compatible change, because in Java (prior to Java 8 and its default implementations) you must implement all methods of an interface. In other words, existing users implement this interface themselves as “consumers” of the API, and adding a method breaks their implementations.

So, when designing an API, we must somehow express that an interface is either a “provider” or a “consumer” type. For this, Bndtools added two annotations: @ProviderType and @ConsumerType. When you annotate your interfaces like this, semantic versioning takes this into account and correctly bumps either the minor version (provider type) or the major version (consumer type) when you add a method to a type.
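To make the distinction concrete, here is a minimal sketch in Java. Everything in it is an assumption made for illustration: MessageBus, MessageListener and InMemoryBus are hypothetical types loosely modelled on EventAdmin and EventHandler, and the two annotations are declared locally as stand-ins so the example compiles without bnd on the class path. In a real workspace you would use the annotations that ship with bnd instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

class ProviderConsumerSketch {

    // Local stand-ins (assumption): replace these with the annotations from bnd.
    // CLASS retention keeps them in the bytecode, where a bytecode-based tool can see them.
    @Retention(RetentionPolicy.CLASS) @Target(ElementType.TYPE)
    @interface ProviderType {}

    @Retention(RetentionPolicy.CLASS) @Target(ElementType.TYPE)
    @interface ConsumerType {}

    // Users of the API only call this interface, so a new method is a minor change.
    @ProviderType
    interface MessageBus {
        void send(String topic, String payload);
    }

    // Users of the API implement this interface, so a new method is a major change.
    @ConsumerType
    interface MessageListener {
        void onMessage(String topic, String payload);
    }

    // Provider side: the only party that has to implement MessageBus.
    static final class InMemoryBus implements MessageBus {
        private final List<MessageListener> listeners = new ArrayList<>();

        void register(MessageListener listener) {
            listeners.add(listener);
        }

        @Override
        public void send(String topic, String payload) {
            for (MessageListener listener : listeners) {
                listener.onMessage(topic, payload);
            }
        }
    }

    public static void main(String[] args) {
        InMemoryBus bus = new InMemoryBus();
        // Consumer side: implements MessageListener and calls MessageBus.
        bus.register((topic, payload) -> System.out.println(topic + ": " + payload));
        bus.send("greeting", "hello");
    }
}
```

Adding a second method to MessageBus would only force InMemoryBus to change, whereas adding one to MessageListener would break every listener written by users of the API.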

To leverage this feature, you must put the annotations on your build path. If you want to do that for your whole workspace, simply add it to cnf/bnd.bnd like this:

-buildpath: biz.aQute.bnd;version=2.2.0

And in the build path of each project, inherit from this build file like this:

-buildpath: ${^-buildpath}, ...

Note that if you have not applied these annotations to your interfaces yet, Bndtools assumes a “worst case scenario” and marks everything as @ConsumerType. Adding annotations afterwards means you’re changing your interfaces, so keep that in mind when preparing for semantic versioning. We forgot to do this in Apache ACE for the 1.0.0 release and had to fork that version to add the proper annotations after the fact.

Building in and outside of Eclipse

As explained in the earlier article, the baselining feature of Bndtools is based on bytecode analysis. If you are experienced in Java development, this statement will surely trigger an alarm bell, since the actual bytecode that is generated by the compiler can differ based on various debugging and optimization settings of the compiler, and even on the version of that compiler. This does not affect Bndtools’ ability to analyze changes in interfaces, but it does affect its ability to detect “bugfixes” or, more generally, “changes” in bytecode. If you use different compilers, you end up with a lot of “false positives” where your bytecode changes even though your source code remained the same. To prevent that, you need to make sure of a couple of things:

  1. You must make sure that you always use the same compiler version and settings to build the code. Since Eclipse always uses its own, internal compiler (called ECJ), your best bet is to use that compiler for your off-line build as well (instead of javac). Also, you must make sure its settings are the same. Something to be aware of here is that the default settings for ECJ inside Eclipse are different from those when you run ECJ outside of Eclipse. My advice here is to:
    • Make sure that all projects in the workspace use the workspace default settings for their compiler and JRE and that the same JRE is used on your build server.
    • Set up the workspace to use a specific Java version, and in the Java/Compiler tab under “Classfile Generation” disable the “Preserve unused (never read) local variables” option to ensure the same settings are used inside and outside of Eclipse.
    • Ensure that your off-line build uses the same ECJ as Eclipse does (see below) and make sure that all developers in your team use the same Eclipse version.
  2. You must ensure that the repository that you baseline against is always available. If, for some reason, it is not, you end up with a workspace that becomes impossible to use because nothing compiles anymore. How you solve this depends on your faith in on-line repositories, but one way to solve this is to release all your latest versions of bundles into a local repository that you commit to your revision control system. That way, your build is “self contained”.

Setting up ECJ

To install ECJ, you need to ensure you use the same version as Eclipse. That can be done by running the Ant task below from within Eclipse, making sure not to fork a new process:

<!-- ECJ Compiler support, add to cnf/build-template.xml and invoke
      the "install-ecj" target. -->
 <target name="install-ecj"
         depends="init, install-ecj-eclipse, install-ecj-noeclipse" />

 <target name="install-ecj-eclipse" if="eclipse.home">
  <echo message="Eclipse is installed at ${eclipse.home}." />
  <mkdir dir="${workspacedir}/cnf/ecj" />
  <copy tofile="${workspacedir}/cnf/ecj/ecj.jar">
   <fileset dir="${eclipse.home}/plugins">
    <include name="org.eclipse.jdt.core_*.jar" />
   </fileset>
  </copy>
  <unzip src="${workspacedir}/cnf/ecj/ecj.jar"
         dest="${workspacedir}/cnf/ecj">
   <patternset>
    <include name="jdtCompilerAdapter.jar" />
   </patternset>
  </unzip>
 </target>

 <target name="install-ecj-noeclipse" unless="eclipse.home">
  <echo message="Please run this ant task from within Eclipse
                 and make sure you're using the 'same JRE as
                 the workspace' setting." />
 </target>

After running this task, I recommend committing the cnf/ecj folder to your revision control system. Next, you need to set up the build to actually use ECJ. You can do this by modifying the compile target:

 <target name="compile" depends="dependencies" if="project.sourcepath">
  <mkdir dir="${project.output}"/>
  <componentdef name="ecj"
   classname="org.eclipse.jdt.core.JDTCompilerAdapter"
   classpath="${workspacedir}/cnf/ecj/ecj.jar
              :${workspacedir}/cnf/ecj/jdtCompilerAdapter.jar" />
  <javac fork="yes" executable="${javac}" srcdir="${project.sourcepath}"
   destdir="${project.output}" classpath="${project.buildpath}"
   deprecation="true" listfiles="true" target="${javac.target}"
   source="${javac.source}" debug="${javac.debug}"
   includeAntRuntime="no" verbose="${verbose}">
   <ecj />
  </javac>
  <copy todir="${project.output}" verbose="${verbose}"
   preservelastmodified="true">
   <fileset dir="${project.sourcepath}">
    <exclude name="**/*.java" />
    <exclude name="**/*.class" />
   </fileset>
  </copy>
 </target>

That should be enough to use ECJ in your off-line build. If you want to study a working example of this setup, take a look at the Apache ACE build, which uses ECJ and a local release repository to baseline against.

Versioning build server artifacts: snapshot or timestamp or?

Once you’ve actually set up semantic versioning in your IDE and continuous build, and want to start deploying such builds to testing and QA environments, you bump into the next issue: how do I version my build server artifacts? With baselining enabled, the latest released versions are in your release repository, and that is what you baseline against. As soon as you start making changes, you will be forced to bump the version of certain packages and bundles, so assuming your latest release of a bundle was 1.0.0, the version in your “master” or “trunk” repository might be 1.0.1, 1.1.0 or even 2.0.0. However, you cannot simply release that bundle with that version, for two reasons:

  1. The version in your master repository only becomes the release version if you actually release the bundle at that point in time. Whilst in development, the only thing you know is that the bundle should get a “newer” version than 1.0.0 (unless you revert all changes), but it’s impossible to predict what version it will become. Also, until you release it, you don’t want to build anything with that version as that would create ambiguity: suddenly there can be more than one different bundle with the same version. Not something you want.
  2. Every time something changes in that bundle, you want to give it a new version, so you cannot just use “any” fixed version that is present in master. You want to give it a version that changes if and only if something has actually changed.

Let’s look at two popular ways of solving this issue and their “problems”: snapshots and timestamps. Let’s keep our example bundle, that was released as 1.0.0 and assume that we’ve made a minor change in master.

Snapshots

Snapshots are widely used in the Maven ecosystem. As soon as you release a version with Maven, you bump the version in master and add “-SNAPSHOT” as a qualifier. In our case, this would then be “1.1.0-SNAPSHOT” or, to make the version compatible with OSGi “1.1.0.SNAPSHOT”. Let’s now see what problems exist with this approach:

  1. If, eventually, we want to release this bundle as 1.1.0, we have an ordering issue as 1.1.0.SNAPSHOT > 1.1.0 in OSGi. This can be blamed on the difference between versioning in OSGi and in Maven, but it is one reason why this approach is not very practical.
  2. As soon as we’ve released one snapshot and we change our code in master again, we want to create a new, different version. If we don’t do that, versions become meaningless, as we can no longer assume that if a version is the same, the contents of the bundle is the same. Taken to the extreme, this means that we should always update all our bundles, just to be sure, and as soon as we do that we might just as well package everything in one big JAR file as we’ve lost one important aspect of modularity: updating only what has changed.
  3. Even if nothing changes in the bundle, we have already changed its version to 1.1.0.SNAPSHOT. We could solve this one by not changing the version until we actually make a change, but that still leaves the first two issues.

Timestamps

A slightly different approach from using snapshots is using timestamps. In a way it is similar to using snapshots, only this time around, instead of using a static qualifier, we use a qualifier that contains a (text sortable) timestamp. Going back to our case, this means we end up with “1.1.0.${tstamp}” which expands to something like “1.1.0.20140422140100”. Again, let’s explore the problems:

  1. We still have the ordering issue, as 1.1.0.20140401083000 > 1.1.0 in OSGi.
  2. The second problem we had with snapshots is reversed. We now have different versions even if nothing changed. Above I already explained why this is bad.
  3. Even if nothing changes, we changed its version. Like with snapshots, this could be solved by only changing the version when something’s changed.

Continuous delivery versioning

So we’ve seen how using snapshots and timestamps does not work. Let’s move on to a solution that does work, which is to create “special” versions for bundles that are created by a continuous build system. Let’s go back to our example and explain how to correctly version bundles:

  1. At the end of the build, we baseline the resulting bundle against the release repository. If it is unchanged, its version in master would still be 1.0.0, and we simply skip it since it has not changed in this build.
  2. If the bundle has changed, we baseline it against the snapshot repository, which contains zero or more “snapshot” versions of our bundle. If it is the same as the latest snapshot version, we skip it, as it has not changed in this build.
  3. If the bundle differs from the latest snapshot as well (or no snapshot exists yet), we generate a new snapshot version as follows:
    • We take the latest released version as our starting point. We must do that since any bump in major, minor or micro means that we cannot later use that version as a release anymore. So in our example, we start with “1.0.0”.
    • We append a qualifier that starts with an arbitrary string “CDS” (for continuous delivery system) followed by an index that is increased every time, so “CDS0001” and so on.
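The three steps above can be sketched as a small decision function. This is only an illustration under stated assumptions: the class CdsVersioning and its nextVersion method are hypothetical names, the two boolean inputs stand for the results of baselining against the release and snapshot repositories (which the sketch does not perform itself), and the four-digit padding is inferred from the “CDS0001” example.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper; the names are made up for this sketch.
final class CdsVersioning {
    private CdsVersioning() {}

    /**
     * Decides which version, if any, a freshly built bundle should get.
     *
     * @param latestRelease        the latest released version, for example "1.0.0"
     * @param sameAsRelease        true if baselining against the release repository found no change
     * @param sameAsLatestSnapshot true if baselining against the newest snapshot found no change
     *                             (false when no snapshot exists yet)
     * @param lastIndex            index of the newest snapshot, or 0 if there is none
     * @return the new snapshot version, or empty if the bundle should be skipped
     */
    static Optional<String> nextVersion(String latestRelease, boolean sameAsRelease,
                                        boolean sameAsLatestSnapshot, int lastIndex) {
        if (sameAsRelease) {
            return Optional.empty(); // step 1: identical to the release, skip
        }
        if (sameAsLatestSnapshot) {
            return Optional.empty(); // step 2: identical to the newest snapshot, skip
        }
        // Step 3: start from the latest release and append the next "CDS" index.
        // Zero padding keeps the qualifiers sortable as plain strings.
        return Optional.of(String.format(Locale.ROOT, "%s.CDS%04d", latestRelease, lastIndex + 1));
    }

    public static void main(String[] args) {
        System.out.println(nextVersion("1.0.0", true, false, 0));
        System.out.println(nextVersion("1.0.0", false, true, 3));
        System.out.println(nextVersion("1.0.0", false, false, 3));
    }
}
```

Only the last call in main produces a version, 1.0.0.CDS0004. Note that with four digits the string ordering of the qualifier stops matching the numeric ordering once the index passes 9999, so pick a width that comfortably covers the number of builds you expect between releases.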

This solves all problems we had:

  1. The ordering issue is solved, since 1.0.0 < 1.0.0.CDS0001 < 1.0.1 (or any upcoming release).
  2. The bundle version only changes when its content changes.
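The ordering claims in this article can be checked with a few lines of Java. The class below is a simplified stand-in that I wrote for this purpose, not the OSGi framework's own Version class, and the name OsgiOrderSketch is hypothetical. It assumes the OSGi rules that major, minor and micro are compared numerically, that the qualifier is compared as a plain string, and that a missing qualifier counts as the empty string.

```java
// Simplified stand-in for OSGi version ordering (assumption: input is
// "major[.minor[.micro[.qualifier]]]" with numeric major, minor and micro).
final class OsgiOrderSketch {
    private OsgiOrderSketch() {}

    /** Returns a negative, zero or positive number, like Comparable.compareTo. */
    static int compare(String left, String right) {
        String[] a = left.split("\\.", 4);
        String[] b = right.split("\\.", 4);
        for (int i = 0; i < 3; i++) {
            int result = Integer.compare(number(a, i), number(b, i));
            if (result != 0) {
                return result;
            }
        }
        // The empty qualifier sorts before any non-empty qualifier.
        return qualifier(a).compareTo(qualifier(b));
    }

    private static int number(String[] parts, int index) {
        return index < parts.length ? Integer.parseInt(parts[index]) : 0;
    }

    private static String qualifier(String[] parts) {
        return parts.length > 3 ? parts[3] : "";
    }

    public static void main(String[] args) {
        // Snapshot and timestamp qualifiers sort after the release they should precede.
        System.out.println(compare("1.1.0.SNAPSHOT", "1.1.0") > 0);
        System.out.println(compare("1.1.0.20140401083000", "1.1.0") > 0);
        // CDS versions sit between the last release and any upcoming one.
        System.out.println(compare("1.0.0", "1.0.0.CDS0001") < 0);
        System.out.println(compare("1.0.0.CDS0001", "1.0.1") < 0);
    }
}
```

Every comparison in main evaluates to true, which is exactly why a qualifier appended to the next version causes trouble while a qualifier appended to the previous release does not.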

Wrapping things up

We covered setting up semantic versioning, making sure we mark our APIs correctly, configuring projects and the ECJ compiler so baselining works both in Eclipse and outside of it. We also covered setting up a continuous build system to version “snapshot” bundles correctly. Whilst the whole setup is not trivial, the benefits of having semantic versioning are big enough to make it worthwhile. If you have any further questions, please get in touch!
