Continuous Integration is a great idea, and usually pretty simple to implement for simple projects. However, these simple projects don’t really exercise the “integration” aspect of the idea. As the build and test process for a project grows in complexity, it almost always grows in duration, too. Typical enterprise Java projects, for example, might fetch dependencies from Maven repositories, compile several code modules, copy, move and transform various resources, run unit tests, assemble and deploy jar files, start servers and run integration tests, and so on. All of this can take quite a while, even on a fast build server.
(cartoon from the great xkcd)
One big problem with growing build times is the effect on feedback. If a developer has to wait 10, 20, 30 minutes or more for a build cycle to complete before test results are available, it usually leads to one of three outcomes:
- Every small change requires a concentration-breaking delay to see if it works before moving on to the next change. Development slows to a crawl, management cracks the whip and tries to ban casual web surfing, private email and Facebook.
- Developers give up waiting for the CI results and press on with development anyway. The code base fills with bugs and issues. The CI process becomes largely irrelevant, as builds are almost always broken.
- Developers hold off from checking in small code changes for fear of having to sit and wait for CI to catch up. As check-in size increases, so does the frequency of code clashes and the difficulty of merging different strands of work. Team culture shifts from collective ownership to silos and hoarding.
What’s needed is a way to get fast feedback, even when a full build takes a long time. Almost every team I have worked with in recent years has tried to achieve this, usually using the open source “Jenkins” (or its fork-parent “Hudson”) build server. So far this has never quite worked.
The main problem seems to be the monolithic nature of a Jenkins build. A build runs to completion (or to a fatal failure), accumulating build data and test results. Data and results are only available at the end. A more useful approach would make build data and test results available as soon as possible, even while further build activity continues. Better still would be a way of adapting the build process to emphasise early feedback, preferring build steps which give feedback to those which are merely useful for further processing. That way a trivial compilation error or test failure in a stand-alone part of the code might give almost immediate feedback. This is useful not only because of the speed of feedback, but because of the effect it has on development habits. The fastest feedback would come from code with less coupling and fewer dependencies: any developer wishing to progress more quickly would be automatically encouraged to write (or refactor towards) small, loosely-coupled, independent, well unit-tested, re-usable code.
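To make the idea concrete, here is a minimal sketch in Python of a “feedback-first” scheduler. It is not how Jenkins or any other real build server works; the `Step` class, the `gives_feedback` flag and the step names are all hypothetical, invented for illustration. Among the steps whose dependencies are satisfied, it prefers those that give feedback, and it reports each result the moment the step finishes rather than at the end of the build.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Tuple


@dataclass
class Step:
    """A hypothetical build step; none of these names come from a real tool."""
    name: str
    run: Callable[[], bool]        # returns True on success
    gives_feedback: bool = False   # e.g. compile and test steps
    needs: List[str] = field(default_factory=list)


def run_feedback_first(steps: List[Step]) -> Iterator[Tuple[str, str]]:
    """Yield (step name, status) as soon as each step finishes."""
    by_name = {s.name: s for s in steps}
    position = {s.name: i for i, s in enumerate(steps)}
    waiting_on = {s.name: set(s.needs) for s in steps}
    dependents = {s.name: [] for s in steps}
    for s in steps:
        for dep in s.needs:
            dependents[dep].append(s.name)

    def priority(name: str) -> Tuple[bool, int, str]:
        # False sorts before True, so feedback-giving steps come first;
        # ties fall back to the order in which steps were declared.
        return (not by_name[name].gives_feedback, position[name], name)

    ready = [priority(s.name) for s in steps if not s.needs]
    heapq.heapify(ready)
    finished = set()

    while ready:
        _, _, name = heapq.heappop(ready)
        ok = by_name[name].run()
        finished.add(name)
        yield name, "passed" if ok else "failed"
        if not ok:
            continue  # dependents stay locked; unrelated steps still run
        for child in dependents[name]:
            waiting_on[child].discard(name)
            if not waiting_on[child]:
                heapq.heappush(ready, priority(child))

    # Anything never unlocked was blocked by a failed dependency.
    for s in steps:
        if s.name not in finished:
            yield s.name, "skipped"


def succeed() -> bool:
    return True


def fail() -> bool:
    return False


demo_steps = [
    Step("fetch-deps", succeed),
    Step("compile-core", succeed, gives_feedback=True, needs=["fetch-deps"]),
    Step("package-jar", succeed, needs=["compile-core"]),
    Step("unit-test-core", fail, gives_feedback=True, needs=["compile-core"]),
    Step("integration-test", succeed, gives_feedback=True, needs=["package-jar"]),
    Step("compile-util", succeed, gives_feedback=True),  # stand-alone module
]

results = []
for step_name, status in run_feedback_first(demo_steps):
    results.append((step_name, status))
    print(f"{step_name}: {status}")  # reported immediately, not at the end
```

In this toy run the stand-alone `compile-util` step reports before anything else, because it depends on nothing, and the failing unit test is reported before the jar is packaged. The sketch runs steps one at a time; a real build server would presumably run independent steps in parallel, but the ordering principle would be the same.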
Although I’m tempted to think that this kind of really effective continuous integration would best be based on different build software, there are a lot of people working to improve things with Jenkins. A recent blog post from “Antagonistic Pleiotropy”, “Implementing a real build pipeline with Jenkins”, looks interesting, but shows just how tricky even a relatively straightforward build pipeline can be to configure.
Has anyone got any better suggestions on how to achieve effective feedback while building complex systems?