Assembla on Premature Integration or: How we learned to stop worrying and ship software every day

An excellent article from Michael Chletsos and Titas Norkunas at Assembla, which reminded me how important it is to keep anything which might fail or need rework off the master branch.

It’s a truism about software development that you never know where the bugs will be until you find them. This can be a real problem if you find bugs in an integrated delivery, as it prevents the whole bunch from shipping. Assembla have some interesting stats about how they have been able to release code much more frequently by doing as much testing and development as possible on side-branches.

Read more at: Avoiding Premature Integration or: How we learned to stop worrying and ship software every day.

Build pipelines with Jenkins

Continuous Integration is a great idea, and usually pretty simple to implement for simple projects. However, these simple projects don’t really exercise the “integration” aspect of the idea. As the build and test process for a project grows in complexity, it almost always grows in duration, too. Typical enterprise Java projects, for example, might fetch dependencies from Maven repositories, compile several code modules, copy, move and transform various resources, run unit tests, assemble and deploy jar files, start servers and run integration tests, and so on. All of this can take quite a while, even on a fast build server.



(cartoon from the great xkcd)

One big problem with growing build times is the effect they have on feedback. If a developer has to wait 10, 20, 30 minutes or more for a build cycle to complete before test results are available, it usually leads to one of three outcomes:

  • Every small change requires a concentration-breaking delay to see if it works before moving on to the next change. Development slows to a crawl, management cracks the whip and tries to ban casual web surfing, private email and Facebook.
  • Developers give up waiting for the CI results and press on with development anyway. The code base fills with bugs and issues. The CI process becomes largely irrelevant, as builds are almost always broken.
  • Developers hold off from checking in small code changes for fear of having to sit and wait for CI to catch up. As check-in size increases, so does the frequency of code clashes and the difficulty of merging different strands of work. Team culture shifts from collective ownership to silos and hoarding.

What’s needed is a way to get fast feedback, even when a full build takes a long time. Almost every team I have worked with in recent years has tried to achieve this, usually using the open source “Jenkins” (or its fork-parent “Hudson”) build server. So far this has never quite worked.

The main problem seems to be the monolithic nature of a Jenkins build. A build runs to completion (or to a fatal failure), accumulating build data and test results. Data and results are only available at the end. A more useful approach might be if build data and test results were made available as soon as possible, even while further build activity continues. Better still would be a way of adapting the build process to emphasise early feedback, preferring build steps which give feedback to those which are merely useful for further processing. That way a trivial compilation error or test failure in a stand-alone part of the code might give almost immediate feedback. This is not only useful because of the speed of feedback, but because of the effect it has on development habits. Faster feedback would come from code with less coupling and fewer dependencies – any developer wishing to progress more quickly would be automatically encouraged to write (or refactor towards) small, loosely-coupled, independent, well unit-tested, re-usable code.
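
The idea of preferring feedback-giving steps can be sketched in a few lines of Python. This is only an illustration, not how Jenkins or any real build server works: the `Step` class, its `gives_feedback` and `est_seconds` fields, and the step names are all hypothetical, and the sketch assumes the steps are independent of one another (a real build would also have to respect dependencies between steps).

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Tuple


@dataclass
class Step:
    """A hypothetical build step (names and fields are assumptions)."""
    name: str
    gives_feedback: bool      # True for checks and tests, False for packaging etc.
    est_seconds: float        # rough expected duration
    run: Callable[[], bool]   # returns True on success


def feedback_first(steps: List[Step]) -> List[Step]:
    """Order steps so that feedback-giving steps run first, quickest first."""
    return sorted(steps, key=lambda s: (not s.gives_feedback, s.est_seconds))


def run_pipeline(steps: List[Step]) -> Iterator[Tuple[str, bool]]:
    """Yield each result as soon as its step finishes, not at the end."""
    for step in feedback_first(steps):
        yield step.name, step.run()


steps = [
    Step("assemble jars", False, 60, lambda: True),
    Step("integration tests", True, 600, lambda: True),
    Step("lint", True, 5, lambda: False),
    Step("unit tests: utils", True, 20, lambda: True),
]

for name, ok in run_pipeline(steps):
    print(f"{name}: {'ok' if ok else 'FAILED'}")
```

Because `run_pipeline` is a generator, the failing `lint` step is reported before the slow integration tests have even started, which is the kind of early feedback described above.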

Although I’m tempted to think that this kind of really effective continuous integration would best be based on different build software, there are a lot of people working to improve things with Jenkins. A recent blog post from “Antagonistic Pleiotropy”, Implementing a real build pipeline with Jenkins, looks interesting, but shows just how tricky even a relatively straightforward build pipeline can be to configure.

Has anyone got any better suggestions on how to achieve effective feedback while building complex systems?

Deployment pipeline anti-patterns

It’s happened on most reasonably sized projects I have worked on. The benefits of test coverage and continuous integration are obvious and pay back immediately. But, somehow, as the project grows and diversifies, a point is reached where the complexity and run time of the CI process begins to slow down development rather than assist it.

Jez Humble has put together some interesting thoughts on how to deal with this issue. Read more at Deployment pipeline anti-patterns.

Optimise your team

Coping with difficult times is a topic of the moment. Jared from Agile Artisans writes about optimising a team.

Agile Artisans::home.

Tactics, Strategy and SOA in the cloud – conflicting views

I’m in two minds about Service-Oriented-Architecture (SOA). On the one hand it seems obvious that future systems will need to inter-operate increasingly in order to gain business benefits without requiring complete software development projects. On the other hand, I am distinctly under-impressed by the current approaches to SOA, and even by the emphasis on services rather than the equally applicable resources, messages, or processes as the integration building blocks.

Here are a bunch of conflicting views on this area which have collected in my “blog this” queue over the last few days:

InfoQ Article: Will Cloud-based Multi-Enterprise Information Systems Replace Extranets?

Will Cloud-based Multi-Enterprise Information Systems Replace Extranets? (confusingly, a different article with the same title!)

Meme Agora: Tactics vs. Strategy (SOA & The Tarpit of Irrelevancy)

Enterprise Java Community: Extend the Data Grid With Hub-less Messaging

Services & Workflows: SOAP and REST with WCF and WWF

Year of the cloud

SOA equals Integration?

Searching for the perfect project hosting

I’m still searching for decent project hosting. I now have several projects on the go, and several others bumping around in my head, and the fuss and bother of tying together all the various bits of a distributed software project development is making my head hurt.

All the bits I need are available separately, but so far I have not been able to find any single provider (free or paid for) which offers the combination of features I need. Essentially these are:

  • Version control. Ideally git, but at a pinch one of the other distributed VCS tools or even subversion would probably do if everything else was in place. GitHub seems good for this.
  • A project wiki. Using any other system for project docs just seems so clumsy. There are plenty of these; I use WikiDot for one project.
  • Sensible bug/feature tracking. This is a bit more tricky – there is plenty of bug-tracker software, but not much that works equally well for managing unimplemented feature stories and associated tasks. Ideally this should link in with the version control, allowing code and change metadata to be updated in one go. Trac seems a possibility for this.
  • Calendar management. For recording and communicating meetings, deadlines, etc. Something which works well with calendar syndication, so that anyone working on the project can see project events in with the rest of their appointments. Plenty of these: Google calendar, 30 boxes, etc. They all have their quirks, though.
  • Task (todo) management. I find it amazing that task management is so poor in on-line calendars. There are standalone task tools such as Remember The Milk, but it is integration which is needed.

There are also a few other features which are definitely in the “useful to have” category, but I’m practical enough to use manual or off-line tools if necessary.

  • Effort recording, tracking and reporting. For velocity tracking, process improvement, and even billing.
  • Collaborative planning and prioritisation. Mingle tries to simulate a task wall, but is somewhat clumsy and irritatingly expensive; I have heard of on-line tools to run “Planning Poker” sessions, but as usual, not integrated with anything else.
  • Continuous Integration. I’m not aware of any really smart tools to make use of distributed version control for this, yet. Our CruiseControl installation just stops and complains when something breaks, for example, but it should be possible to just “park” the failing patch and continue building with others in a real DVCS-based approach.
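
The “park the failing patch” idea from the Continuous Integration item might look something like the following Python sketch. It is not based on any existing CI tool: `build_with_parking` and the `builds_ok` callback are hypothetical names, and the sketch assumes each patch can be applied and built on top of whatever has already been accepted.

```python
from typing import Callable, List, Sequence, Tuple


def build_with_parking(
    patches: Sequence[str],
    builds_ok: Callable[[List[str]], bool],
) -> Tuple[List[str], List[str]]:
    """Try each patch on top of the accepted ones; park any that break the build.

    `builds_ok` is a stand-in for "apply these patches, build, run the tests".
    """
    accepted: List[str] = []
    parked: List[str] = []
    for patch in patches:
        candidate = accepted + [patch]
        if builds_ok(candidate):
            accepted = candidate      # the build stays green, keep the patch
        else:
            parked.append(patch)      # set it aside and carry on with the rest
    return accepted, parked


# Toy example: any patch set containing "bad-fix" fails to build.
accepted, parked = build_with_parking(
    ["feature-a", "bad-fix", "feature-b"],
    lambda applied: "bad-fix" not in applied,
)
print("accepted:", accepted)
print("parked:", parked)
```

The point of the design is that one broken patch no longer blocks the others: the build carries on with everything that still works, and the parked patch goes back to its author.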

If anyone has any suggestions – or wants to build a product which does all this stuff – please let me know!

For interest, here are a few associated links.

Cuberick: Distribute Your Software Just Like Ubuntu With Launchpad

Comparison of open source software hosting facilities: Wikipedia

Application Integration Through Mail Servers

A neat article, even though some people don’t seem to see the point of it. For me it’s a useful summary of some potential application issues around machine-machine email communication.

InfoQ: Application Integration Through Mail Servers

The REST Dialogues

When I first encountered Duncan Cragg’s “REST dialogues” I was not sure how they would develop. As I have read more, I have become progressively more impressed. Cragg uses the style of a Socratic dialogue with an imaginary “eBay architect” to teach about the nature and use of REST techniques as an alternative to more traditional approaches such as SOAP and SOA.

Recent dialogues on Business Conversations and The Distributed Observer Pattern are particularly thought-provoking, but the whole growing series is definitely worth a read.

The REST Dialogues

Testing web services with ActiveResource

When I first saw this it looked great: a Ruby REST wrapper which supports a lot of useful test and integration possibilities. However, the deeper I looked, the more disappointed I became. I’m now saddened to believe that this is based on yet another misunderstanding of what REST is.

As far as I can tell, the ActiveResource concept, on which this approach is based, is merely an attempt to impose a constrained CRUD data-model over HTTP. There is no concept of content negotiation – all data is XML according to pre-assumed schemas. There is no support for automatic discovery and use of hrefs between resources. There is even the suggestion that it’s a good idea to casually extend the HTTP protocol with extra custom methods.
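
To make the point about hrefs concrete, here is a small Python sketch of a client that discovers resources by following named links rather than building URLs from a fixed scheme. Everything in it is made up for illustration: the `RESOURCES` dictionary stands in for a server, and the `links` field, the link names, and the paths are assumptions, not part of ActiveResource or any real API.

```python
from typing import Any, Dict, List

# Hypothetical in-memory "server": each resource lists links to related resources.
RESOURCES: Dict[str, Dict[str, Any]] = {
    "/": {"links": {"orders": "/orders"}},
    "/orders": {"links": {"latest": "/orders/42"}},
    "/orders/42": {"status": "shipped", "links": {}},
}


def fetch(url: str) -> Dict[str, Any]:
    """Stand-in for an HTTP GET returning a parsed representation."""
    return RESOURCES[url]


def follow(start: str, rels: List[str]) -> Dict[str, Any]:
    """Start at one known URL and reach other resources only via their links."""
    doc = fetch(start)
    for rel in rels:
        doc = fetch(doc["links"][rel])
    return doc


latest = follow("/", ["orders", "latest"])
print(latest["status"])
```

The client knows only the entry point and the link names. If the server moves `/orders/42` somewhere else, the client keeps working as long as the links are updated, which is exactly what a hard-coded CRUD URL scheme gives up.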

All of these are typical problems found in projects which use the name REST without the key bits which make it really work. The end result is just another fragile, application-specific RPC protocol. Sigh.

nutrun » Blog Archive » Testing web services with ActiveResource

User stories in the Enterprise Integration space

The discipline of writing good user stories – ones which communicate clearly to all appropriate stakeholders, give enough information for effective discussion, yet leave enough freedom for innovative solutions – is surprisingly tough. Writing such stories for integration tasks is harder still. Shaun Jayaraj has some thoughts:

What to do? we are like this only: User stories in the Enterprise Integration space

Network Simulation and Emulation: Try It Before You Deploy It

A potentially interesting, although buzzword-laden, article about the benefits of setting up and using simulations to determine how complex integrated applications will behave. Needs some links to more detailed discussions, though.

Technology News: IT Management: Network Simulation and Emulation: Try It Before You Deploy It