If the scenario above sounds crazy to you – that’s OK. Unfortunately, I see it unfold daily.
I’ve noticed that software developers who ignore a broken build usually do not do so out of malice or laziness.
Unfortunately, a broken build means that although someone (perhaps you) took the time to automate parts (or all) of the build/test process, all of that hard work is wasted because no one will fix the damn build.
I’ve noticed that when the build system is left broken for a long time, it happens for one of the following reasons:
- No/little build visibility
- Lack of knowledge
- No definition of individual responsibility
Build Visibility
Ideally, every relevant member of the team should know when a build fails. Better yet, the whole company should have easy access to the current build state.
Consider the following:
- All of the team has access to the build server by URL
- An email is sent to the relevant person(s) when a build fails
- A 60-inch screen in the middle of the dev room shows the current build status
- When a build fails, a big red light mounted in the dev room/hallway blinks
- When a build fails, a picture of the person who broke the build shows on every screen in every conference room
If you think that installing a build server and making the URL available to the whole company is good enough – I’ve got news for you. People are way too busy to go to that URL and try to understand what the build server is showing them. Adding an email in case of failure is also a good idea, but it is not sufficient – after a few of those, some (read: most) developers learn to ignore them. If you add email notifications on successful builds, you’ll only make this process (of ignoring builds) happen faster.
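To make the point about notification noise concrete, here is a minimal sketch of a policy that stays silent on successful builds and speaks up only on failures. The function name and the "passed"/"failed" status strings are assumptions made for illustration; they do not come from any particular build server.

```python
from typing import Optional


def notification_for(previous: str, current: str) -> Optional[str]:
    """Return the message to send for a finished build, or None to stay quiet.

    Statuses are the hypothetical strings "passed" and "failed".
    """
    if current != "failed":
        # Successful builds send nothing, so people do not learn to ignore mail.
        return None
    if previous == "failed":
        # The build was already red: keep reminding until someone fixes it.
        return "Build is STILL broken"
    return "Build just broke"
```

Whatever channel carries the message (email, a wall screen, a blinking light), the idea is the same: every notification should mean that someone has to act.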
A failing build should be visible and impossible to ignore
Another important factor is how easy or hard it is to discover why the build has failed. Not all build servers were created equal – some do a better job of showing the root cause of the failure, and some require reading 10 pages of logs. My point is that fixing a broken build takes time away from what you actually need to do (developing software), so it should be as simple and painless as possible.
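If your build server leaves you with pages of logs, even a crude filter can shorten the hunt. The sketch below is only an illustration: the keyword list and the function name are assumptions, and real build tools each have their own error formats.

```python
import re
from typing import List

# Hypothetical keywords; adjust them to whatever your build tools actually print.
ERROR_PATTERN = re.compile(r"\b(error|failed|exception)\b", re.IGNORECASE)


def summarize_failure(log_text: str, max_lines: int = 5) -> List[str]:
    """Return the first few suspicious log lines, prefixed with their line number."""
    hits = [
        f"{number}: {line.strip()}"
        for number, line in enumerate(log_text.splitlines(), start=1)
        if ERROR_PATTERN.search(line)
    ]
    return hits[:max_lines]
```

For example, given a four-line log whose second line is a compiler error and whose last line is `BUILD FAILED`, the function returns just those two lines with their positions in the log.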
Missing Knowledge
This is usually a problem when the build script does too many things. Let’s go back to our imaginary scenario, where the build inspector shouts about a problem in one of the build’s components, and I’m not familiar with that component or don’t have the right expertise to fix that particular problem. In that case I’m going to continue working as if nothing happened – or go and grab a cup of coffee until the problem resolves itself.
The problem with a big build script that does a lot of things is that it’s hard to tell why a specific step (or 100 tests) has just failed, and then everyone on the team gets a bad case of “it’s somebody else’s problem”.
The right solution is to try and split the build into several individual builds where each team (and each team-member) knows exactly where their responsibility (development wise) starts and ends.
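One way to picture that split is an explicit map from each build step to the team that owns it, so a failure always lands on somebody’s desk. The step names, team names and the fallback owner below are all made up for the example.

```python
from typing import Dict, List

# Hypothetical ownership map: build step -> owning team.
OWNERS: Dict[str, str] = {
    "compile-core": "core-team",
    "ui-tests": "frontend-team",
    "package-installer": "release-team",
}

# Assumed fallback so that an unmapped step is never "somebody else's problem".
DEFAULT_OWNER = "build-maintainers"


def owners_of_failures(step_results: Dict[str, bool]) -> Dict[str, List[str]]:
    """Group the failed steps (value False) by the team responsible for them."""
    by_owner: Dict[str, List[str]] = {}
    for step, passed in step_results.items():
        if not passed:
            owner = OWNERS.get(step, DEFAULT_OWNER)
            by_owner.setdefault(owner, []).append(step)
    return by_owner
```

With a map like this, a red build stops being a message to “everyone” and becomes a message to the specific people who can fix it.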
Individual Responsibility
At the heart of a healthy process lie personal responsibility and integrity.
I would avoid shaming (e.g. showing the build breaker’s name on every conference room screen) and instead try to understand why people don’t care that the build is broken. Usually it has something to do with one of the previous points rather than a lack of commitment.
Conclusion
A broken build is not a pretty sight and should be fixed as quickly as possible. The good news is that it’s easily solved with the proper tools, education and plain old nagging, as long as you take the time to understand why other talented developers seem content to leave it broken.
Try it out – you might be surprised to find out that you’re not the only one who cares.
Until then – happy coding…