Freshly hired pilot fish at a Canadian office of this company gets his first assignment: Write an agent that will handle failover for a high-availability cluster controller. Got that?

"It's basically a script to automate failover of a disk array from one server to another," explains fish.

"I write the basic code in a day or two, only to find that although it works most of the time, as with all complex applications, the documentation doesn't quite match what actually happens."

That's no big deal when it's under human control, but it's a disaster for something that's supposed to be fully automated. So fish takes his quick-and-dirty code and sets to work making it bombproof.
