Monitoring Sucks (and what you can do about it)

I want to convince a probably skeptical audience of developers that:

  • Monitoring running software is sexy (and important)
  • The state of the art for monitoring software sucks
  • Riches (or at least really interesting technical challenges) await those who help fix it

Once we've shipped our latest software product, we should want to keep an eye on how it behaves out in the real world. Too often this task is an afterthought, completely separate from the development process and focused only on what you get for free with one tool or another. This is because even the best tools we have suck.

I'll show some examples:

  • Truly horrible web user interfaces
  • Systems that don't work within a dynamic cloud environment
  • Horrible solutions to scaling
  • Excessively verbose configuration languages

I'll then talk about what the community is doing to fix this problem, and why developers should join the effort. I'll show some hopeful signs:

  • Interesting new tools
  • Better ways to integrate monitoring into the development process
  • The #monitoringsucks open source research

For those intrigued, it's a perfect opportunity to dive head-first into some really interesting programming challenges:

  • Dynamic, distributed systems
  • High traffic, low latency software
  • Visualisation
  • Big data analysis
  • Statistics and prediction

Gareth is a developer and occasional systems administrator, currently working for the Government Digital Service on the single domain for all UK government sites, GOV.UK. He also curates devopsweekly, a weekly email newsletter for people interested in web operations. He'll automate anything that isn't nailed down.

Online you can find him writing code in a variety of languages. Today that's Ruby and Clojure, yesterday it was Python and tomorrow it will likely be something else.