Today I made intermission public. As I mentioned in my post about mysql_role_swap we’ve been working hard to limit / eliminate the impact our operations maintenance tasks have on our customer’s experience.
A few people noticed the /tmp/hold “leftover” in mysql_role_swap script. intermission is a product of that early exploration with coordinating database maintenance with request pausing in the web application tier. I’ve done a good bit of non production testing with intermission, but only limited production testing.
Last Friday we used intermission with mysql_role_swap to move Writeboard’s database to a new server. We had a single user facing exception, and we think it was likely caused by something other than the maintenance. For Friday’s maintenance we enabled request pausing via intermission, ran mysql_role_swap, restarted the unicorn (rails) processes, and then unpaused the requests. Total maintenance time was just a few seconds!