We all know the power of a classy reboot now and again, right? Well, apparently Facebook recently used one to “fix” their global application:
The way to stop the feedback cycle was quite painful – we had to stop all traffic to this database cluster, which meant turning off the site.
Once the databases had recovered and the root cause had been fixed, we slowly allowed more people back onto the site.
Whoops. You can read the full (and somewhat shameful) blog post from Facebook Engineering.