Facebook has not had a good week. It has now suffered a second wave of downtime, the worst it has faced in four years. This time the site kept throwing error messages, which were caused by an internal system error.
In a blog post, the company explained that the outage was caused by an error in an automated system used by the website. This system is meant to find invalid configuration values in the cache and replace them with updated values from a persistent database that is normally correct. During the outage, however, the cache was filled with erroneous values from that persistent database: Facebook had made a small change to a value stored there, and the system interpreted the changed value as invalid.
Once this happened, every client saw the invalid value and attempted to fix it. Unfortunately, the fix involves querying a cluster of databases, and that cluster was quickly overloaded by hundreds of thousands of queries per second.
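To illustrate why this kind of self-repair can spiral, here is a minimal Python sketch of the feedback loop described above. It is not Facebook's code: the `ConfigSystem` class, the `is_valid` rule, and the `"setting"` key are all hypothetical, and the sketch assumes that a failed repair leaves the cache empty so the next client tries again.

```python
def is_valid(value):
    # Hypothetical validity rule: a configuration value must be a positive integer.
    return isinstance(value, int) and value > 0


class ConfigSystem:
    """Toy model of a cache in front of a persistent configuration store."""

    def __init__(self, persistent_value):
        self.persistent_value = persistent_value  # the authoritative copy
        self.cache = {"setting": persistent_value}
        self.db_queries = 0  # how many times clients hit the database

    def read(self):
        """One client read, repairing the cache if the value looks invalid."""
        value = self.cache.get("setting")
        if value is None or not is_valid(value):
            # Repair path: ask the database for a fresh value.
            self.db_queries += 1
            value = self.persistent_value
            if is_valid(value):
                self.cache["setting"] = value
            else:
                # Assumption: the "fix" is itself invalid, so the key is dropped
                # and the next client will query the database again.
                self.cache.pop("setting", None)
        return value


def simulate(persistent_value, clients):
    """Return the number of database queries caused by `clients` reads."""
    system = ConfigSystem(persistent_value)
    system.cache["setting"] = -1  # start from a bad cached value
    for _ in range(clients):
        system.read()
    return system.db_queries


if __name__ == "__main__":
    print("good persistent value:", simulate(42, 1000), "database queries")
    print("bad persistent value: ", simulate(-1, 1000), "database queries")
```

In this toy model, a valid persistent value costs a single database query, because the first client repairs the cache for everyone else. When the persistent value is itself considered invalid, every read falls through to the database, which is the pattern that would overload a database cluster at scale.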
KitGuru says: Facebook has been under pressure recently, especially after the high-profile attack by a 17-year-old Australian hacker.
They have been messing things up lately, but at least this will help them sort out security and bugs. Maybe they will even hire a larger team now that their worth is growing quickly.