Facebook blamed a massive outage that stretched into Thursday morning on a server configuration change.
The world’s largest social network said it has resolved the issue, which affected millions of Facebook, Instagram and WhatsApp users starting Wednesday.
“We made a server configuration change that triggered a cascading series of issues. As a result, many people had difficulty accessing our apps and services,” Facebook said in a statement. “Our systems have been recovering over the last few hours.”
Server configuration describes how the parameters of a computer system or program are set up. These, in turn, can govern how traffic is routed and what happens to it within the system.
The sheer complexity of such massive networks can make it difficult to figure out what’s causing a problem, especially if it’s intermittent, said Sandy Bird, chief technical officer of computer security firm Sonrai Security. “I’m sure there were a lot of people up all night working on this.”
Image of a Facebook logo taken on a mobile phone. (Photo: Loic Venance, AFP/Getty Images)
Internet companies experience occasional outages, but this one was remarkable for its global spread and duration, lasting nearly 24 hours. The outage intermittently interrupted service on all of Facebook’s apps, making it the worst outage in the company’s history.
In an unrelated development, Facebook announced Thursday that two key executives, including its chief product officer, were leaving the company. Chris Cox, one of Facebook’s highest-ranking executives and one of its earliest employees, having joined in 2005, and Chris Daniels, the head of WhatsApp, have resigned, Facebook CEO Mark Zuckerberg said in a post.
Zuckerberg said he had no “near-term” plans to replace Cox. Daniels, who has run WhatsApp since the departures of the app’s founders Jan Koum and Brian Acton over disagreements with Zuckerberg last year, is being replaced by Will Cathcart.
These are just the latest executive departures as Facebook grapples with a string of troubles, including Wednesday’s marathon outage.
A Facebook outage in 2008 that lasted about a day affected many of its then 80 million users. Today, Facebook has about 2.3 billion users who log into the service at least once a month. Facebook estimates that 2.7 billion people use its various apps and more than 2 billion log into these services every day.
The problems began spreading Wednesday at about 11 a.m. EDT. Users reported service disruption around the globe, but the outage appeared to be most pronounced on the East Coast and in the U.K., according to DownDetector, a service that monitors outages. Facebook said Thursday the problem was triggered by a change it made to its server settings on Wednesday.
“Yesterday, as a result of a server configuration change, many people had trouble accessing our apps and services. We’ve now resolved the issues and our systems are recovering. We’re very sorry for the inconvenience and appreciate everyone’s patience,” Facebook (@facebook) tweeted on March 14, 2019.
Some Facebook users got a message that the service was down for maintenance. Others could log on but their news feeds were empty or they could not post updates. At times, updates from friends were visible but users could not comment or like them. On Instagram, profiles would not load. WhatsApp users reported not being able to send messages.
The service interruption had a substantial ripple effect, affecting access to some services where users log in with their Facebook credentials. Users of Oculus virtual reality devices also reported having problems.
Frustrated users flocked to complain on Facebook rival Twitter, where one of the trending topics on Wednesday was #FacebookDown.
This is not the first time server configuration changes have wreaked havoc and, given the complexity of these systems, it’s unlikely to be the last.
For a network as large as Facebook, resetting configurations is a delicate operation. Modern computer systems are built with “phenomenal amounts” of redundancy and self-healing embedded in them, Bird said.
That can mean that any small change can replicate across the entire network, creating a cascade of errors. “You can make one mistake and [it] has consequences everywhere,” he said.
It’s not known what specific error occurred. It might have been something as simple as creating a rule that if a server is having performance problems, it restarts itself. But the system might interpret a server that’s in the process of restarting as experiencing poor performance, and shut it down again, creating a loop that spreads throughout the network.
“So the thing that’s supposed to self-heal things actually makes them worse,” Bird said.
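The restart loop described above can be illustrated with a toy simulation. This is only a sketch of the general failure mode, not a reconstruction of Facebook's systems, whose details were not disclosed. The `Server` class, the `run_watchdog` function, the `grace_period` flag and the tick counts are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

BOOT_TICKS = 3  # hypothetical: time steps a server needs to finish restarting


@dataclass
class Server:
    boot_remaining: int = 0  # 0 means the server is fully up
    restarts: int = 0

    def is_slow(self) -> bool:
        # A server that is still booting responds slowly, so a naive
        # health check cannot tell "restarting" from "unhealthy".
        return self.boot_remaining > 0

    def restart(self) -> None:
        self.boot_remaining = BOOT_TICKS
        self.restarts += 1

    def tick(self) -> None:
        # Advance one time step; a booting server gets closer to ready.
        if self.boot_remaining > 0:
            self.boot_remaining -= 1


def run_watchdog(ticks: int, grace_period: bool) -> Server:
    """Apply the rule "restart any server that looks slow" for `ticks` steps."""
    server = Server()
    server.restart()  # a single initial hiccup triggers the first restart
    for _ in range(ticks):
        server.tick()
        still_booting = server.boot_remaining > 0
        # With a grace period, a server that is mid-restart is left alone.
        if server.is_slow() and not (grace_period and still_booting):
            server.restart()
    return server


naive = run_watchdog(ticks=10, grace_period=False)
guarded = run_watchdog(ticks=10, grace_period=True)
print(naive.restarts, naive.is_slow())      # 11 True: never finishes booting
print(guarded.restarts, guarded.is_slow())  # 1 False: recovers after one restart
```

In the naive run the rule restarts the server on every step, so it never comes back up. Exempting a server that is already restarting breaks the loop in this toy model. Real systems involve many interacting servers and rules, which is part of why such faults are hard to diagnose.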
Last November, some but not all users were hit with a 13-hour outage on Facebook, Instagram and WhatsApp. At the time, Facebook said, “Earlier today, a server configuration caused intermittent problems across all apps globally creating a degraded experience for users. The issue has since been resolved, we are back to 100 percent for everyone and we’re sorry for any inconvenience.”
Instagram announced early Thursday that service had been restored. At 12:41 a.m. EDT, Instagram alerted its users with a tweet that said, “Anddddd… we’re back” accompanied by a jubilant gif of Oprah Winfrey.
The outage affected millions of advertisers who rely on Facebook and Instagram to connect directly with consumers. Brand marketers tweeted Wednesday that Facebook’s ad-buying system was also down. Facebook said Wednesday that it might offer refunds to some advertisers.
The downtime was likely very costly for Facebook. Based on analysts’ 2019 sales estimates, the company is projected to generate average daily revenue of about $189 million.
It was not the only bad news Facebook received Wednesday. The New York Times, citing unnamed sources, reported that a New York grand jury has issued subpoenas as part of an investigation into Facebook’s consumer data-sharing deals with other tech companies including Amazon, Apple, Microsoft and Samsung.