Several websites and mobile applications were taken down Tuesday by a configuration error in the systems of a content delivery network provider.
At approximately 5:49 a.m. Eastern time in the United States, Fastly, a provider that serves companies such as CNN, The Guardian, the New York Times, Hulu, Reddit, HBO Max and Spotify, suffered an outage. Service returned to normal at 6:39 a.m. Eastern time.
According to National Public Radio, visitors who attempted to access CNN.com during the outage were greeted with the message “Fastly error: unknown domain: cnn.com.” An “Error 503 Service Unavailable” message appeared on the websites of the New York Times and the United Kingdom’s government, accompanied by the phrase “Varnish cache server.” Varnish is a caching technology that Fastly uses.
An official from Fastly replied to TechNewsWorld’s inquiry regarding the outage by issuing the following statement: “In our worldwide network, all Fastly cache nodes have now been restored to their previous status. We discovered a service configuration that was causing interruptions across all of our points of presence across the world, and we deactivated that configuration immediately.”
Content Distribution Networks (CDNs)
Fastly is what the industry calls a content delivery network. CDNs have existed for more than two decades, and they have grown in scope over that period.
“The vast majority of information on the internet that consumers engage with is delivered to them via content delivery networks,” observed Doug Madory, director of internet analysis at Kentik, a network observability firm based in San Francisco.
In his interview with TechNewsWorld, he said that there has been “some concentration in the business,” and that when there is an outage, “it may knock down a lot of things.”
Andy Champagne, senior vice president in the office of the chief technology officer of Akamai, a content delivery and cloud security company based in Cambridge, Massachusetts, said that content providers literally could not pump out all of their material from a single place.
“You can’t create a place large enough, linked enough, and near enough to everything,” he told TechNewsWorld. “That is why we have about 300,000 servers across the globe that are dedicated to content distribution.”
In addition, “everyone who is a major brand nowadays, and even lesser companies, are utilizing content delivery networks to disseminate their material,” he said.
“One of the difficulties of using the internet is that its size may catch you off guard,” he added. “Something may suddenly become very popular without warning. People may find themselves wanting to download it, listen to it, play it, watch it, or purchase it all of a sudden. That’s where content delivery networks (CDNs) may truly assist. They have the ability to scale up immediately.”
Bringing Latency Down
A content delivery network hosts frequently loaded content, such as images for other websites or entire websites, in a distributed manner to allow for faster load times, according to Jonathan Tanner, a senior security researcher at Barracuda Networks, a security and storage solutions provider based in Campbell, California.
As he explained to TechNewsWorld, “basically, they would host the same material in different data centers around the globe, and when a user goes to a website which loads content from the CDN, they will load that content from the data center that is nearest to that user.”
“This relieves the CDN customer’s bandwidth load by avoiding the loading of large files from the CDN customer’s own servers, and it also enables lower latency for the users by serving content from a geographically closer location to the user than where the CDN customer’s website is being hosted,” he explained.
To accomplish the same result, “the CDN client could host copies of their whole site in different data centers,” he said, “but doing so would involve a significant amount of expense in comparison to just engaging a firm like Fastly that does this on a large scale.”
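The nearest-data-center idea Tanner describes can be illustrated with a minimal Python sketch. The three locations and the function names below are hypothetical, and choosing by straight-line distance is a simplifying assumption: real CDNs typically steer users with mechanisms such as DNS or anycast routing rather than an explicit distance calculation.

```python
import math

# Hypothetical points of presence: name -> (latitude, longitude) in degrees.
POPS = {
    "new-york": (40.71, -74.01),
    "london": (51.51, -0.13),
    "tokyo": (35.68, 139.69),
}


def great_circle_km(a, b):
    """Haversine distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_pop(user_location, pops=POPS):
    """Return the name of the point of presence closest to the user."""
    return min(pops, key=lambda name: great_circle_km(user_location, pops[name]))
```

In this sketch a user in Paris, at roughly (48.86, 2.35), would be served from the London location rather than from the customer's own origin server.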
Disaster Becomes More Extensive
The CDN service configuration that caused the outage at Fastly has not been made public. However, CDNs have a lot of moving parts, and their systems are continuously being changed.
To ensure that an update does not create a problem, “a provider often evaluates the changes in phases,” Madory said. For the sake of convenience, however, providers may sometimes make changes on the fly that are not subjected to the same rigorous testing.
According to Tanner, a bad configuration may cause software to crash completely, or it may stop the software from functioning correctly by preventing it from accessing required resources. Either outcome would result in an outage.
“By the very nature of how content delivery networks (CDNs) operate, the same code and material is being housed in many different data centers across the globe,” he said. If a bad configuration is deployed, it can therefore spread to all of those data centers and cause an outage.
He said that CDNs may be more resilient to outages than other types of systems because, if one data center goes down, users are routed to the next-closest data center for content.
However, he added, “a failure with the core software across all data centers would very certainly cause the whole service to fall down.”
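Tanner's point about failover and its limits can be shown with a short sketch. The data center names and the health map are hypothetical, and the function is an illustration under those assumptions, not a description of Fastly's systems.

```python
def pick_data_center(ranked_by_distance, healthy):
    """Return the closest healthy data center, or None if none is healthy.

    ranked_by_distance: data center names, closest to the user first.
    healthy: mapping of data center name -> True if it can serve traffic.
    """
    for name in ranked_by_distance:
        if healthy.get(name, False):
            return name
    return None


# Hypothetical data centers, ordered from nearest to farthest for one user.
ranked = ["london", "new-york", "tokyo"]

# One site is down: traffic falls through to the next-closest site.
one_down = {"london": False, "new-york": True, "tokyo": True}

# A bad configuration reached every site: there is nowhere left to fail over to.
all_down = {name: False for name in ranked}
```

With `one_down`, the function returns "new-york"; with `all_down`, it returns `None`, which corresponds to the service-wide failure Tanner describes.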
Upgrade at a Slower Pace
If there is a lesson to be learned from the Fastly outage, it is that distributed networks continue to play a key part in how the internet works, and that it is essential to ensure the software running in distributed systems functions correctly.
As Tanner pointed out, the incident “hopefully highlighted a key lesson about how to properly manage upgrades in the future.” “That is, rather than targeting every data center at once, it is preferable to gradually roll out software and ensure that it is functioning correctly before implementing a significant change.”
“For CDNs or any other distributed architectures, ensuring that software and configuration upgrades are done in a gradual way, rather than to all data centers at the same time, would undoubtedly assist to avoid these types of disruptions in the future,” he said.
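The gradual approach Tanner recommends might look something like the following sketch. The function, its parameters and the batch sizes are hypothetical; a production rollout would also need monitoring over time and a way to roll back the data centers already updated, which this example leaves out.

```python
def staged_rollout(data_centers, apply_config, is_healthy, stage_sizes=(1, 2)):
    """Apply a change in growing batches, stopping at the first unhealthy batch.

    apply_config and is_healthy are caller-supplied callables that take a
    data center name. Returns (updated, halted): the data centers that
    received the change, and whether the rollout stopped early.
    """
    updated = []
    remaining = list(data_centers)
    sizes = list(stage_sizes)
    while remaining:
        # Use the configured batch sizes first, then everything that is left.
        size = sizes.pop(0) if sizes else len(remaining)
        batch, remaining = remaining[:size], remaining[size:]
        for dc in batch:
            apply_config(dc)
            updated.append(dc)
        # Verify the batch before touching any further data centers.
        if not all(is_healthy(dc) for dc in batch):
            return updated, True
    return updated, False
```

With five data centers and a configuration that fails its health check, only the first data center is affected before the rollout halts, instead of all five at once.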
It would also be beneficial for those who use content delivery networks (CDNs) to have an action plan in place in the case of an outage in order to minimize downtime, he said.
Fastly isn’t the only company that has had a high-profile outage.
Amazon Web Services suffered a cyberattack in October that left its customers without access to vital information for more than ten hours. AWS service was interrupted for customers on the United States East Coast in November due to a problem on the AWS cloud computing platform. Cloudflare customers reported visitors having difficulty accessing their websites and services in July.