Cloudflare Outage: What Caused the Widespread Disruption?

On June 12, 2025, Cloudflare suffered a major outage that disrupted critical services used around the world. The disruption originated in the storage layer behind the Workers KV system, and the failure cascaded across Cloudflare’s platform and even touched other providers, such as Google Cloud. The service disruption lasted nearly two and a half hours. Cloudflare moved quickly to restore service and has since started hardening its infrastructure. Let’s look at what happened, who was affected, and what caused the Workers KV storage failure behind this outage.

Key Events and Impact of the Cloudflare Outage: Even Google Was Affected

The Cloudflare outage had a significant impact on services that normally run without issue. At 17:52 UTC, Cloudflare engineers noticed that new devices could not register with WARP. The problem quickly grew into a much broader service disruption, spreading to core products such as Access, Gateway, and Workers KV. Asset delivery and authentication both suffered, so users could not log in or transfer files as they normally would.

The problems did not stay inside Cloudflare’s systems. The effects spread to other major global services, including Google Cloud, widening the reach of the outage. The incident showed how tightly cloud technologies are interconnected and how one failure can quickly trigger others.

Which major services and websites were affected?

Cloudflare’s outage impacted a broad array of services, including Workers AI, Durable Objects, and WARP, halting key functionality. With Workers KV reporting failure rates above 90%, authentication requests, asset delivery, and real-time operations for Workers Assets all faced critical problems. Turnstile, the CAPTCHA verification tool, and Gateway were also non-operational during the incident. The table below summarizes the impact on each service.

| Service | Impact Details |
| --- | --- |
| Workers KV | 90.22% failure rate on reads and writes; caused cascading failures across dependent services in real time. |
| Gateway | DNS over HTTPS (DoH) queries with identity-based rules failed; authenticated sessions were disrupted. |
| Turnstile | CAPTCHA verification failures; kill switch activation introduced token-reuse risks. |
| Browser Isolation | Failed to initiate secure web sessions; impacted businesses relying on policy enforcement tools. |
| Stream Services | Live streaming stopped; video playlist serving failed with an error rate of roughly 90%. |
| Dashboard Access | User login sessions failed due to dependency outages in the Turnstile and Access authentication modules. |

This list of failures underscores how many of Cloudflare’s products depend on the Workers KV service.
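To make that dependency concrete, here is a minimal sketch (TypeScript, Cloudflare Workers module syntax, assuming the @cloudflare/workers-types ambient types) of how a Worker that treats Workers KV as a hard dependency turns a KV outage into its own errors, and how caching the last good read softens the failure. The binding name CONFIG_KV and the key name are hypothetical illustrations, not Cloudflare’s actual code.

```ts
export interface Env {
  CONFIG_KV: KVNamespace; // hypothetical KV binding
}

// In-isolate cache of the last good value; survives only while the isolate lives.
let lastGoodConfig: string | null = null;

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    try {
      const config = await env.CONFIG_KV.get("tenant-config");
      if (config !== null) {
        lastGoodConfig = config; // remember the last successful read
        return new Response(config, { status: 200 });
      }
    } catch (err) {
      // During the June 12 incident, reads like this failed roughly 90% of the time.
      console.error("KV read failed:", err);
    }

    // Degrade gracefully: serve stale data instead of propagating the failure upstream.
    if (lastGoodConfig !== null) {
      return new Response(lastGoodConfig, {
        status: 200,
        headers: { "X-Config-Stale": "true" },
      });
    }
    return new Response("Configuration temporarily unavailable", { status: 503 });
  },
};
```

Without the stale-read fallback, every request in this sketch would fail whenever KV does, which is exactly the cascading pattern described above.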


How did the outage disrupt users and businesses in the United States?

Users and businesses in the U.S. faced serious problems during the Cloudflare outage. Core functions such as API operations and asset delivery stopped working correctly. Authentication pipelines for Access and Gateway returned errors, making it difficult for users to log in or verify their identity. The KV service became slow or unavailable, which in turn broke browser isolation tools.

Many businesses rely on Cloudflare’s AI tools. The outage took Workers AI and AutoRAG offline, breaking machine learning routines and document indexing, so AI workflows around the world stalled.

The disruption hit consumer-facing services, too. Streaming platforms struggled to deliver video, and playback dropped sharply. Image upload queues also failed, with failure rates approaching 100%. As login problems mounted and secure API connections degraded, businesses saw productivity fall for the duration of the outage.

Root Causes Behind the Cloudflare Outage

The root cause of the outage was a failure in the storage infrastructure behind Cloudflare’s Workers KV. That system relied on a third-party provider for part of its backing store. When the provider suffered its own outage, the KV storage for static assets stopped working correctly. The failure of this central store drove up error rates, and cold reads and writes could no longer be served.

Cloudflare acknowledged that this reliance on another company exposed weak points in its storage architecture. To remove that dependency, Cloudflare plans to migrate Workers KV to its own R2 object storage, making the service more resilient and able to keep running even if one component fails.
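As an illustration of that direction, here is a minimal sketch (TypeScript, Cloudflare Workers) of a dual-backend read: try Workers KV first, and fall back to a mirrored copy in an R2 bucket if KV is unavailable. The bindings ASSETS_KV and ASSETS_R2 and the mirroring scheme are hypothetical; Cloudflare has not published the details of its migration design.

```ts
export interface Env {
  ASSETS_KV: KVNamespace; // hypothetical KV binding
  ASSETS_R2: R2Bucket;    // hypothetical R2 binding holding a mirrored copy
}

// Read a value from KV, falling back to R2 object storage when KV is down.
async function readWithFallback(env: Env, key: string): Promise<string | null> {
  try {
    const fromKv = await env.ASSETS_KV.get(key);
    if (fromKv !== null) return fromKv;
  } catch (err) {
    console.warn("KV unavailable, falling back to R2:", err);
  }
  // Fallback path: the same object mirrored into the R2 bucket.
  const fromR2 = await env.ASSETS_R2.get(key);
  return fromR2 ? await fromR2.text() : null;
}
```

The design choice here is simple: as long as either backend is healthy, reads keep succeeding, which is the resilience property the migration is meant to provide.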

Was the incident linked to a security breach or technical failure?

Cloudflare confirmed that the outage was not caused by a security breach; no data was exposed or altered. The root cause was a technical failure in the underlying storage infrastructure used for Workers KV, which affected both the availability and the latency of the KV service.

The problem was compounded because Cloudflare’s storage infrastructure relied on a third-party cloud provider that was experiencing its own issues at the same time. With no backup path available, the failure spread to every service that depended on this storage, amplifying the errors.

There was no attack, but Thursday’s outage exposed weaknesses in the storage infrastructure that Cloudflare intends to fix. In response, Cloudflare is accelerating the migration of Workers KV storage to its own R2 object storage, which should make the storage layer more reliable and reduce future risk.

What steps did Cloudflare take to identify and resolve the issue?

Cloudflare responded to the outage with a clear sequence of steps. The team classified the incident as priority P1 at 18:05 UTC and, once the scale of the impact was clear, escalated it to P0, the highest severity. Engineers traced the main problem to Workers KV and its storage infrastructure, then began evaluating alternative ways to store the data.

To limit the damage, Cloudflare activated kill switches, shutting off non-essential features while keeping core traffic flowing. The Gateway, Access, WARP, and Dashboard teams put degradation plans into action to keep their systems running, even if only in a limited way, and the company began fixing some of the issues by 19:12 UTC.
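The kill-switch pattern itself is straightforward. Below is a minimal sketch in TypeScript, assuming a hypothetical flag store: non-essential features check a flag before running, so operators can shed dependent work during an incident while core traffic keeps flowing. The flag names and the bypass behavior are illustrative only, not Cloudflare’s actual configuration.

```ts
type FeatureFlags = { [feature: string]: boolean };

// Hypothetical source of flags; in practice this might be an environment
// variable, a config push, or an edge-cached control-plane document.
async function loadFlags(): Promise<FeatureFlags> {
  return { captchaVerification: false, liveAnalytics: false, coreProxy: true };
}

async function handle(_request: Request): Promise<Response> {
  const flags = await loadFlags();

  if (!flags.captchaVerification) {
    // Kill switch active: skip the dependent service instead of failing the request.
    // (As noted above, this trade-off can introduce token-reuse risk.)
    return new Response("Verification temporarily bypassed", { status: 200 });
  }

  // Normal path: call the verification service here.
  return new Response("OK", { status: 200 });
}
```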

As repairs progressed, Cloudflare was careful not to overload the recovering system: caches were refilled gradually to avoid causing a new wave of failures. Teams made changes in real time through their operational dashboard, giving more people visibility into the recovery, and users were able to return to the platform smoothly once the outage ended.
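Gradual cache refill is a common way to avoid a thundering herd against a recovering backend. The sketch below (TypeScript) repopulates keys in small batches with a pause between batches; the batch size and delay are illustrative values, not Cloudflare’s actual parameters.

```ts
// Warm a cache gradually: refill keys in small concurrent batches, pausing
// between batches so the recovering backing store sees a ramp, not a spike.
async function warmCache(
  keys: string[],
  fetchAndCache: (key: string) => Promise<void>,
  batchSize = 50,
  pauseMs = 500,
): Promise<void> {
  for (let i = 0; i < keys.length; i += batchSize) {
    const batch = keys.slice(i, i + batchSize);

    // Refill one small batch concurrently; tolerate individual failures.
    await Promise.allSettled(batch.map(fetchAndCache));

    // Pause before the next batch to keep load on the backend bounded.
    await new Promise((resolve) => setTimeout(resolve, pauseMs));
  }
}
```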

Conclusion

The recent Cloudflare outage shows just how interconnected the online world is, and how a problem in one service can ripple into a widespread disruption. Many major services were hit hard, leaving users frustrated. To stay resilient, both businesses and individuals need to understand why these failures happen. Cloudflare worked quickly to diagnose and fix the issues, but the incident underlines why everyone needs backup plans.

Assessing risks and acting before things go wrong lessens the impact of an outage. Staying alert and prepared will help you handle whatever disruption comes next. For more tips on dealing with service disruptions, reach out for a consultation.

Frequently Asked Questions

Why do Cloudflare outages have such a widespread effect on the internet?

Cloudflare is a critical dependency for many internet applications, acting as a gateway and as a CDN for asset delivery. When Cloudflare has an outage, every service that relies on its authentication and configuration is affected, which can cause websites and systems around the world to fail.

How long did the June 2025 Cloudflare outage last in UTC?

The Cloudflare outage on June 12, 2025, lasted 2 hours and 28 minutes, starting at 17:52 UTC and ending at 20:28 UTC. Some Cloudflare and Google services began to recover before the incident was fully resolved, according to both companies’ reports on the outage.

What can businesses do to mitigate risks during future outages?

Businesses can build redundancy plans and invest in resilient infrastructure. They can also design systems to degrade gracefully when individual components fail. Adding failover paths and alerting helps you route around problems and keeps operations running when primary services, such as Cloudflare, are in trouble.
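As one example of such a failover path, here is a minimal sketch (TypeScript, assuming a runtime that supports AbortSignal.timeout, such as modern Node.js or Cloudflare Workers) that tries a primary endpoint with a timeout and fails over to a backup origin. The URLs are placeholders; real failover usually also involves DNS or load-balancer policy.

```ts
// Try the primary origin with a short timeout, then fail over to a backup.
async function fetchWithFailover(path: string): Promise<Response> {
  const primary = `https://primary.example.com${path}`; // placeholder URL
  const backup = `https://backup.example.net${path}`;   // placeholder URL

  try {
    // Abort the primary attempt after 2 seconds instead of hanging on an outage.
    return await fetch(primary, { signal: AbortSignal.timeout(2000) });
  } catch (err) {
    console.warn("Primary unreachable, using backup:", err);
    return fetch(backup);
  }
}
```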

Has Cloudflare implemented changes to prevent similar disruptions?

Yes. Cloudflare has moved quickly to reduce its dependence on third-party storage providers by migrating Workers KV to its own R2 object storage. It has also introduced new safeguards and tooling for gradual, staged recovery after an incident, which limits the damage that a single failure can cause across its services.

Does the U.S. government rely on Cloudflare’s services?

Yes. Several government agencies use Cloudflare tools such as the Web Application Firewall (WAF), authentication systems, and Zero Trust services. These tools help keep data safe and are important for keeping government websites and systems available.
