What runs on AWS? A field guide to the cascade

9 min read · Published Apr 26, 2026
Contents · 6 sections
  1. The us-east-1 problem
  2. What runs on AWS — the consumer-side view
  3. How to read an AWS outage
  4. What's not on AWS (a partial list)
  5. Why this matters for your status checking
  6. FAQ

On December 7, 2021, an AWS internal-network event in us-east-1 knocked a large share of the consumer internet offline for several hours. Disney+ stopped streaming. Slack stopped messaging. Coinbase stopped trading. Netflix largely stayed up because it had spent a decade engineering for exactly this. The point of an outage like that isn't that "AWS went down" (outages happen). It's that everyone went down together, and the average user had no idea those services shared a single AWS region in Northern Virginia.

This post is the dependency map. Skim the table, skip to "How to read an AWS outage" if one is happening as you read, and bookmark /infra/aws, where we keep the live cascade view.

The us-east-1 problem

AWS has 30+ regions. One of them, us-east-1 in Northern Virginia, carries an outsized share of the workloads AWS hosts, for a few reasons that compound:

The third reason is the underrated one. The February 2017 S3 outage in us-east-1 took down sites whose operators didn't think they depended on S3 there. It turned out their CI/CD pipelines or their image hosts did, so "we run in Ireland" wasn't enough.

What runs on AWS — the consumer-side view

Here's the slice that's relevant to "I just want to know if my Netflix is going to work tonight." We curate this list from public statements, official tech-stack pages, and AWS case studies, never from guesses. There are thousands of AWS-hosted services we don't track; the ones below are the ones that draw search volume during outages.

Service | What's on AWS | Cascade impact
Netflix | Almost everything except CDN edges (Open Connect, their own boxes inside ISPs). Famously runs across multiple AWS regions; can survive single-region outages because they engineered for it. | Streaming usually keeps working: you may see a slow homepage or buffering, but not a full outage.
Disney+ / Hulu | Disney Streaming runs primarily on AWS. Streamlining onto Disney's "BAM" tech (acquired from MLB) consolidated infra in us-east-1. | Goes down with us-east-1. December 2021 took both offline for hours.
Twitch | Owned by Amazon; runs entirely on AWS. Live ingest sits in regional clusters; chat in us-east-1. | Chat fails first when us-east-1 hiccups, often before video does.
Reddit | Web + API on AWS. Static assets via Fastly. Image hosting (i.redd.it) on AWS. | Goes down with AWS. The mobile app falls back to a "we're having trouble" screen.
Slack | AWS; Slack's status page mentions it directly. Multi-region but heavy in us-east-1. | Connections drop, message delivery degrades. Reconnects often surge once AWS recovers and produce a thundering-herd lag of their own.
Coinbase / Robinhood | Both on AWS. Trading + matching engines colocated for low latency. | Trading halts during AWS incidents, the worst kind of outage to have during a market spike.
Notion | AWS. Heavy on RDS / Aurora. | Read-only mode kicks in first; if the database write tier is degraded, every keystroke fails.
Airbnb / DoorDash / Lyft | All AWS. Travel + delivery + rideshare routing run on EC2 + Lambda. | The visible failure is "I can't book / order / call a car"; the cause is one tier deeper.
Pinterest | Mostly AWS for serving + storage. Some workloads on GCP. | Image-loading degrades first; full app outage during severe AWS events.

Live status of every service in this table sits at /infra/aws. If you landed here during a real AWS outage, that's the page you actually want — it sorts down/degraded services to the top so you can see the cascade in real time.

How to read an AWS outage

The next AWS incident will happen. When it does, the same five questions answer most of "is my service affected?":

1. Is it us-east-1, or somewhere else?

Open health.aws.amazon.com. The Status Dashboard lists per-region per-service incidents. If the colored markers are clustered in us-east-1, you're seeing the canonical cascade. If they're in Tokyo or São Paulo, the consumer-services impact is much narrower — most US/EU services don't run primary infra there.

2. Is the dashboard itself slow to load?

It runs on AWS. During severe events, the status page itself updates with delays, sometimes of hours. AWS publishes a "Service Health Dashboard" RSS feed at the same URL, which is slightly faster to update because it bypasses the rendered page. The 2021 outage notoriously delayed the dashboard's own updates by ~90 minutes; it's a recurring pattern.
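If you'd rather read a feed than wait on a rendered page, here is a minimal sketch of pulling the newest entries out of an RSS document. It assumes a generic RSS 2.0 layout (item elements with title and pubDate); the sample feed, its titles, and the function name latest_items are made up for illustration and are not real AWS output.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed in generic RSS 2.0 shape. Titles and dates are invented.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example status feed</title>
    <item>
      <title>Example incident: investigating</title>
      <pubDate>Mon, 01 Jan 2024 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Example incident: resolved</title>
      <pubDate>Mon, 01 Jan 2024 12:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def latest_items(feed_xml, limit=5):
    """Return up to `limit` (published, title) tuples, newest first."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        pub = item.findtext("pubDate")
        if pub is None:
            continue  # skip entries we cannot order by time
        title = (item.findtext("title") or "").strip()
        items.append((parsedate_to_datetime(pub), title))
    items.sort(key=lambda pair: pair[0], reverse=True)
    return items[:limit]


for published, title in latest_items(SAMPLE_FEED):
    print(published.isoformat(), title)
```

In practice you would fetch the feed body over HTTP first and pass the text in; the parsing step stays the same as long as the feed follows the assumed layout.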

3. Which AWS service is the proximate cause?

The cascades have fingerprints:

4. Are all of those listed services down for me?

If yes, it's a region-wide event and you can stop testing your own connection. If only some, the issue is partial — different services use different AWS subsystems and some of those are still healthy. Our /infra/aws page sorts down-first, so the cascade map is right there.
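The all-or-some test above can be written down as a small sketch. The Status enum, the function names, and the choice to count "degraded" as affected are assumptions for illustration; this is not how /infra/aws is actually implemented.

```python
from enum import IntEnum


class Status(IntEnum):
    # Lower value sorts first, so DOWN services surface at the top.
    DOWN = 0
    DEGRADED = 1
    UP = 2


def classify_event(statuses):
    """Classify a {service: Status} snapshot as region-wide, partial, or healthy."""
    if not statuses:
        return "unknown"
    affected = [s for s in statuses.values() if s != Status.UP]
    if len(affected) == len(statuses):
        return "region-wide"  # every tracked service is affected
    if affected:
        return "partial"      # some subsystems are still healthy
    return "healthy"


def sort_down_first(statuses):
    """Order services DOWN, then DEGRADED, then UP; ties break by name."""
    return sorted(statuses.items(), key=lambda kv: (kv[1], kv[0]))


snapshot = {"Slack": Status.DOWN, "Netflix": Status.UP, "Reddit": Status.DEGRADED}
print(classify_event(snapshot))
print([name for name, _ in sort_down_first(snapshot)])
```

With this invented snapshot the event is classified as partial, and the sorted list puts Slack first, then Reddit, then Netflix.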

5. What can I do?

If you're an end user: nothing. AWS outages don't have a client-side fix. Local DNS flushing won't help; switching to mobile data won't help. Wait for AWS to recover, and ignore Twitter/X complaints that say "Reddit/Slack/etc. is down" because those are downstream of the actual problem and your tweet isn't speeding it up.

If you're an operator running on AWS: your runbook should already include "if us-east-1 is degraded, fail over to us-west-2 / eu-west-1." If it doesn't, the December 2021 retro is mandatory reading.
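As a rough sketch of that runbook line, the selection logic can be as small as "walk a preference list and take the first healthy region." The region order mirrors the example above; the health dictionary and the pick_region name are hypothetical, and real health signals would come from your own probes rather than a hand-built dict.

```python
# Preference order taken from the runbook example: primary first, then fallbacks.
FAILOVER_ORDER = ("us-east-1", "us-west-2", "eu-west-1")


def pick_region(health, order=FAILOVER_ORDER):
    """Return the first region in `order` that `health` marks healthy, else None.

    `health` maps region name -> bool. Regions missing from the map are
    treated as unhealthy, which is the conservative assumption.
    """
    for region in order:
        if health.get(region, False):
            return region
    return None


# Hypothetical probe results during a us-east-1 incident.
print(pick_region({"us-east-1": False, "us-west-2": True, "eu-west-1": True}))
```

The hard part of failover is everything this leaves out (data replication, DNS, capacity in the target region), so treat it as the decision step only.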

What's not on AWS (a partial list)

Knowing what isn't on AWS is sometimes more useful than knowing what is. Specifically these don't share AWS's blast radius:

Why this matters for your status checking

Most "is X down?" tools answer the wrong question. They tell you that Reddit specifically is down, and the user is left to figure out whether to wait, complain to Reddit, complain to their ISP, or restart their phone. The right question — when several big services are affected at once — is "is this a single upstream provider, and if so, which one?"

That's why /infra exists on isitdown.io. We tag every catalog service with the cloud / CDN it publicly runs on, so when a cascade is in progress you can see it as one event instead of 13 separate ones. Tagged conservatively from public statements only, because being wrong during a real outage is worse than being incomplete.
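To make the "one event instead of 13" idea concrete, here is a minimal sketch of grouping currently-down services by their tagged upstream. The catalog below is a tiny illustrative stand-in (the AWS tags match the table earlier in this post; "ExampleSite" and "OtherCloud" are invented), and shared_upstream is a hypothetical name, not our production code.

```python
from collections import defaultdict

# Illustrative catalog: service -> publicly stated upstream provider.
CATALOG = {
    "Reddit": "AWS",
    "Slack": "AWS",
    "Notion": "AWS",
    "ExampleSite": "OtherCloud",  # invented entry for contrast
}


def shared_upstream(down_services, catalog=CATALOG, min_services=2):
    """Group down services by provider; keep providers with enough hits.

    Services missing from the catalog are skipped rather than guessed at.
    """
    by_provider = defaultdict(list)
    for name in down_services:
        provider = catalog.get(name)
        if provider is not None:
            by_provider[provider].append(name)
    return {
        provider: sorted(names)
        for provider, names in by_provider.items()
        if len(names) >= min_services
    }


print(shared_upstream(["Slack", "Reddit", "ExampleSite", "UntaggedApp"]))
```

Skipping untagged services instead of inferring a provider is the same conservative choice described above: incomplete beats wrong during a live incident.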

FAQ

Is AWS more or less reliable than competitors?

It depends on what you measure. AWS has more major region-wide outages per year than GCP, partly because it has more regions and a bigger blast radius per region, and partly because us-east-1 is genuinely overloaded with control-plane responsibility. But AWS publishes detailed post-mortems and has the longest track record of any cloud, which makes its failures more visible.

Can I check if a specific site uses AWS without insider info?

Sometimes. Resolving the hostname (dig +short site.com) and checking the returned IPs against AWS's published ip-ranges.json will catch direct EC2 and ELB usage. Cloudflare-fronted sites hide their origin, so the answer often reads "we don't know", and that's correct, not a tool failure.
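A minimal sketch of that lookup, using only the standard library. It assumes the published file's layout of "prefixes" entries with an "ip_prefix" field and "ipv6_prefixes" entries with an "ipv6_prefix" field, each carrying "region" and "service"; the function names are our own, and the addresses in the example are documentation-reserved ranges, not real AWS prefixes.

```python
import ipaddress
import json
import socket
import urllib.request

IP_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def fetch_ranges(url=IP_RANGES_URL):
    """Download and parse the published AWS IP ranges (needs network access)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def resolve(hostname):
    """Rough equivalent of `dig +short`: the set of IPs the name resolves to."""
    return sorted({info[4][0] for info in socket.getaddrinfo(hostname, None)})


def aws_matches(ip, ranges):
    """Return sorted (region, service) pairs whose prefix contains `ip`."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        key, field = "prefixes", "ip_prefix"
    else:
        key, field = "ipv6_prefixes", "ipv6_prefix"
    hits = set()
    for entry in ranges.get(key, []):
        if addr in ipaddress.ip_network(entry[field]):
            hits.add((entry["region"], entry["service"]))
    return sorted(hits)


# Offline example with made-up entries in the same shape as the real file.
fake_ranges = {
    "prefixes": [
        {"ip_prefix": "192.0.2.0/24", "region": "us-east-1", "service": "EC2"}
    ],
    "ipv6_prefixes": [],
}
print(aws_matches("192.0.2.10", fake_ranges))
print(aws_matches("198.51.100.1", fake_ranges))
```

For a live check you would call fetch_ranges() once, then run aws_matches over each address from resolve("site.com"). An empty result means "no direct match", which is not the same as "not on AWS" when a CDN sits in front.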

Why don't you list every AWS-hosted service?

Because we'd be wrong about most of them. Public companies disclose their cloud provider in earnings calls, status pages, and conference talks; private startups usually don't. The 13 services we currently tag (Netflix, Reddit, Slack, Disney+, etc.) are all from public sources. We'd rather show 13 we're certain of than 200 half-guessed.
