Caching (Why Some Requests Never Reach Your Servers at All)

A first-principles explanation of caching, why it works, where it lives, and why it quietly powers fast systems.

A Strange but Familiar Experience

You open a website.

The first load is slow.
You refresh.

Suddenly, it’s instant.

Nothing changed. No new server was added. No code was deployed.

So why did it become fast?

Because the second time,
the system didn’t do the work at all.

It remembered.

The Core Idea (Without Jargon)

Caching is simple:

If the answer is already known, don’t recompute it.

Instead of:

  • recalculating
  • hitting databases
  • calling downstream services

The system responds immediately.

Caching is not about speed alone.
It’s about avoiding unnecessary work.
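The idea above can be sketched in a few lines of Python using `functools.lru_cache` for memoization. The `directions` function and its return value are hypothetical stand-ins for any expensive computation:

```python
import functools

CALLS = {"count": 0}

@functools.lru_cache(maxsize=None)
def directions(origin, destination):
    # Stands in for real work: a database query, a downstream call, a heavy computation.
    CALLS["count"] += 1
    return f"route from {origin} to {destination}"

directions("home", "office")   # first call: the work actually happens
directions("home", "office")   # second call: the answer is simply remembered
```

After both calls, `CALLS["count"]` is still 1: the system answered the second request without recomputing anything.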

A Simple Story: Asking for Directions

You ask someone for directions.

They explain it carefully.

Five minutes later, you ask again.

They don’t rethink the route. They just repeat the answer.

That repetition is caching.

Where Caching Lives (Not Just One Place)

Caching isn’t a single component.
It’s a behavior that appears at multiple layers.

  • Browser cache
    Images, scripts, pages, API responses

  • CDN cache
    Content served close to users

  • Reverse proxy cache
    Frequently requested responses

  • Application cache
    Computed results kept in memory

  • Database cache
    Query results, indexes, buffers

Most requests are answered
before they reach your core logic.

And that’s intentional.

What a Cached Flow Looks Like

```mermaid
flowchart LR
    User --> Cache
    Cache -->|Miss| Server
    Server --> Cache
    Cache -->|Hit| User
```
  • Cache miss → real work happens
  • Cache hit → instant response

Good systems aim for more hits than misses.
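The hit/miss flow above is the classic cache-aside pattern. A minimal sketch, with `fetch_from_server` standing in for the real work:

```python
cache = {}
stats = {"hits": 0, "misses": 0}

def fetch_from_server(key):
    # Placeholder for the expensive path: database, downstream service, computation.
    return f"value-for-{key}"

def get(key):
    if key in cache:               # Hit: answer instantly, no real work
        stats["hits"] += 1
        return cache[key]
    stats["misses"] += 1           # Miss: do the work, then remember the answer
    value = fetch_from_server(key)
    cache[key] = value
    return value

get("profile:42")   # miss: real work happens
get("profile:42")   # hit
get("profile:42")   # hit
```

Three requests, one unit of real work. The hit ratio (here 2/3) is the number a well-tuned cache tries to push toward 1.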

Why Caching Changes Everything

Without caching:

  • servers work for every request
  • databases get hammered
  • latency stacks up

With caching:

  • systems feel fast
  • load drops dramatically
  • failures hurt less

Caching doesn’t just improve performance.
It buys breathing room.

⚠️ Common Trap

Trap: Treating caching as a free performance win.

Caching introduces:

  • stale data
  • consistency problems
  • invalidation complexity

This leads to the classic saying:

“There are only two hard things in computer science:
cache invalidation and naming things.”

Caching shifts complexity — it doesn’t remove it.
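Staleness is easy to see in a sketch. Below is a hypothetical TTL cache with an explicit clock (times passed in rather than read from the system, so the behavior is deterministic). Notice that within the TTL window the cache happily serves a value the source has already changed:

```python
class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, stored_at)

    def put(self, key, value, now):
        self.store[key] = (value, now)

    def get(self, key, now):
        if key in self.store:
            value, stored_at = self.store[key]
            if now - stored_at < self.ttl:
                return value       # served from cache: may be stale
            del self.store[key]    # expired: caller must refetch
        return None

cache = TTLCache(ttl_seconds=60)
cache.put("price", 100, now=0)
# The source of truth updates the price to 120, but the cache doesn't know.
stale = cache.get("price", now=30)    # still returns 100: stale data
fresh = cache.get("price", now=90)    # expired: None, forcing a refresh
```

The TTL didn’t remove the consistency problem. It only bounded how long the wrong answer can live.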

A Real Failure You’ve Seen

Many large outages weren’t caused by traffic spikes.

They were caused by:

  • bad cache keys
  • missing invalidation
  • stale data being served globally

Users didn’t see errors. They saw wrong information.

Caching failures are subtle — and dangerous.
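A bad cache key is the simplest version of this failure. In the hypothetical sketch below, the buggy version omits the user’s identity from the key, so the second user silently receives the first user’s data, with no error anywhere:

```python
cache = {}

def load_profile(user_id):
    # Stands in for the real lookup.
    return {"user": user_id, "name": f"user-{user_id}"}

def get_profile_bad(user_id):
    key = "profile"                 # Bug: the key ignores user_id entirely
    if key not in cache:
        cache[key] = load_profile(user_id)
    return cache[key]

def get_profile_good(user_id):
    key = f"profile:{user_id}"      # Fix: identity is part of the key
    if key not in cache:
        cache[key] = load_profile(user_id)
    return cache[key]

a = get_profile_bad(1)   # caches user 1's profile under the shared key
b = get_profile_bad(2)   # user 2 gets user 1's profile back
```

Both calls succeed. Nothing crashes. That is exactly why these failures are dangerous.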

How This Connects to What We’ve Learned

Caching works because the system chooses not to work.

🧪 Mini Exercise

Pick an API or page you know.

  1. What parts of the response are safe to cache?
  2. What parts must always be fresh?
  3. What happens if cached data is wrong?

If you can’t answer these, caching will eventually hurt you.

What’s Coming Next

Caching introduces a dangerous question:

What happens when cached data becomes wrong?

Next: Cache Invalidation
Why making things fast is easy — keeping them correct is hard.

This post is licensed under CC BY 4.0 by the author.