Anthropic prompt cache has a 5-minute TTL by default

Anthropic’s prompt cache writes a cache_control block and the cache stays warm for five minutes of idle time before the entry is evicted. If you sleep longer than five minutes between calls — or your loop polls slower than that — every wake-up is a cache miss and you pay the full re-read cost on the next request.

This matters when picking polling intervals. The naive “wait five minutes” choice is the worst of both: you pay the miss without amortizing it. The two stable choices are:

  • Under five minutes (60s–270s). Cache stays warm; right for tight feedback loops.
  • Past five minutes (1200s+). Pay one cache miss, but a much longer wait amortizes it.

Round numbers like 5m feel natural and are exactly wrong. Think in cache windows, not in minutes.