Anthropic prompt cache has a 5-minute TTL by default
Anthropic’s prompt cache stores the prompt prefix up to a block marked with cache_control, and the entry stays warm for five minutes of idle time before it is evicted; each cache hit resets that timer. If you sleep longer than five minutes between calls, or your loop polls slower than that, every wake-up is a cache miss: the next request reprocesses the full prefix at the uncached input price and writes the cache again.
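For reference, a minimal sketch of what marking a block looks like in a Messages API request body. The helper name `cached_system_blocks` is hypothetical; only the `cache_control` field and its `{"type": "ephemeral"}` value come from the API.

```python
def cached_system_blocks(static_prefix: str) -> list[dict]:
    """Build a `system` value whose single text block ends the cached prefix.

    Hypothetical helper for illustration: it only constructs the payload
    and does not call the API.
    """
    return [
        {
            "type": "text",
            "text": static_prefix,
            # Marks this block as a cache breakpoint (default TTL: 5 minutes).
            "cache_control": {"type": "ephemeral"},
        }
    ]


blocks = cached_system_blocks("Long, stable instructions go here.")
```

The result would be passed as the `system` argument of a Messages request. To check whether a given call hit the cache, compare the `cache_read_input_tokens` and `cache_creation_input_tokens` fields in the response's `usage`.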
This matters when picking polling intervals. The naive “wait five minutes” choice is the worst of both worlds: a 300s sleep lets the entry expire, so you pay the miss, but the wait is too short to amortize it. The two stable choices are:
- Under five minutes (60s–270s). Cache stays warm; right for tight feedback loops.
- Past five minutes (1200s+). You pay one cache miss per wake-up, but the much longer wait amortizes it over far fewer calls.
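The two bands above can be sketched as a small helper that snaps a desired interval out of the dead zone. This is an illustration under assumptions, not a prescribed policy: the function name, the 30-second safety margin, and the `prefer_warm` flag are all hypothetical, and the 270s and 1200s bounds are taken from the list above.

```python
CACHE_TTL_S = 300          # default cache TTL (five minutes)
SAFETY_MARGIN_S = 30       # assumption: headroom for request latency and jitter
WARM_MAX_S = CACHE_TTL_S - SAFETY_MARGIN_S  # 270, upper end of the warm band
LONG_WAIT_MIN_S = 1200     # lower end of the "amortized miss" band


def pick_sleep_seconds(desired_s: float, prefer_warm: bool = True) -> float:
    """Return desired_s unless it falls in the dead zone between the bands.

    Intervals in (WARM_MAX_S, LONG_WAIT_MIN_S) risk a cache miss without a
    wait long enough to amortize it, so they are snapped to the nearest
    band edge chosen by prefer_warm.
    """
    if desired_s <= WARM_MAX_S or desired_s >= LONG_WAIT_MIN_S:
        return desired_s
    return WARM_MAX_S if prefer_warm else LONG_WAIT_MIN_S


print(pick_sleep_seconds(120))                     # already in the warm band
print(pick_sleep_seconds(300))                     # snapped down to 270
print(pick_sleep_seconds(300, prefer_warm=False))  # snapped up to 1200
```

Snapping down suits loops that need fresh results; snapping up suits background checks where latency does not matter.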
Round numbers like 5m feel natural and are exactly wrong. Think in cache windows, not in minutes.