
It seems a lot of the problem isn't "token shrinkage" (reducing plan limits), but rather changes they made to prompt caching: things that used to be cached for 1 hour are now only cached for 5 minutes.

Coding agents rely on prompt caching to avoid burning through tokens - they go to lengths to keep the context/prompt prefix constant (placing unchanging content like tool definitions and file contents first, and variable content like new instructions after it) so that prompt caching gets used.
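As a rough sketch of what "stable prefix first" looks like in practice (loosely modeled on Anthropic-style `cache_control` markers; the field names and `build_request` helper here are illustrative, not any agent's actual code):

```python
# Hypothetical cache-friendly request layout: immutable content first,
# marked cacheable; per-turn content last so it never breaks the prefix.
def build_request(tool_defs, file_context, new_instruction):
    return {
        "system": [
            # Stable across every turn: tool definitions, then file contents.
            {"type": "text", "text": tool_defs,
             "cache_control": {"type": "ephemeral"}},
            {"type": "text", "text": file_context,
             "cache_control": {"type": "ephemeral"}},
        ],
        # Only this part changes from turn to turn.
        "messages": [{"role": "user", "content": new_instruction}],
    }

req1 = build_request("TOOLS...", "FILES...", "fix the bug")
req2 = build_request("TOOLS...", "FILES...", "now add a test")
# The cacheable prefix is byte-identical across turns, so it can be reused:
assert req1["system"] == req2["system"]
assert req1["messages"] != req2["messages"]
```

The point is purely about ordering: anything that varies per turn must come after everything that doesn't, or the provider sees a different prefix every request and the cache never hits.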

This change to a new tokenizer that generates up to 35% more tokens for the same text input is wild - it's going to really increase token usage for large text inputs like code.
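Back-of-envelope arithmetic makes the impact concrete (the plan limit and per-request sizes below are made-up illustrative numbers; only the 35% figure comes from the thread):

```python
# Effect of a tokenizer emitting up to 35% more tokens for the same text.
plan_limit = 5_000_000            # hypothetical token allowance
old_tokens_per_req = 100_000      # e.g. a prompt stuffed with code files
new_tokens_per_req = int(old_tokens_per_req * 1.35)   # 135,000

print(plan_limit // old_tokens_per_req)   # 50 requests before
print(plan_limit // new_tokens_per_req)   # 37 requests after
```

So the same allowance buys roughly a quarter fewer large-context requests, even though nothing about the prompts themselves changed.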




> things that used to be cached for 1 hour now only being cached for 5 min.

Doesn't this only apply to subagents, which don't have much long-lived context anyway?


AFAIK caching works at the API-key level, which is shared across the main/parent agent and all subagents.

Note that the model API is stateless - there is no connection held open for the lifetime of any agent/subagent, so the model has no idea how long any client-side entity runs for. All the model sees over time is a bunch of requests (coming from a mixture of the parent and subagents), all using the same API key, and therefore eligible to use any of the cached prompt prefixes being maintained for that API key.

Things like subagent tool registration remain the same across all invocations of a subagent, so those would come from cache - as long as the cache TTL is long enough.
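The interaction between a shared API key and the TTL can be sketched with a toy provider-side cache (entirely hypothetical class and field names; real providers key and expire caches their own way):

```python
# Minimal sketch: prefixes cached per API key, expiring after a TTL.
# Any request under the same key can hit a prefix another agent stored.
class PrefixCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.entries = {}  # (api_key, prefix) -> last-touch timestamp

    def request(self, api_key, prefix, suffix, now):
        # `suffix` is the per-turn content; only `prefix` is cacheable.
        key = (api_key, prefix)
        touched = self.entries.get(key)
        hit = touched is not None and now - touched <= self.ttl
        self.entries[key] = now  # store or refresh the prefix
        return hit

cache = PrefixCache(ttl_seconds=300)  # the new 5-minute TTL
assert not cache.request("key1", "TOOLS|FILES", "task A", now=0)    # cold start
assert cache.request("key1", "TOOLS|FILES", "task B", now=60)       # subagent, same key: hit
assert not cache.request("key1", "TOOLS|FILES", "task C", now=460)  # 400s idle: expired
```

With the old 1-hour TTL that last request would still have been a hit; with 5 minutes, any agent that pauses longer than that pays to rebuild the prefix from scratch.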



