2.1.108
Added ENABLE_PROMPT_CACHING_1H env var to opt into 1-hour prompt cache TTL on API key, Bedrock, Vertex, and Foundry (ENABLE_PROMPT_CACHING_1H_BEDROCK is deprecated but still honored),
and FORCE_PROMPT_CACHING_5M to force 5-minute TTL
^ The docs are not updated yet; this is taken directly from the changelog.
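Going by that changelog entry, opting in might look like the sketch below. The variable names come from the changelog; the value `1` is an assumption, since the entry does not say which value the variables expect.

```shell
# Opt into the 1-hour prompt cache TTL (API key, Bedrock, Vertex, Foundry).
# ASSUMPTION: "1" is accepted as the enabling value; the changelog does not specify it.
export ENABLE_PROMPT_CACHING_1H=1

# Alternative: force the 5-minute TTL instead (same assumption about the value).
# export FORCE_PROMPT_CACHING_5M=1
```

The changelog does not say which variable wins if both are set, so it is probably safest to set only one of them.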
And this is "a bit better", but seemingly still nowhere close to what subscribers get, where the main thread, agents, initial messages, and follow-up messages may each get their own (intelligent?) 5-minute or 1-hour decision :/