A lot of inference providers for open models only accept prepaid payments, and managing several of those accounts is cumbersome. I could limit myself to a smaller set of providers, but then I'm probably overpaying by more than the 5.5% fee.
If you're only using flagship model providers, then OpenRouter's value add is a lot more limited.
I haven't noticed any problems with large-context requests through OpenRouter to e.g. Opus (other than the rate at which my budget gets spent!). Is this a performance thing?