
In fact, I'm betting the opposite. Frontier models aren't getting that much better anymore, at least for common business needs, but the OSS models keep closing the gap. If those trajectories hold, there will probably be a near-future moment where big-provider costs drop sharply, once the first viable local models can consistently take over normal tasks on reasonable hardware. Right now the frontier providers are probably rushing to grab as much money as they possibly can before LLMs become a true commodity for the 80% of use cases outside the deep expert areas where they'll keep an edge as specialist juggernauts (e.g. a premium cybersecurity model).

So it's all a house of cards now, and the moment the bubble bursts is when local open inference has closed the gap. It looks like Chinese labs and smaller players are already going hard in this direction.




Local open inference can address hardware scarcity by repurposing hardware that users need anyway for other purposes. But since that hardware is a lot weaker than a proper datacenter setup, it will mostly be useful for running non-time-critical inference as a batch task.

Many users will also seek to go local as insurance against rug pulls on the proprietary side (it's not clear whether the third-party inference market will grow enough to provide robust competition). But ultimately, if you want to make good use of your hardware as a single user, you'll also be pushed towards mostly running long batch tasks rather than realtime chat (except with tiny models) or human-assisted coding.
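
To make the batch pattern concrete, here's a minimal sketch: queue up prompts in a file and drain them overnight against a local OpenAI-compatible server (llama.cpp's llama-server and Ollama both expose one). The endpoint, model name, and file layout here are assumptions for illustration, not a recommendation.

    # Minimal sketch: non-time-critical batch inference against a local
    # OpenAI-compatible endpoint. base_url, model, and the jsonl format
    # are placeholders; adjust to whatever your local server actually runs.
    import json
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

    with open("prompts.jsonl") as src, open("results.jsonl", "w") as dst:
        for line in src:
            task = json.loads(line)  # e.g. {"id": 1, "prompt": "Summarize ..."}
            resp = client.chat.completions.create(
                model="local-model",  # whatever the server has loaded
                messages=[{"role": "user", "content": task["prompt"]}],
            )
            dst.write(json.dumps({
                "id": task["id"],
                "output": resp.choices[0].message.content,
            }) + "\n")

Latency per request barely matters in this setup; what matters is total throughput over the night, which is exactly where weaker local hardware can still pull its weight.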




