
This was a really interesting paper, but there's a notable gap in what they tried: inference-time temperature changes based on the fork/lock distinction.

Maybe I'll try that myself, because it feels like it could be a great source of improvements. It would be really useful to see adaptive per-token sampling as an additional decode-only baseline.
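To make the idea concrete, here's a minimal sketch of what such a decode-only baseline could look like. Everything here is an assumption on my part, not something from the paper: fork/lock positions are approximated by the entropy of the next-token distribution, and the thresholds and temperatures are placeholder values.

```python
import numpy as np

def sample_adaptive(logits, fork_threshold=2.0, t_fork=1.2, t_lock=0.3, rng=None):
    """Sample one token with a per-position temperature.

    Hypothetical heuristic: treat high-entropy positions as "forks"
    (several plausible continuations, so sample more freely) and
    low-entropy positions as "locks" (one continuation dominates, so
    sample nearly greedily). Thresholds/temperatures are illustrative.
    """
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)

    # Entropy of the unscaled (temperature 1.0) distribution.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    entropy = -(probs * np.log(probs + 1e-12)).sum()

    # Pick the temperature based on the fork/lock guess, then resample.
    temp = t_fork if entropy > fork_threshold else t_lock
    scaled = np.exp((logits - logits.max()) / temp)
    scaled /= scaled.sum()
    return int(rng.choice(len(logits), p=scaled)), temp
```

A real implementation would presumably use the paper's fork/lock classifier rather than raw entropy, but entropy is a cheap stand-in that needs no extra model.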



Is this some kind of calibration, then? I'd expect the probabilities to adjust automatically during training, such that in "lock" mode, for example, syntax-breaking tokens have a very low probability and would not be picked even with a higher temperature.
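A quick numeric illustration of that intuition (the logit values are made up for the example): if the valid token's logit dominates by a wide margin, temperature scaling leaves the syntax-breaking tokens negligible even at temperatures above 1.

```python
import math

def softmax_with_temperature(logits, temp):
    """Standard temperature scaling: divide logits by temp, then softmax."""
    scaled = [l / temp for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

# A "lock"-like distribution: one syntactically valid token dominates
# two hypothetical syntax-breaking alternatives.
logits = [12.0, 2.0, 1.0]
for t in (0.7, 1.0, 1.5):
    p = softmax_with_temperature(logits, t)
    invalid_mass = p[1] + p[2]
    print(f"T={t}: invalid token mass = {invalid_mass:.2e}")
```

Even at T=1.5 the combined probability of the two invalid tokens stays well under 1%, which supports the point that a well-trained model shouldn't break syntax just because the temperature is raised moderately.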



