TensorRT-LLM runtime now open-source (github.com/nvidia)
4 points by mmoskal on March 11, 2025 | 1 comment


Previously, the "Executor" runtime was shipped only as binary blobs. This is the bit that schedules requests and manages the KV cache (similar in role to the vLLM or SGLang server).



