Hacker News | tandr's comments

by the looks of it... any front collision == instant death?

I know nothing about automobile design, but the Smart Fortwo [1] seemed to solve this problem just fine (IIRC they had a very good NCAP safety rating).

https://en.wikipedia.org/wiki/Smart_Fortwo


Always a good time to share this video re: crashing a Smart Fortwo: https://youtube.com/watch?v=mnI-LiKCtuE

> IIRC they had a very good NCAP safety rating

3 out of 5, which I think merely qualifies it as "average"


It was very much designed for collisions. They have an engineer discussing those aspects in this video.

https://m.youtube.com/watch?v=Tv5QwgQUMGY


Thank you! That was quite educational.

It's an EV, so what little nose it has is probably all crumple zone (as opposed to having a big ol' engine in the way). Popping the hood on most EVs is pretty funny, actually, because of how little there is under there.

What would be these additional vllm flags, if you don't mind sharing?

This is from an example from my Nomad cluster with two A5000s, which is a bit different from what I have at work, but it should mostly apply to any modern 24 GB VRAM NVIDIA GPU.

"--tensor-parallel-size", "2" - spread the LLM weights over the 2 available GPUs.

"--max-model-len", "90000" - I've capped the context window from ~256k to 90k. It allows us to have more concurrency, and for our use cases it is enough.

"--kv-cache-dtype", "fp8_e4m3" - On an L4 this cuts the KV cache size in half without a noticeable drop in quality. It does not work on the A5000, which has no native FP8 support. Use "auto" to see what works for your GPU, or try "tq3" once the vllm people merge it into the nightly.

"--enable-prefix-caching" - Improves time to first token.

"--speculative-config", "{\"method\":\"qwen3_next_mtp\",\"num_speculative_tokens\":2}" - Speculative multi-token prediction, a Qwen3.5-specific feature. In some cases it provides a speedup of up to 40%.

"--language-model-only" - Does not load the vision encoder, since we are using just the LLM part of the model. Frees up some VRAM.
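Assembled into one launch command, the flags above would look roughly like this (the model name and port are placeholders I've made up for illustration, and flag availability depends on your vllm version):

```shell
# Hypothetical invocation combining the flags discussed above.
# "Qwen/some-model" and the port are placeholders, not real values.
vllm serve Qwen/some-model \
  --tensor-parallel-size 2 \
  --max-model-len 90000 \
  --kv-cache-dtype fp8_e4m3 \
  --enable-prefix-caching \
  --speculative-config '{"method":"qwen3_next_mtp","num_speculative_tokens":2}' \
  --port 8000
```

In a Nomad job spec these would be the same strings passed as the task's args list, which is why they appear quoted and comma-separated above.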


> "--speculative-config",

Regarding that last option: speculation helps max concurrency when it replaces many memory-expensive serial decode rounds with fewer verifier rounds and the proposer is cheap enough. It hurts when you are already compute-saturated or the acceptance rate is too low. It's a good idea to benchmark your workload with and without speculative decoding.
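A back-of-the-envelope way to see the acceptance-rate sensitivity (this is the standard speculative-decoding estimate, not vllm's actual scheduler math; `alpha` is the per-token acceptance rate, assumed i.i.d.):

```python
def expected_tokens_per_step(alpha: float, k: int) -> float:
    """Expected tokens emitted per verifier forward pass when a draft
    proposes k tokens and each is accepted i.i.d. with probability
    alpha (requires alpha < 1)."""
    return (1 - alpha ** (k + 1)) / (1 - alpha)

# With num_speculative_tokens=2, as in the config above:
high = expected_tokens_per_step(0.8, 2)  # high acceptance: ~2.44 tokens/step
low = expected_tokens_per_step(0.3, 2)   # low acceptance:  ~1.39 tokens/step
```

In the memory-bound regime, where a verifier pass with two extra tokens costs about the same as a plain decode step, ~2.4 tokens per step is a real speedup; at low acceptance you pay the draft cost for ~1.4 tokens, which can be a net loss once you are compute-bound.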


Thank you!

I think they are talking about the AWS dashboard, but I might be wrong.


I understand that the project is in its early stages and the documentation (README) is very much WIP. So, a couple of questions:

    1. "SSD-backed" - is it on-disk 100% of the time (like KVRocks), or does it require a "save" like Redis does?
    2. (if it is like KVRocks) Do you have any perf numbers for DB sizes that are, say, 10x the RAM size?


A simpler benchmark table would be great. May I suggest Ollama on the base machine, Ollama with T1, Ollama with T1+T2, etc., on midsize and big models, to compare tokens/sec?


Niklaus Wirth died in 2024, but I hope he got to enjoy a major I-told-you-so moment about people dismissing Pascal's bounds checking as unneeded and slow.


My CS college used Turbo Pascal as a teaching language. I had a professor who told us, "don't turn the range and overflow checking off, even when compiling for production". That turned out to be very wise advice, IMHO. Too bad C and C++ compiler/language designers never got that message. So much was wasted to gain that less-than-1% performance.


To this day, FPC uses less RAM than any C compiler, a good thing in today's increasingly RAM-starved world, and they've managed this with far fewer developers than any equivalent C compiler has. I can't even imagine what it would look like if they had the same number of people working on it. C optimization tricks are hacks; the fact that Godbolt exists is proof that C is not meant to be optimizable at all. It is brute-force witchcraft.

At a certain point, though, something's gotta give. The compiler can do guesswork, but it should do no more; if you have to add more metadata, then so be it. It's certainly less tedious than putting pragmas and _____ everywhere; some C code just looks like the writings of an insane person.


> […] C optimization tricks are hacks, the fact godbolt exists is proof that C is not meant to be optimizable at all, it is brute force witchcraft.

> At a certain point though, something's gotta give, the compiler can do guesswork, but it should do no more, if you have to add more metadata then so be it it's certainly less tedious than putting pragmas and _____ everywhere, some C code just looks like the writings of an insane person.

There is not even a single correct or factual statement in the cited strings of words.

C optimisation is not «hacks» or «witchcraft»; it is built on decades of academic work and formal program analysis: optimisers use data-flow analysis over lattices and fixed points (abstract interpretation) and disciplined intermediate representations such as SSA, and there is academic work on proving that these transformations preserve semantics.

Modern C is also deliberately designed to permit optimisation under the as-if rule, with UB (undefined behaviour) and aliasing rules providing semantic latitude for aggressive transformations. The flip side is non-negotiable: compilers can't «guess» facts they can't prove, and many of the most valuable optimisations require guarantees about aliasing, alignment, loop independence, value ranges, and absence of UB that are often not derivable from arbitrary pointer-heavy C, especially under separate compilation.

That is why constructs such as «restrict», attributes and pragmas exist: they are not insanity, they are explicit semantic promises or cost-model steering that supply information the compiler otherwise must conservatively assume away.
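As a minimal illustration of the kind of promise «restrict» makes (a sketch of mine, not code from the thread):

```c
#include <stddef.h>

/* 'restrict' is an explicit promise that dst and src never alias, so
 * the compiler may vectorize this loop without emitting runtime
 * overlap checks. Without the qualifier it must conservatively assume
 * that a store to dst[i] could change a later src[j]. */
void scale(size_t n, float *restrict dst, const float *restrict src, float k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}
```

Calling it with overlapping arrays would break the promise and make the behaviour undefined; that is exactly the "trust it or verify it" trade-off described below.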

«Metadata instead» is the same trade-off in a different wrapper: you either trust it (changing the contract) or verify it (reintroducing the hard analysis problem).

Godbolt exists because these optimisations are systematic and comparable, not because optimisation is impossible.

Also, directives are not a new, C-specific embarrassment: ALGOL-68 had «pragmats» (the direct ancestor of today's «pragma» terminology), and PL/I had longstanding in-source compiler control directives, so this mechanism predates modern C tooling by decades.


There's a blog post from Google about this topic as well, where they found that inserting bounds checking into standard-library functions (in this case, C++) had a mere 0.3% negative performance impact on their services: https://security.googleblog.com/2024/11/retrofitting-spatial...

For people using Clang you can read more about libc++ hardening at https://libcxx.llvm.org/Hardening.html
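A tiny sketch of the behaviour being traded off (my own generic example, not code from the linked posts; `checked_get` is a hypothetical helper): `at()` is always bounds-checked and throws, while plain `operator[]` out of range is undefined behaviour, which is roughly the gap the hardening modes close:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Bounds-checked element access: at() throws std::out_of_range instead
// of invoking undefined behaviour the way an out-of-range v[i] would.
int checked_get(const std::vector<int>& v, std::size_t i)
{
    return v.at(i);
}
```

With libc++ hardening enabled, unchecked `v[i]` itself traps on an out-of-range index rather than silently corrupting memory, which is where the measured ~0.3% cost comes from.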


Bounded strings turned out to be a fairly good idea as well.


It works, but try as I might, I cannot fully explain the first 3 symbols. /*?sr/bin/env finds /usr by expanding *? to the first matching directory. But why not just /*usr/ instead?


I think I was just trying to minimize the chance of accidentally matching the wrong thing. Both do work, though, and it is kinda nice to be more readable.

If I remember right, I think ? matches exactly one character only, or maybe it does non-greedy matching.
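For reference, in shell globs `*` matches any string, including the empty one, and `?` matches exactly one character, so `/*?sr` matches `/usr` with `*` empty and `?` = `u`. A quick sketch using `case`, which applies the same pattern rules as pathname expansion minus the filesystem lookup (`match` is a throwaway helper of mine):

```shell
# '*' matches any (possibly empty) string; '?' matches exactly one character.
match() {
  case "$1" in
    /*?sr) echo "matches" ;;
    *)     echo "no match" ;;
  esac
}
match /usr    # '*' = '', '?' = 'u' -> matches
match /sr     # '?' needs one character before 'sr' -> no match
```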


Index Size

    Biscuit 277.09 MB
    Trigram 86 MB
    B-Tree  43 MB
Pretty much, you exchange space for speed.


Spotify just (a week or two ago) introduced lossless audio (FLAC), and it sounds amazing.


Wow didn't know about that, thanks.


Last decades? *wipes a tear* You surely forgot the /s at the end, I hope. The evil incarnation called a "Samsung fridge" that I have in my kitchen required a repairman's attention just 3 months after purchase, and then every 3 months after that. And child sacrifices (sorry, steam baths) for the ice maker every month or so.

Samsung appliances - never again.

PS. The repairman told me that Samsung had already fixed one of the problems my fridge has by the time he looked at it, a kind of hidden recall and fix. The fridge's version (yes, they have versions) had advanced something like 7 iterations from the time I bought it. That means there were at least 7 serious design/manufacturing problems they had to fix.


I mean... that's based on the assumption that they actually care about delivering a working appliance. As long as the spyware works, they don't really care about the "cooling food" part.

