That is quite a bold assumption in its own right (and very far from the truth). Programming itself has not changed, even though modern hardware is, of course, not a von Neumann machine. Writing a lock-free data structure is not a fundamentally different kind of programming: it requires a lot more attention and (possibly) experience, but the basic premise is still the same.
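To make the "same premise" point a bit more concrete, here is a minimal sketch of a lock-free update in C++: a plain retry loop around compare-and-swap that raises a shared maximum. The names `atomic_store_max` and `max_across_threads` are made up for illustration, and I am assuming relaxed ordering is enough because the single atomic is the only shared state.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: atomically raise `target` to at least `candidate`.
// No lock is taken; a failed compare-and-swap simply retries.
void atomic_store_max(std::atomic<int>& target, int candidate) {
    int observed = target.load(std::memory_order_relaxed);
    // On failure compare_exchange_weak refreshes `observed` with the current
    // value, so the loop re-checks whether an update is still needed.
    while (observed < candidate &&
           !target.compare_exchange_weak(observed, candidate,
                                         std::memory_order_relaxed)) {
    }
}

// Hypothetical driver: every thread publishes its own range of values and
// the function returns the largest one seen (-1 if nothing was published).
int max_across_threads(int thread_count, int values_per_thread) {
    assert(thread_count >= 0 && values_per_thread >= 0);
    std::atomic<int> shared_max{-1};
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&shared_max, t, values_per_thread] {
            for (int i = 0; i < values_per_thread; ++i) {
                atomic_store_max(shared_max, t * values_per_thread + i);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return shared_max.load();
}
```

Calling `max_across_threads(4, 1000)` should return 3999 however the threads interleave. The loop is ordinary code; the extra attention goes into reasoning about what another thread may have done between the load and the compare-and-swap.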
Understanding memory topology/hierarchy and latency, concurrency, branch (mis)prediction, and cache coherency should be the minimum for anyone who comments on CPU architecture. I did mention assembly, and without some knowledge of the target architecture it is rather pointless to comment on that, either.
I encourage most developers to at least understand that memory is not actually 'random access', which means a dereference is not cheap, whereas accessing data placed together is next to free because it is likely to hit L1.
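A rough sketch of the two access patterns, in C++. The `Node` type and the function names are hypothetical, and how big the gap is depends entirely on the machine and on where the allocator happens to place the nodes, so I am not quoting any timings here.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical list node: every node is its own heap allocation, so
// following `next` is a dependent load that may miss the cache.
struct Node {
    std::int64_t value;
    Node* next;
};

// Contiguous traversal: neighbouring elements share cache lines and the
// hardware prefetcher can stream them in.
std::int64_t sum_contiguous(const std::vector<std::int64_t>& data) {
    std::int64_t total = 0;
    for (std::int64_t v : data) {
        total += v;
    }
    return total;
}

// Pointer chasing: the address of the next element is unknown until the
// current node has been loaded.
std::int64_t sum_linked(const Node* head) {
    std::int64_t total = 0;
    for (const Node* n = head; n != nullptr; n = n->next) {
        total += n->value;
    }
    return total;
}

// Builds a list holding the same values in the same order. `storage` owns
// the nodes; the returned pointer is the head (nullptr for empty input).
Node* build_list(const std::vector<std::int64_t>& data,
                 std::vector<std::unique_ptr<Node>>& storage) {
    assert(storage.empty());
    Node* head = nullptr;
    for (auto it = data.rbegin(); it != data.rend(); ++it) {
        storage.push_back(std::make_unique<Node>(Node{*it, head}));
        head = storage.back().get();
    }
    return head;
}
```

Both functions compute the same sum. Timing them over a few million elements, ideally with the nodes allocated in shuffled order, is a simple way to see the difference on your own hardware.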
> discourage assembly and writing for specific CPU architecture
Around the K6-2 years I found that I could not reliably beat a standard compiler when writing everyday assembly. Still, some inner loops can be carefully hand-optimized. The point is that plenty of programmers would be able to understand modern architecture, and to me a basic understanding is needed unless the job is just gluing code together.