Complaining on HN about issues with a beta OS release, instead of filing them on an actual bug tracker, is funny. On my Android 16 the site works perfectly fine, in multiple browsers.
But if there aren't enough returns soon, the money will eventually dry up for OAI and Anthropic, and Google will not be trusted with its cash balance.
It's amazing how people here think that money is a play-thing and this dance can go on forever. It can't and won't, and the fear-induced marketing doesn't work forever either.
This is a false equivalence. Models get better with more & better data.
Both more data and better data are very expensive. Procuring... Handling... All of the above...
You can spend bottomless piles of cash and still not get there if you don't do the right things. I can count on one hand the number of times I've seen business/investor incentives line up with R&D incentives.
There's no guarantee that there is enough or good-enough data, regardless of how much money you have.
Agreed. The confidence people have to predict what these tools will be capable of two years down the line, when it's barely been over a year since Claude Code was first released, is astounding.
Mhh... my hunch is that part of this is that all Python keywords are 1 token, I assume, while for those very weird languages the tokenizer may split keywords into pieces that are harder to reason over.
Would love to see how the benchmark results change if the esoteric languages are tweaked so that their keywords are 1 token each.
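The hunch can be illustrated with a toy greedy longest-match tokenizer (not a real BPE tokenizer, and the vocabulary here is entirely made up): a keyword that exists in the vocabulary costs 1 token, while an unfamiliar keyword fragments into several pieces the model then has to stitch back together.

```python
# Hypothetical vocabulary: common Python keywords are single tokens,
# but an esoteric language's keyword ("otherwise") is not in it.
VOCAB = {"def", "return", "for", "in", "if", "else", "lambda",
         "ot", "her", "wi", "se", " ", "(", ")", ":"}

def tokenize(text):
    """Greedy longest-match against VOCAB, falling back to single characters."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: emit it alone
            i += 1
    return tokens

print(tokenize("if"))         # 1 token
print(tokenize("otherwise"))  # fragments into 4 tokens
```

Real tokenizers (BPE and friends) behave differently in detail, but the cost asymmetry between in-vocabulary and out-of-vocabulary keywords is the same idea.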
Reasoning is hard; reasoning about colors while wearing glasses that obfuscate the real colors is even harder... but that's not the core issue if your brain isn't wired correctly to reason in the first place.
I suspect the way out of this is to separate knowledge from reason: to train reasoning with zero knowledge and zero language... and then to train language on top of a pre-trained-for-reasoning model.
LLMs already use mixture-of-experts models; if you ensure the neurons are all glued together, then (I think) you train language and reasoning simultaneously.
It's always made as much sense to me as being up or down money in Monopoly, or points in basketball. Stating the W/L value of a position feels like a weird mixing of the present and future to me. Of course the centipawn value holds an implicit prediction of the future, but the indirection makes it more palatable.
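That implicit prediction can be made explicit: a common approach maps a centipawn evaluation onto an expected win percentage with a logistic curve. The constant below is the empirical fit Lichess publishes for its accuracy metric; treat it as illustrative rather than authoritative.

```python
import math

def win_percent(centipawns: float) -> float:
    """Expected win % for the side to move, from a centipawn evaluation.

    Logistic mapping with an empirically fitted constant (Lichess's fit);
    other engines/sites use slightly different curves.
    """
    return 50 + 50 * (2 / (1 + math.exp(-0.00368208 * centipawns)) - 1)

print(win_percent(0))     # dead equal -> 50.0
print(win_percent(100))   # up a pawn -> roughly 59
print(win_percent(-300))  # down a piece -> roughly 25
```

The curve flattens at the extremes, which matches intuition: going from +1 to +2 pawns matters a lot more than going from +8 to +9.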
I learned chess when I was 5, and didn't have a chess computer for the first five years or so, by which point I had progressed quite far... so I can't really tell.
Makes sense. I started learning how to play Chess when I was ~30 and my tutors were just chess engines, game reviews on chess.com and whatever books I found interesting enough to get through. I have fun, and that's all I'll ever have, no titles or anything. The centipawn stuff makes sense now, but it took a while.
Interesting. Spotify works almost perfectly for my discovery needs. I just pick a track I know that fits my mood, then use the (3-dot menu) "Go to Radio" option, which leads to a playlist that usually includes tracks and/or artists new to me. It's been a reliable discovery mechanism for me for many years. Also, there's a new feature I first saw within the last week, a "non-personalized" option that increases the "new to me" ratio.
The "you might also like" for a given artist is usually the most generic related artists. For anything remotely related you'll get basically the same list: the middle of the Venn diagram of everyone who listens to them.
I always find this interesting… Spotify is phenomenal for me - about every third Monday Discovery playlist has two or three hits, which feels like a pretty solid ratio, at this point. YouTube has never suggested a single thing I cared for.
I wonder if it’s a curation thing? I’ve been with Spotify since the first day it was available, and rarely use YouTube. I haven’t had a hit ratio as good since newsgroups and (real) forums a decade ago, which were a different form of curation.
You seem knowledgeable about this... care to test my old project for music recommendation? I built it by looking at co-occurrence of artists in Spotify playlists, which gives me word2vec-style vectors, and then it's just kNN.
No login needed, just enter some artist names and see what you get:
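A minimal sketch of the pipeline described above, with made-up playlists: artists that co-occur in playlists get similar vectors, and recommendation is just cosine-similarity kNN over those vectors. (The actual project uses word2vec-style embeddings; raw co-occurrence counts stand in for them here, and all artist data is invented.)

```python
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical playlist data standing in for the real Spotify corpus.
playlists = [
    ["Radiohead", "Portishead", "Massive Attack"],
    ["Radiohead", "Portishead", "Björk"],
    ["Massive Attack", "Portishead", "Tricky"],
    ["Metallica", "Slayer", "Megadeth"],
]

# Co-occurrence vectors: vec[a][b] = number of playlists a and b share.
vec = defaultdict(lambda: defaultdict(int))
for pl in playlists:
    for a, b in combinations(set(pl), 2):
        vec[a][b] += 1
        vec[b][a] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(artist, k=3):
    """Return the k nearest neighbours of `artist` by cosine similarity."""
    u = vec[artist]
    scores = [(other, cosine(u, vec[other]))
              for other in vec if other != artist]
    return [a for a, s in sorted(scores, key=lambda t: -t[1])[:k]]

print(recommend("Radiohead"))  # trip-hop neighbours, no metal
```

Even this crude version shows the second-order effect that makes the approach interesting: Tricky never shares a playlist with Radiohead, yet still surfaces because their co-listener profiles overlap.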
Very interesting, I've been working on a similar project (using word2vec to learn vectors using playlist data), but using songs instead of artists as the 'words'.
The main bottleneck at this point is the volume of data: many songs I'm interested in are only represented in a handful of playlists. Evaluation at any useful scale is also quite difficult. For somewhat obvious reasons, in our AI era Spotify has become quite skittish about letting third parties gain access to their data at scale...
This is pretty neat. It shows good relationships, especially in the edge cases where an artist has a very unique sound that other artists don't mimic, but where people who like that artist will typically like certain others.
Would be very cool if it supported smaller artists than it currently does, because imo that's how you start surfacing emerging talent.
As in, it doesn't work at all and just shows an error on load.