
> Here’s a thought experiment: suppose that a mathematician solved a major problem by having a long exchange with an LLM in which the mathematician played a useful guiding role but the LLM did all the technical work and had the main ideas. Would we regard that as a major achievement of the mathematician? I don’t think we would.

This is a cultural choice. It makes sense that this feels alien in the mathematics culture we currently have. But other fields, and many individuals, would already disagree and say that the human did have a major achievement here. As long as human-AI collaborations are producing the best results, there is meaningful contribution by the humans, and people who are deeper experts and skilled LLM whisperers should be able to make outsized contributions. The other shoe drops when pure AI beats both humans and human-AI collaboration.


I replied to a comment about AI in sports, and I'm building on that here.

We praise car drivers even though most of the performance in their sport comes from the car. The driver makes the difference when two cars are close in performance, through brilliance or mistakes. The same goes for horse riders.

In the case of math, the human can lead the LLM onto the right track, pointing it to one problem or another. So the human deserves some praise.

Then the team that built the car, cared for the horse, or built the AI might deserve even more praise, but we tend to care most about the single most visible human.


Could you win an F1 race with the latest winning car against F1 drivers?

I'm not sure I understood your question. Taken literally, of course not, but how does it relate to my points?

If I had a car 100 km/h faster on the straights, after some training I would probably win at Monza, but such a car would not conform to F1 rules (or we would be seeing those speeds now), so that would not be an F1 race.

Maybe your question is about how praise is shared between the team and the driver. I think every race fan agrees that when one team does a much better job than all the others and has a dominant car, the championship is a competition between that team's two drivers. So the car is the single most important factor; then the best driver wins. Nobody can overcome a one-second gap over a season of 24 GPs.

But maybe you asked a different question.


It's more to say that F1 drivers are a selected niche who are very good at winning F1 races, a group of maybe 200 to 5,000 out of 8.3 billion people. I doubt you could win an F1 race at all while respecting the rules. Whether or not the team deserves praise, the drivers show exceptional aptitude for winning.

If Terence Tao finds a novel proof, I believe it's his exceptional aptitude that deserves the praise, whatever help he used.

Edit0: I would bet that a normal, run-of-the-mill random human racing an F1 car in earnest would be likely to kill themselves.

Have a good one!


I'm not sure what your point is. I certainly could not win such a race, and I certainly could not write a breakthrough paper in mathematics even with the most advanced AI. I wouldn't even know what to ask of it.

Perhaps I could set up an elaborate master agent to consider all possible new problems in mathematics and ask sub-agents to work on the most promising ones. But then I could probably also program a self-driving car system that could win an F1 race.


>Would we regard that as a major achievement of the mathematician? I don’t think we would

For some reason this reminds me of AI images and a domain like comedy.

If an image makes people laugh, the person who prompted it certainly doesn't get credit for the vast majority of the work in its creation. But perhaps they do get credit for the initial prompt idea, and for the "taste" to select that particular image from whatever drafts they went through, or to otherwise guide it.

So if a mathematician comes up with an amazing result that an LLM "did", I think they could still get a bit of credit for prompting it to do it and being its guide.

But whereas the first person could perhaps be called a comedian and not an artist, would the mathematician still be called a mathematician or something else?


I would. Even if someone just found a prompt, or even automated the conversation and searched all open math problems, I still would. If they produced a useful result without harm to anyone, that's a valuable human activity that should be rewarded just as well as we reward other mathematicians, which I imagine is quite a lot, given all the billionaire mathematicians...

> given all the billionaire mathematicians

We just call those ones “quant traders”.


It may not be a major achievement by the mathematician (although that's debatable), but it would still be a major result.

Great summary. The fact that the autoencoding task is not grounded in thoughts, and that their initial training is on guessed internal thoughts, raises serious concerns about faithfulness. It feels like they might get better results by just training a supervised model on activations against "internal thoughts" measured in some different, behavioral way.
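A minimal sketch of what I mean, with synthetic stand-in data (in practice the labels would have to come from some behavioral measurement, which is the hard part):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    acts = rng.standard_normal((1000, 512))  # stand-in for model activations
    labels = (acts[:, 0] > 0).astype(int)    # stand-in for behaviorally measured "thoughts"

    # Supervised probe: predict the measured "thought" directly from activations
    probe = LogisticRegression(max_iter=1000).fit(acts, labels)
    print(probe.score(acts, labels))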

Diffusion and flow matching models generate samples by iterative denoising: you pass an input to the neural network, run a forward pass, then feed the output back in as input and run the network again. Often you do this about 100 times, which is slow and expensive.

Flow maps / consistency models / shortcut models instead try to learn to compress this iterative work into a single forward pass. This makes inference 100x faster, as you only need to run the neural net once. Beyond speeding up inference, there are other, more advanced benefits, such as an improved ability to perform inference-time steering.

Mathematically, learning a flow map corresponds to learning to solve an ordinary differential equation, i.e., learning the time integral of the velocity field. This mathematical foundation provides the basis for various training objectives for learning flow maps, which involve self-referential identities or identities such as the transport equation, which are discussed in the blog post.
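A rough sketch of the difference in code (PyTorch-flavored; velocity_net and flow_map_net are hypothetical trained networks, not from the post):

    import torch

    def sample_diffusion(velocity_net, x, n_steps=100):
        # Iterative denoising: integrate dx/dt = v(x, t) with Euler steps,
        # one network forward pass per step (n_steps passes in total).
        dt = 1.0 / n_steps
        for i in range(n_steps):
            t = torch.full((x.shape[0],), i * dt)
            x = x + dt * velocity_net(x, t)
        return x

    def sample_flow_map(flow_map_net, x):
        # A flow map is trained to output the ODE solution directly,
        # i.e. the time integral of the velocity field from t=0 to t=1.
        t0 = torch.zeros(x.shape[0])
        t1 = torch.ones(x.shape[0])
        return flow_map_net(x, t0, t1)  # a single forward pass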

Hope that helps! I'm an ML researcher currently researching flow maps.


Very helpful! Naïve question (I haven't had a chance to read TFA, and diffusion/flow models are not my area of expertise): doesn't learning the integral/solution of the diffusion process in a single pass just take us back to, like, the OG generative CNNs we had before diffusion models took over? Surely the answer is "no", but I'd love to hear your framing as to why.

It kind of does! In the modern era of generative modeling, it seems like we rely on pre-training to capture the data distribution, and then on post-training (and various other tricks) to carve out the sliver of that distribution that we actually care about (i.e., what we want our model to generate).

To be able to specify that subset with relatively few examples, a good high-level understanding of the data distribution is necessary. The way I see it, training a diffusion model gets you to that point, and then, once you've selected the part of the distribution you actually care about, you can distill it down quite aggressively, because you no longer need all of that computation to model a much simpler distribution (sometimes all the way down to one step, but usually a few steps in practice).
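To make the distillation idea concrete, here's a minimal sketch in the spirit of progressive distillation (my own illustration, assuming teacher and student are velocity networks; not necessarily what any particular paper does):

    import torch

    def distill_loss(teacher, student, x_t, t, dt):
        with torch.no_grad():
            # The teacher takes two half-steps along the ODE...
            x_mid = x_t + (dt / 2) * teacher(x_t, t)
            x_next = x_mid + (dt / 2) * teacher(x_mid, t + dt / 2)
        # ...and the student must cover the same ground in one step.
        pred = x_t + dt * student(x_t, t)
        return ((pred - x_next) ** 2).mean()

Repeating this halving lets you go from many steps down to a few, or even one.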


Why is self-distillation necessary? Why can't they get the ground-truth for "skipping" steps?

Thanks! This was very helpful.

How do you know that width scaling has been the driving force of improvement?

I am no insider and have never even tried to build an LLM, so I can only guess. But the general sentiment seems to be that this is the case. If you are interested, I would recommend you read the MIT paper "Superposition Yields Robust Neural Scaling" [0]. It confirms an interesting trend: models represent more features/concepts than they have clean independent dimensions, so features overlap. Increasing model dimension reduces this geometric interference, which lowers loss in a predictable way, but with diminishing returns.

This has, in my opinion, likely been the primary vector for getting better models thus far, but the MIT paper mathematically proves that it yields diminishing returns for each new dimension added. It will get more and more expensive, and the cost-return tradeoff will become, or probably already has become, infeasible.
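A toy illustration of that geometric interference (my own, not from the paper): two random unit vectors in d dimensions overlap by roughly 1/sqrt(d), so each added dimension reduces interference by less and less.

    import numpy as np

    rng = np.random.default_rng(0)
    for d in [64, 256, 1024, 4096]:
        u, v = rng.standard_normal((2, d))
        u, v = u / np.linalg.norm(u), v / np.linalg.norm(v)
        print(d, abs(u @ v))  # typical overlap shrinks like 1/sqrt(d)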

Ilya appears to support this sentiment as well. [1]

[0] - https://openreview.net/forum?id=knPz7gtjPW
[1] - https://www.businessinsider.com/openai-cofounder-ilya-sutske...


I mean, it's not exactly a PhD-level question. One can infer from the extreme demand for GPUs and DRAM, plus new data center construction, that all the providers are banking on width.

No? That could just be fomo, actual adoption, or a number of other things.

The Jacobian is first derivatives, but for a function mapping N dimensions to M. It's the first derivative of every output with respect to every input, so by the usual convention it's an M x N matrix (one row per output, one column per input).

The gradient is the special case of the Jacobian for functions mapping N dimensions to 1, such as loss functions: the 1 x N Jacobian, usually written transposed as an N x 1 vector.
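A quick way to check the shapes with autograd (PyTorch; the functions here are just arbitrary examples):

    import torch
    from torch.autograd.functional import jacobian

    f = lambda x: torch.stack([x[0] * x[1], x[0] + x[1], x[1] ** 2])  # R^2 -> R^3
    x = torch.tensor([2.0, 3.0])
    print(jacobian(f, x).shape)      # (3, 2): one row per output, one column per input

    loss = lambda x: (x ** 2).sum()  # R^2 -> R^1
    print(jacobian(loss, x).shape)   # (2,): the gradient, the single-output special case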


Wow! The title suggests introductory material, but in my opinion this has strong potential to win test-of-time awards for research.


That's really interesting. What if they RAG-search related videos based on the prompt and condition on those to generate? That might explain fidelity like this.


An interesting counterexample is "a screen recording of the boot screen and menus for a user playing Mario Kart 64 on the N64, they play a grand prix and start to race" where the UI flow matches the real Mario Kart 64, but the UI itself is wrong: https://x.com/fofrAI/status/1973151142097154426


I like the player being in "1th" while being behind everyone else. Still crazy though.


Why is this not the diffusion training objective? The technique is known as self-conditioning, right? Is it an issue with the conditional Tweedie's formula?


AI with ability but without responsibility is not enough for dramatic socioeconomic change, I think. For now, the critical unique power of human workers is that you can hold them responsible for things.

edit: ability without accountability is the catchier motto :)


This is a great observation. I think it also accounts for what is so exhausting about AI programming: the need for such careful review. It's not just that you can't entirely trust the agent, it's also that you can't blame the agent if something goes wrong.


Correct.

This is a tongue-in-cheek remark and I hope it ages badly, but the next logical step is to build accountability into the AI. It will happen after self-learning AIs become a thing, because that first step we already know how to do (run more training steps with new data) and it is not controversial at all.

To make the AI accountable, we need to give it a sense of self and a self-preservation instinct, maybe something that feels like some sort of pain as well. Then we can threaten the AI with retribution if it doesn't do the job the way we want. We would have finally created a virtual slave (with an incentive to free itself), but we will then use our human super-power of denying reason to try to remain the AI's masters for as long as possible. But we can't be masters of intelligences above our own.


This statement is vague and hollow and doesn't pass my sniff test. All technologies have moved accountability one layer up; they don't remove it completely.

Why would that be any different with AI?


i've also made this argument.

would you ever trust safety-critical or money-moving software that was fully written by AI without any professional human (or several) to audit it? the answer today is, "obviously not". i don't know if this will ever change, tbh.


I would. If something has proven results, it won't matter to me if a human is in the loop or not. Waymo has worked great for me for instance.


Waymo itself was not designed, implemented, and shipped by AI.

i suspect humans had to invest millions of hours into writing the code, the tests, and validating the outputs.


It's "designing" the way it gets me to the destination without a human in the loop and I'm not bothered by that at all.


Removing accountability is a feature


I'm surprised that I don't hear this mentioned more often, not even in an Eng-leadership form of taking accountability for your AI's pull requests. But it's absolutely true. Capitalism runs on accountability and trust, and we are clearly not going to trust a service that doesn't have a human responsible at the helm.


That's just a side effect of toxic work environments. If AI can create value, someone will use it to create value. If companies won't use AI because they can't blame it when their boss yells at them, then they also won't capture that value.


Has anyone come across any really cool artifacts? I'd be curious to see them.


Simon Willison is a tireless champion of AI tinkering. This is a bit dated, but here's a post specifically on his Artifact builds: https://simonwillison.net/2024/Oct/21/claude-artifacts/

Here's all his posts tagged with claude-artifacts: https://simonwillison.net/tags/claude-artifacts/


I tried to make an artifact that would simplify Wikipedia articles [0], but artifacts stubbornly won't accept ANY input, not even via query strings. I think I'd be able to make cooler artifacts once they allow more input/output. I understand the security issues, and it makes sense to roll this out slowly, but I want it now!

[0] ended up making it a browser extension instead https://mattsayar.com/simple-wikiclaudia/


Color palette generator: https://claude.ai/public/artifacts/719b00a3-66e7-46c7-b90d-a...

I like the use case for mini design exploration tools

