Yeah, as I was toggling "blue" / "green" / "blue" / "green" I had the distinct sensation that it might just show me that I was in a region where I couldn't even make a consistent distinction.
> “The raw output of ChatGPT’s proof was actually quite poor. So it required an expert to kind of sift through and actually understand what it was trying to say,” Lichtman says. But now he and Tao have shortened the proof so that it better distills the LLM’s key insight.
Interestingly, it was an elegant technique, but the proof still required a lot of work.
Which is difficult, because the fact that you can come up with your example questions tells us they're probably not very dangerous. Plenty of ink has been spilled about how LLMs could help people create bioweapons. The basic idea "you could do dangerous things with an LLM" is already pop culture, and you're not doing anything dangerous by giving easy example questions.
A dangerous question would have to be along the lines of "Could I use unobtanium with the Tony Stark process to produce explosives much more powerful than nuclear weapons?" so that the question itself contains some insight that gets you closer to doing something dangerous.
Perhaps the reason for not publishing the questions is twofold:
1) they want a universal jailbreak that can get the model to answer any "bad" question.
2) they don't want bad publicity when someone not under NDA jailbreaks their model and gets it to answer their question.
That's a real question. Maybe the changes are useful, though I think I'd like to see some examples. I do not trust cognitive complexity metrics, but it is a little interesting that the changes seem to reliably increase cognitive complexity.
I haven't previously thought about this, but I think words over a commutative monoid are equivalent to a vector of non-negative integers, at which point you have vector addition systems, and I believe those are decidable, though still computationally incredibly hard: https://www.quantamagazine.org/an-easy-sounding-problem-yiel....
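To make that equivalence concrete, here is a minimal sketch, assuming the free commutative monoid over a small fixed alphabet. The names `parikh_vector` and `add_vectors` are my own for illustration, not from any library. It only shows that letter order stops mattering and that concatenation becomes vector addition; it says nothing about the decidability result itself.

```python
from collections import Counter

def parikh_vector(word, alphabet):
    """Map a word to its vector of letter counts, in alphabet order."""
    counts = Counter(word)
    return tuple(counts[letter] for letter in alphabet)

def add_vectors(u, v):
    """Componentwise addition of two count vectors of equal length."""
    return tuple(a + b for a, b in zip(u, v))

alphabet = "abc"  # hypothetical three-letter alphabet

# If the letters commute, "abca" and "aabc" are the same element,
# and both map to the same vector of non-negative integers.
assert parikh_vector("abca", alphabet) == parikh_vector("aabc", alphabet)

# Concatenating words corresponds to adding their vectors.
left, right = "ab", "ca"
assert parikh_vector(left + right, alphabet) == add_vectors(
    parikh_vector(left, alphabet), parikh_vector(right, alphabet)
)
```

Once everything is a count vector, rewriting rules turn into "subtract this vector, add that one", which is roughly the shape of a vector addition system.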
There is a difference in that someone smoking nearby automatically harms the people around them. With alcohol, the effect is more unpredictable, but it is equally real.
Alcohol is a factor in automobile crashes, and a factor in a significant proportion of violent crime, especially domestic violence (https://www.cato-unbound.org/2008/09/17/mark-kleiman/taxatio... edit: this source isn't as great, Kleiman has written elsewhere about the subject, but Google is failing me). If we could wave a magic wand and cause drinking to cease to exist, many lives would be saved.
Note: I do in fact drink, I am not a teetotaler. But what I said above is factual. I personally believe that prohibition would be worse, and it's reasonable for individuals to make their own choices. But that does not entail denying that it goes very badly for many.
Second-hand smoke does affect people around you. It is how people get addicted to nicotine. It is how new smokers are created.
And there are some people who are more sensitive to temporary exposure to smoke (and pollution in general) than others.
That is why smoking tends to be banned around hospitals and day care centers, because those are places where you will find those people.
My father was one of them after he had his larynx removed for throat cancer, having smoked for decades. He could not tolerate even small amounts of second-hand smoke again, because the breathing hole in his throat would get irritated, fill up with mucus and have to be cleaned with a suction device.
And if you drink alcohol next to me, it does not make my clothes and my hair stink so much afterwards that I will want to wash my hair and change my clothes before going to bed.
This is an interesting analysis, but "are the costs of AI agents also rising exponentially?" is a very bad question that this doesn't answer.
What's rising exponentially is the price of the most ambitious thing cutting-edge agents can do.
But to answer whether the cost of AI agents is rising in general, you would take a fixed set of problems, and for each of them, ask "once it's solvable, how does the price change?"
For that latter question, there isn't a lot of data in these charts because there aren't enough curves for models of the same family over time, but it does look like there are a number of points where newer models solve the same problems at lower prices. Look at GPT5 vs. the older GPT models--the curve for GPT5 is shifted left.
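Roughly, the two questions look like this in code. This is only a sketch with made-up placeholder numbers: the model names, task names, and costs below are hypothetical and not taken from the charts, and `cost_trajectories` and `frontier_cost` are names I invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: (release_order, model, problem, cost_usd).
# A cost of None means the model could not solve the problem.
runs = [
    (1, "model-a", "task-1", 4.00),
    (1, "model-a", "task-2", None),
    (1, "model-a", "task-3", None),
    (2, "model-b", "task-1", 1.50),
    (2, "model-b", "task-2", 9.00),
    (2, "model-b", "task-3", None),
    (3, "model-c", "task-1", 0.40),
    (3, "model-c", "task-2", 3.00),
    (3, "model-c", "task-3", 20.00),
]

def cost_trajectories(runs):
    """For each fixed problem, list (model, cost) in release order once it is solvable."""
    by_problem = defaultdict(list)
    for _release, model, problem, cost in sorted(runs, key=lambda r: r[0]):
        if cost is not None:
            by_problem[problem].append((model, cost))
    return dict(by_problem)

def frontier_cost(runs):
    """For each model, the cost of the most expensive problem it solved."""
    frontier = {}
    for _release, model, _problem, cost in runs:
        if cost is not None:
            frontier[model] = max(frontier.get(model, 0.0), cost)
    return frontier

print(cost_trajectories(runs))  # per-problem price over successive models
print(frontier_cost(runs))      # price of the hardest thing each model solved
```

In this toy data the frontier cost goes up with every release while the cost of each fixed task goes down, which is the distinction I'm getting at.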
The cost of models is decreasing almost exponentially with time.
The author commits a non sequitur by muddling two concepts of time. They say costs are getting “unsustainable”, which is not a conclusion that follows.
What is true is that at a given point in time, the cost to perform a task is exponentially related to the human time taken. But that does not mean it will remain that way. Far from it.
It would be much funnier, and also more insightful, if it didn't do this and let you contradict yourself.