More

mashlol · 2026-01-10T22:51:17 1768085477

I'm not an expert, but as I understand it there are existing solvers for poker/holdem? Perhaps one of the players could be a traditional solver to see how the LLMs fare against those?

projectyang · 2026-01-11T08:47:21 1768121241

While others have commented about solvers, I'd also like to bring up AI poker bots such as Pluribus (https://en.wikipedia.org/wiki/Pluribus_(poker_bot)).

This also wouldn't even be a close contest, I think Pluribus demonstrated a solid win rate against professional players in a test.

As I was developing this project, a main thought came to mind as to the comparison between cost and performance between a "purpose" built AI such as Pluribus versus a general LLM model. I think Pluribus training costs ~$144 in cloud computing credits.

darepublic · 2026-01-11T21:32:42 1768167162

Should be noted that this bot is heads up only? I believe a form of heads up poker is effectively solved as well-- limit hold'em heads up

lowbatt · 2026-01-10T22:59:12 1768085952

the LLMs would get crushed

cowthulhu · 2026-01-10T23:15:59 1768086959

To expand on this - an LLM will try to play (and reason) like a person would, while a solver simply crunches the possibility space for the mathematically optimal move.

It’s similar to how an LLM can sometimes play chess on a reasonably high (but not world-class) level, while Stockfish (the chess solver) can easily crush even the best human player in the world.

postpriorx · 2026-01-10T23:22:30 1768087350

How does a poker solver select bet size? Doesn't this depend on posteriors on the opponent's 'policy' + hand estimation?

Reason077 · 2026-01-11T10:32:41 1768127561

GTO (“game theory optimal”) poker solvers are based around a decision tree with pre-set bet sizes (eg: check, bet small, bet large, all in), which are adjusted/optimized for stack depth and position. This simplifies the problem space: including arbitrary bet sizes would make the tree vastly larger and increase computational cost exponentially.

boscillator · 2026-01-10T23:32:03 1768087923

No, I'm not super certain, but I believe most solvers are trained to be game theory optimal (GTO), which means they assume every other player is also playing GTO. This means there is no strategy which beats them in the long run, but they may not be playing the absolute best strategy.

sejje · 2026-01-10T23:54:37 1768089277

Typically when you run a simulation on a hand, you give it some bet size options.

To limit the scope of what it has to simulate.

It's unlikely they're perfect, but there's very small differences in EV betting 100% vs 101.6% or whatever.

meep_morp · 2026-01-11T02:01:37 1768096897

Not only to limit the scope of what it has to simulate, but only a certain number of bet sizes is practical for a human to implement in their strategy.

iberator · 2026-01-11T12:20:36 1768134036

Nash equilibrium. Optimal strategy for online poker has been known for like literally 20 years right now

bogzz · 2026-01-10T23:24:37 1768087477

How would an LLM play like a human would? I kind of doubt that there is enough recounting of poker hands or transcription of filmed poker games in the training data to imbue a human-like decision pattern.

meep_morp · 2026-01-11T02:04:41 1768097081

I don't have an answer, but there's over a decade of hand history discussions online from various poker forums like 2p2 and more recently Reddit.

Terr_ · 2026-01-11T12:22:17 1768134137

Also, if you set the bar for human players low enough, pretty much any set of actions is human-like. :p

FergusArgyll · 2026-01-11T00:10:22 1768090222

You are of course correct but to be pedantic:

Stockfish isn't really a solver it's a neural net based engine

DiscourseFan · 2026-01-10T23:24:59 1768087499

Unlike Chess, in poker you don’t have perfect information, so there’s no real way to optimize it.

tim-kt · 2026-01-10T23:37:03 1768088223

You can still optimize for the expectation value, which is also essentially poker strategy.

DiscourseFan · 2026-01-11T22:23:58 1768170238

Anybody who plays poker “optimally” is bound to lose money when they come up against anyone with skill. Once you know the strategy your opponent is employing you can play like you have anything. I believe I’ve won with 7,2 offsuite more than any other hand, because I played like I had the nuts.

cowthulhu · 2026-01-12T06:32:01 1768199521

This is completely wrong - the entire point of the Nash equilibrium solution (in the context of poker, at least) is that it is, at worst, EV-neutral even when your opponent has perfect knowledge of your strategy.

Your 72o comment indicates you are either playing with very weak players, or have gotten lucky, as in reasonably competitive games playing (and then full bluffing) 72o will be significantly negative EV. Try grinding that strategy at a public 10/20 table and you will be quickly butchered and sent back to the ATM.

DiscourseFan · 2026-01-12T19:15:24 1768245324

There are numerous videos of high level professional poker players winning large hands with incredible bluffs, this whole "Nash equilibrium solution" is nothing more than a conjecture with some symbols thrown in. I will re-iterate, there is no such thing as perfect knowledge when you have imperfect information. If you play "optimally," you will get bluffed out of all your money the moment everyone else at the table figures out what you're doing.

sejje · 2026-01-10T23:38:48 1768088328

The solvers don't typically work in real time, I don't think. They take a while to crunch a hand.

dmurray · 2026-01-11T01:09:56 1768093796

"Solvers" normally means algorithms which aim to produce some mathematically optimal (given certain assumptions) behaviour.

There are other poker playing programs [0] - what we called AI before large language models were a thing - which achieve superhuman performance in real time in this format. They would crush the LLMs here. I don't know what's publicly available though.

[0] e.g. https://en.wikipedia.org/wiki/Pluribus_(poker_bot)

sejje · 2026-01-11T12:46:36 1768135596

Solvers, in a poker context, are a category of programs. They run a simulation after you enter the known information.

Like piosolver, as an example.

The best poker-playing AI is not beatable by anyone, so yes, it would crush the LLMs.

mashlol · 2025-12-30T15:25:33 1767108333

AI almost always reduces the time from "I need to implement this feature" to "there is some code that implements this feature".

However in my experience, the issue with AI is the potential hidden cost down the road. We either have to:

1. Code review the AI generated code line by line to ensure it's exactly what you'd have produced yourself when it is generated or

2. Pay an unknown amount of tech tebt down the road when it inevitably wasn't what you'd have done yourself and it isn't extensible, scalable, well written code.

kentm · 2025-12-30T19:27:52 1767122872

#2 is happening a lot more than people think. It’s incredibly hard to quantify tech debt in software and so as a result productivity measurements are pretty inaccurate. Even without AI there is a trend of devs writing a barely working system and then throwing it over the wall to “maintenance programmers”. Said devs are often rated highly by management as being productive compared to the “maintenance devs,” but all they really did was make other people deal with their garbage. I’ve seen these sorts of systems take months to years to be production ready while the original dev is already off to their new gig (and maybe cluelessly bragging on HN about how much better they are than the people cleaning up their mess).

To get an accurate productivity metric you’d have to somehow quantify the debt and “interest” vs some alternative. I don’t think that’s possible to do, so we’re probably just going to keep digging deeper.

jimbo808 · 2025-12-30T15:33:48 1767108828

RE 2: It's not that far down the road either. Laziliy reviewed or unreviewed LLM code rapidly turns your codebase into an absolute mess that LLMs can't maintain either. Very quickly you find yourself with lots of redundant code and duplicated logic, random unused code that's called by other unused code that gets called inside a branch that only tests will trigger, stuff like that. Eventually LLMs start fixing the code that isn't used and then confidently report that they solved the problem, filling up a context window with redundant nonsense every prompt, so they can't get anywhere. Yolo AI coding is like the payday loan of tech debt.

DougN7 · 2025-12-30T15:45:48 1767109548

This can happen sooner than you think too. I asked for what I thought was a simple feature and the AI wrote and rewrote a number of times trying to get it right, and eventually (not making this up) it told me the file was corrupt and could I please restore it from backup. This happened within about 20-30 minutes of asking for the change.

jennyholzer3 · 2025-12-30T16:23:00 1767111780

This is why I say LLMs are for idiots

brightball · 2025-12-30T15:29:44 1767108584

Exactly. Optimizations in one area will simply move the bottleneck so in order to truly recognize gains you have to optimize the entire software pipeline.

nradov · 2025-12-30T15:33:15 1767108795

Exactly right. It turns out that writing code is hardly ever the real bottleneck. People should spend some time learning the basics of queueing theory.

http://lpd2.com/

linsomniac · 2025-12-30T15:41:25 1767109285

>Code review the AI generated code line by line

Have you considered having AI code review the AI code before giving them off to a human? I've been experimenting with having claude work on some code and commit it, and then having codex review the changes in the most recent git commit, then eyeballing the recommendations and either having codex work the changes, or giving them back to claude. That has seemed to be quite effective so far.

Maybe it's turtles all the way down?

mashlol · 2025-12-29T17:48:57 1767030537

> which is what every other major studio would have done in its place

Afaik CDPR doesn't make many games. If one flops, that might be the end of them. I don't see abandoning a game as a valid option for them from a financial perspective. Makes much more sense to fix the issues and sell more.

Aeolun · 2025-12-29T19:06:20 1767035180

I think it’s more related to their reputation? People will buy the next one if they trust CDPR will fix anything wrong with it even if it flops.

Kinda how you trust paradox strategy titles to get several years of updates and expansions.

mashlol · 2025-10-24T02:27:03 1761272823

Would you say the same if someone played 1000 hours of a sport?

beeflet · 2025-10-24T02:38:02 1761273482

No. If you play 1000 hours of a sport, you will at least be stronger, more coordinated, more agile. But the downsides are more about repetitive strain injury and the possibility of screwing up your joints.

Different benefits and downsides.

Of course, a lot of guys are suckered into sports-related gambling these days too.

immibis · 2025-10-24T08:12:31 1761293551

Video gaming has been shown to train some brain areas too. It's definitely better than 1000 hours of Netflix.

jalapenos · 2025-10-24T09:12:00 1761297120

That's a very fair take

user432678 · 2025-10-24T06:35:40 1761287740

How about 1000 hours reading/commenting HN?

bombcar · 2025-10-24T08:34:50 1761294890

dang should enable selling posts and create a secondary market. My posts with the most upvotes can be sold to you and now YOU’RE the famous one!

ang_cire · 2025-10-24T02:52:24 1761274344

You don't think that you get better at CS the more you play it? Better coordination, better accuracy, etc?

beeflet · 2025-10-24T03:56:01 1761278161

you don't get better at real life the more you play it

Pooge · 2025-10-24T05:28:34 1761283714

Playing football for 1000 hours doesn't make you better at any other job (i.e real life).

Don't be so close-minded; playing games is not different from any other activity.

beeflet · 2025-10-24T05:53:38 1761285218

only because the jobs of our time are fake.

Playing football or lacrosse is more "real" than working a desk job. For thousands of years, humans had to hunt and make tools and relied on their wits and strength to survive. Survival in the modern day is mostly a question of obedience.

I think the purpose of exploring virtual worlds like quake or counter-strike or something should not be to escape the real world but rather to experience a new kind of physicality. The purpose of playing games should be to engage in a deeper world which is more "real" than the tame one we are ordinarily subjected to.

It's why I am not opposed to video games. I opposed to overplaying video games because you ruin them, they become mundane and predictable.

immibis · 2025-10-24T08:13:17 1761293597

It's not "more real" or "more useful" just because our ancient ancestors had to do it.

jalapenos · 2025-10-24T09:12:29 1761297149

Saddest thing I've heard today

asukachikaru · 2025-10-24T03:04:34 1761275074

How about 1000 hours of chess? Or 1000 hours of warhammer? Or D&D?

One may say you make social bonds playing them, but that stands true for video game as well. Speaking for myself, I definitely spent more than 1000 hours on summoner's rift; 15 years later me and my league friends still playing LOL together and chat about all kind of things on a daily basis.

jalapenos · 2025-10-24T09:07:52 1761296872

Plus you'll have friends who play sports, rather than the kinds of people who spend all night clicking on each other

mashlol · 2025-08-25T18:17:56 1756145876

Definitely seems like it could be useful, but I'd be worried with giving AI write access to emails.

Is there a good audit trail of exactly what actions it takes at each step? I'd personally be worried about leaking proprietary or otherwise private information this way, or having it hallucinate information when it sends out emails potentially causing catastrophic issues.

vedhsaka · 2025-08-25T18:26:49 1756146409

Valid concern - April does not write emails for you unless you specifically ask for it. Users usually dictate what they want to reply.

But do you think a 'safe mode' - where April does only non destructive operation like read/summarize/draft/move emails to a folder would help you build trust?

It's in our pipeline - we can prioritize it to mitigate that fear.

zacharycohn · 2025-08-25T19:24:21 1756149861

I started building basically April last week. I have a "safety" toggle in my app. If it's on, there's a "Review Actions" tab that any write or destructive actions go to. Then when I'm done dictating/commuting/whatever, I open the Review tab and go through the actions (add this calendar event, send this text message, reply to this email, etc) one by one - it sort of works like a checklist.

Feel free to take the idea, if it's helpful. No credit/rights necessary. Y'all are much farther along than I am and if you come out with an Android app I'll probably end up a customer!

pavel_lishin · 2025-08-25T19:14:12 1756149252

> April does not write emails for you unless you specifically ask for it.

What if it thinks you asked for it?

jvwww · 2025-08-25T20:25:36 1756153536

Feels pretty easy to mitigate against. If a user deselects "allow email sending", then you can just remove that as a possible tool-call so it becomes impossible.

tryitnow · 2025-08-25T19:00:26 1756148426

Yes, a safe mode would be great. I think it's a "nice to have" for a lot of early adopter (type of people who read HN), but it will be a "must have" more corporate types (a much bigger market).

kitchi · 2025-08-25T19:26:07 1756149967

Absolutely, having the AI agent write out a draft and leave it there, or better yet grant it read-only access to my email and have it draft email responses and store it somewhere else where I can retrieve it would be fantastic.

AI is still not at the point where I am comfortable letting it run free with my email, but a draft that I can read over and make changes to before sending it out is a game changer.

jFriedensreich · 2025-08-26T11:54:02 1756209242

Its the most frightening naive reply i could imagine, if you can ask for it, it can hallucinate you asking for it or it can get prompt injected you asking for it. for voice only agents without UI approval process the only way is to have a separate clean room permission agent that does only get absolute safe context not even aggregate email titles. also for emails its impossible to design a safe agent that does any sort write action after reading anything in a mailbox because the mailbox is by definition tainted third party data and personal sensitive at the same time. even moving to a folder without can be used for attacks by hiding password reset notification mails etc.

smt88 · 2025-08-25T18:44:26 1756147466

Safe mode is absolutely necessary. I'd never let an LLM do things for me. They repeatedly prove that they categorically can't be jailed or trusted.

dfee · 2025-08-26T05:45:33 1756187133

> April does not write emails for you unless you specifically ask for it.

> But do you think a 'safe mode' - where April does only non destructive operation like read/summarize/draft/move emails to a folder would help you build trust?

> It's in our pipeline

Wat

vedhsaka · 2025-08-26T06:25:58 1756189558

Means April will not send emails even if you dictate the email and ask it to send it. In safe mode, it will not have access to tool calls which are related to send email, move to trash.

SamBam · 2025-08-26T02:41:15 1756176075

> April does not write emails for you unless you specifically ask for it. Users usually dictate what they want to reply.

> Send replies that I dictate (it handles the formatting and tone)

How does it handle the tone without editing your dictation?

monkeydust · 2025-08-25T18:51:50 1756147910

Yes, think this needs to be way up on your priority list.

vedhsaka · 2025-08-26T02:21:01 1756174861

Point taken - Safe mode goes out this week.

rukuu001 · 2025-08-26T04:00:20 1756180820

I am building just such a thing, with an overall limitation that it will only interact with emails that only have our team on them. It’s fun

mashlol · 2025-06-04T19:11:36 1749064296

What makes it easier than US taxes?

mashlol · 2025-05-13T07:50:37 1747122637

FWIW git lfs does have support for locking files.

mashlol · on Oct 12, 2024

Do you find retail store sales to be unethical too?

eszed · on Oct 12, 2024

Most of them? Yes.

The situation where "hey, we've got too much of [this] because [whatever reason], so we'll mark it down in order to sell it" is how a Free Market is 'supposed to work': prices operating as a signaling mechanism, by which everyone receives all of the (relevant) information.

Price-manipulation strategies juice sales by exploiting buyers' psychological reward mechanisms. Defenders of current practice will say it's OK because willing-buyer-willing-seller - which is certainly true - but everything about those techniques injects noise into the price-signals that make market economies efficient goods-distribution systems.

I'm kind of a free-market fundamentalist, and think any marketing beyond informational marginally contributes to market failure.

Yes, I know nearly everyone on this board makes their money downstream from manipulative marketing practices, so it's easier to close our eyes to the consequences. (I'm not playing the purity card, by the way: my company does very little marketing, but it manipulates other psychological reward systems in equally destructive-to-humanity ways.) We're all complicit in building the systems we (should) deplore.

em-bee · on Oct 12, 2024

retail stores have inventory and limited space to store it. they do need to get rid of old inventory before they can store new products. because of long timespans from production to delivery, they need to anticipate demand, and sometimes they get that wrong and they don't sell their current inventory before the new stuff arrives. then they can either try to sell more through sales, or rent extra storage.

mashlol · on Sept 3, 2024

Greppable commit messages and descriptions are also important, for a similar reason. If you want to learn where a feature exists in the codebase, searching the commits for where it was added is often easier than trying to grep through the codebase to find it. Once you've found the commit or even a nearby commit, it's much easier to find the rest.

mashlol · on June 30, 2024

I watched the video but I wasn't fully sure what the scam was. Where it became unclear to me was where it transitioned from a chat to an app.

The app mentioned in his video (MetaTrader 5) is still up - and seems actually legit... at least I think?

So is the scam that they send links to fake versions of the app? How'd the reviews look legit then? Or is there some sort of scam they run on the app where they actually have control of your account?

EDIT: nevermind, I found this[1] post that explains it - the app connects to brokers and is not one itself. So they basically just make a fake brokerage and convince you to use it. So John Oliver's explanation was a bit lacking on that part, and misleading/incorrect about MetaTrader 5 itself.

[1] https://old.reddit.com/r/explainlikeimfive/comments/1b4070o/...