In a game of perfect information and no randomness, there are ultimately only two kinds of moves: those that preserve your current best forcible outcome (win or draw in chess*), and those that blunder that into a worse result given continued perfect play by the opponent.
Everything else, like a positional score, centipawns, or even classic material points, is an abstraction we use to summarize because we don't have unbounded or sufficient computing power to solve all possible continuations. That score apparently going down is only an artifact of our limited ability to evaluate it; the only real scores are 0/½/1 for lose/draw/win. If you make mistakes, your score will eventually drop by those quantizations; we just typically don't know exactly when, except in endgame situations pared down enough to be computationally tractable.
And it's impossible to raise your estimated score, because that estimation assumes you continue to play perfectly. There's no such concept as a better-than-perfect move to raise your expectation over what was already calculated, since that calculation already includes all your best possible moves.
* (Other gradations between win/lose/draw are possible in such a game. Chess doesn't have them, but imagine playing Go for a dollar per point, where nuances smaller than swinging a win or draw still matter.)
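To make the win/draw/loss point concrete, here's a toy sketch using tic-tac-toe rather than chess, since it's small enough to actually solve (this is just an illustration, assuming nothing beyond the Python standard library): an exhaustive negamax returns only -1/0/+1 for every position, and a move can only preserve that value or drop it, never raise it.

```python
# Exhaustive negamax over tic-tac-toe: the only values that exist are
# -1 (loss), 0 (draw), +1 (win) for the player to move. There is no
# "better than perfect" move that pushes the value above what the
# current position already guarantees.
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Game-theoretic value for `player` to move: +1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return 1 if w == player else -1
    if "." not in board:
        return 0
    opponent = "O" if player == "X" else "X"
    best = -1
    for i, cell in enumerate(board):
        if cell == ".":
            child = board[:i] + player + board[i + 1:]
            best = max(best, -value(child, opponent))
    return best

if __name__ == "__main__":
    empty = "." * 9
    print(value(empty, "X"))  # 0: tic-tac-toe is a draw with best play
    # Every opening move here also prints 0; no move raises the value,
    # and in a less forgiving game a bad move would print -1 instead.
    for i in range(9):
        child = empty[:i] + "X" + empty[i + 1:]
        print(i, -value(child, "O"))
```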
> And it's impossible to raise your estimated score,
No, what I was saying is that it's absolutely possible to raise your estimated score, because your estimated score is only an estimation of who has the best position.
If AlphaZero is better than Stockfish, then by definition it will sometimes make moves that raise its estimated score, because Stockfish is only as good as its ability to estimate the score of a position. So Stockfish must occasionally underestimate a position, and then later (after another move or two) be forced to reevaluate (because while it's worse, it's not stupid).
AlphaZero wins because, and precisely because, it believes some positions are more favorable than Stockfish does. You can almost see it as an arbitrage between the two estimations. That's what I was finding cool, and the point of my post.
You're right, of course. What we're really talking about is the fallibility of estimations (and arbitraging between them) - you can't raise your score as projected by an omniscient computing power, but you can raise it as estimated by real, fallible engines (and AlphaZero is less fallible).
Mostly I'm pointing out that these estimations represent the best guess of an ultimately limited engine. People tend to treat those engine evaluations as actual numbers, like scores in a sport like baseball or some such, but they're not.
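You can watch that fallibility directly. Here's a minimal sketch, assuming python-chess and a local Stockfish binary (the path, the fixed depth, and the sample moves are all just illustrative choices): replay a game at a fixed depth and log how the centipawn evaluation gets revised from move to move. An omniscient evaluator could only ever step through win/draw/loss; a real engine's number drifts as it corrects its earlier estimates.

```python
# Sketch: log a benchmark engine's evaluation after every move of a game.
# Big jumps between consecutive plies mean the earlier number was an
# over- or underestimate (or the move was a genuine blunder).
import chess
import chess.engine

ENGINE_PATH = "/usr/bin/stockfish"   # assumption: point this at your own binary
DEPTH = 12                           # assumption: fixed, modest search depth

def eval_revisions(moves_san):
    """Return a list of (ply, centipawn eval from White's point of view)."""
    board = chess.Board()
    out = []
    engine = chess.engine.SimpleEngine.popen_uci(ENGINE_PATH)
    try:
        for ply, san in enumerate(moves_san, start=1):
            board.push_san(san)
            info = engine.analyse(board, chess.engine.Limit(depth=DEPTH))
            cp = info["score"].white().score(mate_score=100000)
            out.append((ply, cp))
    finally:
        engine.quit()
    return out

if __name__ == "__main__":
    # Illustrative opening moves only; any game you care about works.
    for ply, cp in eval_revisions(["e4", "e5", "Nf3", "Nc6", "Bb5", "a6"]):
        print(f"ply {ply}: {cp:+} cp")
```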
There is a way to objectively distinguish moves within the same category, though (drawing ones or losing ones). It's similar to Kolmogorov complexity. Let's say moves A and B both draw, but the shortest algorithm that draws against A is much longer than the one that draws against B. We can say A is objectively the better move.
In practice, instead of a formal definition, we could use a benchmark engine: what are the minimum CPU time/RAM requirements for an engine to draw the resulting positions (or convert them to a win, in the case of losing moves)?
A move that requires serious hardware to defend against is better than one a ten-year-old laptop can hold a draw against.
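As a rough, practical stand-in for that idea (a sketch, not a formal definition, and assuming python-chess plus a local Stockfish binary; the draw band and depth cap are arbitrary choices): for each candidate move, find the shallowest search depth at which a benchmark engine still judges the resulting position holdable for the defender, then rank moves by how much depth that takes.

```python
# Sketch: rank candidate moves by how much search the defender needs to
# see the resulting position as holdable. Deeper requirement = "harder" move.
import chess
import chess.engine

ENGINE_PATH = "/usr/bin/stockfish"   # assumption: adjust to your install
DRAW_BAND_CP = 50                    # assumption: within half a pawn counts as holdable
MAX_DEPTH = 18                       # assumption: search budget cap

def min_depth_to_hold(board, move, engine):
    """Smallest depth at which the defender's eval is within the draw band after `move`.

    Returns None if no depth up to MAX_DEPTH looks holdable, i.e. the move is
    hard (or impossible) to defend against within this budget.
    """
    board.push(move)
    try:
        for depth in range(1, MAX_DEPTH + 1):
            info = engine.analyse(board, chess.engine.Limit(depth=depth))
            # Score from the defender's (side to move) point of view.
            cp = info["score"].relative.score(mate_score=100000)
            if cp >= -DRAW_BAND_CP:
                return depth
        return None
    finally:
        board.pop()

def rank_by_defensive_effort(fen):
    """Rank legal moves so the hardest-to-defend-against come first."""
    board = chess.Board(fen)
    engine = chess.engine.SimpleEngine.popen_uci(ENGINE_PATH)
    try:
        scored = [(move, min_depth_to_hold(board, move, engine))
                  for move in list(board.legal_moves)]
    finally:
        engine.quit()
    # None (never holdable within the budget) sorts as hardest of all.
    scored.sort(key=lambda r: r[1] if r[1] is not None else MAX_DEPTH + 1,
                reverse=True)
    return scored
```

This only probes single positions at fixed depths; the proposal above would really be measured over the whole game (or with CPU time/RAM budgets), but the ranking idea is the same.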
In a world with perfect play this is true, but in the real world there could still be moves that are good assuming imperfect play by the opponent. That's where things get really interesting.
For example, I remember that the original AlphaZero model, which had been trained specifically against Stockfish, would often take a material sacrifice for some advantage that Stockfish couldn't see (e.g., sacrifice a pawn in return for locking Stockfish's bishop out of the game). I don't know if these moves were objectively good given perfect play, but in practice they could be the only way to win (chess is very drawish at the top computer level).