I feel like I've been running into this a lot on chess.com. With the tremendous amount of cheating going on online, roughly 1 out of every 6 or 7 of my games is against an engine user (I know because I report them, and they get banned).
It's great to have a better engine, but I feel that what would benefit the online chess community the most is not a better engine but an open-source cheat detection system, if that's even possible.
I wouldn't know how to build one, but I think that is far more important for chess right now. Still, it's great to have a better engine, so congratulations and thank you to the Stockfish team.
It's odd that it would be 1 out of 6 or 7, because when high-elo streamers like Hikaru or Chessbrah do speed runs, they very often go on 100-win streaks, which sounds exceedingly unlikely if the odds of facing an engine user were really 1 in 6 or 7.
An engine user would definitely beat them, unless they were using it sparingly, which is possible, but in that case I don't think it would be obvious to you either.
There are examples where they face an engine and it's obvious, but it doesn't seem to happen 1 out of 7 times.
I guess the problem is that most users that play with an engine don't play bullet/blitz.
I do meet a lot of engine players in 3-minute blitz, but then I just make very fast bullet moves, and all of a sudden I'm losing with a 90-second time advantage that cannot be recovered if the user persists in playing with an engine.
Maybe, but there are also a few rapid "speed runs", like Daniel Naroditsky's; while he has also faced cheaters, it doesn't seem like 1 out of 7.
And Daniel Naroditsky definitely has good internal cheat detection, even when the user cheats sparingly: he can basically understand most lower-rated opponents' moves, and if something seems too good for the rating, he can tell from just a few moves.
If 1 out of 7 opponents were using an engine, he would never have 145 wins against 0 losses.
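To put a number on it (a back-of-the-envelope sketch, assuming an engine user always beats you and opponents are drawn independently):

    # Chance that a 145-game win streak never meets an engine user at 1-in-7 odds:
    p = (6 / 7) ** 145
    print(p)  # ~2e-10, i.e. effectively impossible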
And again, for bullet and blitz there are also quite a few examples of 100W to 0L runs.
If you look at all the speedrun videos Daniel has done, it definitely doesn't seem like 1 out of 7.
Daniel also posts to YouTube every time he faces a cheater, as do Chessbrah and others, since it makes for good content and good views; people are always interested in seeing a GM play against a cheater.
Not sure what the aim of cheating is; probably going from a rating of 1500 to 2000 and the status that brings. In the end you still win 50% of your games, just against higher-rated players.
Playing against a GM would reveal your cheating instantly, it seems. It's like robbing the police station :)
People like steamrolling. It's part of why higher-rated players smurf (for their own and/or their audience's enjoyment) and do 'speedruns' in chess and in most online games with rankings.
Hikaru and Chessbrah are playing at GM level. It'd be difficult to cheat at this level; every player tends to recognise every other player, and it takes time to climb to this elo. A far more representative example would be to observe games from an NM or IM. These guys often bump into cheaters. 1 in 6/7 isn't unrealistic.
Not always. Hikaru does speedruns[1] using an alt acct with entry-level ELO to race to ELO 3000. I've watched a fair bit of this, and seen him encounter the odd cheater or suspect game, but much nearer 1% than 10% of the time.
I don't disagree with what you said, but for context: when these players do speedruns they typically start from 600 ELO and work their way back up to GM, so they face players at every level as part of that 100-game streak.
> It's odd that it would be 1 out of 6 or 7, because when high-elo streamers like Hikaru or Chessbrah do speed runs, they very often go on 100-win streaks, which sounds exceedingly unlikely if the odds of facing an engine user were really 1 in 6 or 7.
I would guess that even with the engine it would take some games to rank up that high, so if chess.com is good at banning cheaters, most of them would probably get caught sooner.
Yes, but those streamers often start from 500-600. And often, when they do face those cheaters, the cheaters have gone undetected for 100+ games, maybe at 2000 elo, and they seemingly only get banned because of the publicity from playing the streamer. Meaning chess.com's cheat detection by itself hadn't managed to catch them in all that time, and who knows how many are out in the wild managing hundreds of games without any publicity, cheating far more intelligently than the obvious ones who run into a streamer.
Even against engines, a pro player will often win because the other player runs out of time, losing a little on every move from copying the engine. You often see the game stay roughly equal, with the pro slowly falling behind, but then as the cheater's time starts to run out, they completely fall apart.
But those pros will usually make some sort of subtle comment (subtle because they don't want to straight up accuse anyone unless chess.com has actually banned the opponent afterwards) making it clear they think they're playing against a cheater, and the odds are definitely not 1 out of 7. I have been addicted to YouTube chess videos for quite a while now...
Chess engines destroy humans at speed chess. The time cost only creeps in if you're using a low-tech method of cheating (i.e. manually punching moves into your engine and the website).
I built a cheat detection system for chess a few years ago. A client paid us to create a portal for playing chess for money (not an idea I'd put my own money into, and it was dead within a few months). The anticheat engine worked like this:
- simulate the game using Stockfish
- for each move (except the first few), compare the move made by the player with the list suggested by the engine
- if the move chosen by the player is on the list generated by the engine, then give that player some points (depending on the move's position in the list)
- do some math involving the player's ELO and some other stuff (I can't remember exactly).
Definitely not an ideal solution, but open to improvement. Btw, it wasn't my idea; chess players provided the exact algorithm, so it must be a known approach.
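For the curious, a minimal sketch of that move-matching idea using python-chess and a local Stockfish binary; the engine path, depth, MultiPV count, and scoring scheme are all illustrative guesses, not our client's actual parameters:

    import chess
    import chess.engine
    import chess.pgn

    # Score how often a player's moves show up in the engine's top-N list.
    engine = chess.engine.SimpleEngine.popen_uci("./stockfish")  # path is an assumption

    def engine_match_score(game, player_color, top_n=3, depth=14):
        points, counted = 0, 0
        board = game.board()
        for ply, move in enumerate(game.mainline_moves()):
            if ply >= 10 and board.turn == player_color:  # skip the book-ish opening
                infos = engine.analyse(board, chess.engine.Limit(depth=depth),
                                       multipv=top_n)
                top_moves = [info["pv"][0] for info in infos]
                if move in top_moves:
                    # the higher the move sits on the engine's list, the more points
                    points += top_n - top_moves.index(move)
                counted += 1
            board.push(move)
        return points / max(counted, 1)  # this is what the ELO-based math would consume

    with open("suspect.pgn") as f:
        game = chess.pgn.read_game(f)
    print(engine_match_score(game, chess.WHITE))
    engine.quit()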
I mean, such a system already exists when players analyze their games. The thing that makes a move identifiable as a blunder is a chess engine evaluating the board before and after the move.
It's also worth keeping in mind that you will sometimes see players match the best engine move 95% of the time or more in the 800-1000 elo range without cheating; their opponents are blundering and the next move is obvious.
So specifically, you have to look for players matching engine moves in positions where the engine settled on its choice only by looking far into the future.
A small problem with using Stockfish is that cheaters may use other engines in order not to play like the top engine. For example, Stockfish would laugh at some old versions of Houdini, but Houdini still easily outplays any human.
Just wanted to propose this approach, but I wouldn't call it cheat detection; it's more of a "make sure my opponent is not better than X". It could even be integrated into the game by showing players "forbidden" moves, ones that are too good for the current game level and are therefore not allowed to be played.
At what rating do you play? Around 1200 on chess.com? I don't see cheating as a big problem online for two reasons but I only play on lichess, not on chess.com.
First of all it's not that bad to play against a cheater once in a while. If you compare it with other games, playing against an engine is a huge disadvantage but will not fundamentally change the structure of the game. You are still playing chess, but against a superhuman opponent. You don't want to play against the computer but it's not as bad as the other player abusing a glitch in the game.
Secondly, I'm guessing that cheaters mainly play at entry-level strength (1200 on chess.com) and a bit above that. If you are seriously cheating you will be caught very quickly. So maybe if your rating changes you'll encounter fewer cheaters.
Edit: I just looked at your comment history to find out what your rating is and apparently you are playing (for an online game) with extremely long time controls? That's probably the reason why you are encountering many cheaters. The player pool for long online games is much much smaller, so you will automatically have more cheaters who just recently signed up for the game.
> If you compare it with other games, playing against an engine is a huge disadvantage but will not fundamentally change the structure of the game. You are still playing chess, but against a superhuman opponent
It wastes your time. Playing against a human is a different experience. If you actually wanted to practice against an engine, you would do so knowingly, with some possible benefits such as takebacks etc. (since the computer is not a rival, just a training tool).
It wastes your rating points - if you play rated games. Obviously not everyone does, or cares about their online rating; but I do to an extent. For one, while rating isn't a goal in and of itself, it's still a convenient form of tracking my progress, and cheaters distort this measure.
Finally, it wastes your nerves. However insignificant this may be in the scheme of things, I think that most people still dislike being cheated or lied to (in any way or form) simply out of principle, and find that frustrating.
I thought something like lichess actually compares the player's moves against something like Stockfish and if it matches too closely, they flag that user.
> I feel like I've been running into this a lot on chess.com. With the tremendous amount of cheating going on online, roughly 1 out of every 6 or 7 of my games is against an engine user
This statement seems a bit funny, because in order to have a good idea that they cheated, you would also have had to be analyzing the game with a chess engine.
Regardless, unless you are a truly amazing player, nearly any chess engine made in the last 15 years will destroy you, and incremental improvements to Stockfish have absolutely no effect on that.
> This statement seems a bit funny, because in order to have a good idea that they cheated, you would also have had to be analyzing the game with a chess engine.
You analyze the game after it is played. When your opponent manages 99.9% accuracy in a 1500+ ELO blitz/rapid game, it's highly unlikely they did that without some computer assistance.
I'm 1500+ ELO, play blitz/rapid, and get 100% from time to time.
It's usually because I played some book moves and then my opponent fell into an opening trap that I knew and they didn't [1], and I knew exactly how to play for the win to checkmate, because I've done it many times before and remember the post-game analysis from those games. I didn't come up with the moves on the spot.
Sometimes yes, computers play a certain way and make moves that simply aren't intuitive to humans, especially not lower rated players.
However the most obvious cheaters are more easily given away by time between moves. When they take the same time between every move whether it be a deep positional move or an obvious recapture, you can be quite sure something fishy is going on. Sometimes they can have literally 1 legal move and still take 10 seconds to find it.
A good player using an engine sparingly however would be very difficult to spot in online chess, especially in a single match.
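The "same time on every move" tell is also trivial to quantify. A toy sketch; the coefficient-of-variation threshold is a made-up number, and a real system would be far more careful:

    from statistics import mean, stdev

    def uniform_timing_suspicion(move_times_sec, cv_threshold=0.25):
        # Humans move fast on recaptures and slowly in complex positions, so
        # their think times vary a lot; manually copying engine moves tends to
        # flatten the distribution. Flag a suspiciously low spread.
        if len(move_times_sec) < 10:
            return False  # not enough data to say anything
        cv = stdev(move_times_sec) / mean(move_times_sec)
        return cv < cv_threshold

    print(uniform_timing_suspicion(
        [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.0, 9.9]))  # True
    print(uniform_timing_suspicion(
        [1.2, 0.3, 25.0, 4.1, 0.5, 18.0, 2.2, 0.4, 9.0, 3.3]))      # False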
> Sometimes they can have literally 1 legal move and still take 10 seconds to find it.
Perhaps a nit-pick: they need to discover the legal move, and discover that no other moves are possible, right? As a rather basic chess player myself, I can imagine I might spend some time on this depending on the situation.
I think the idea is that an experienced player would recognise this situation immediately (usually there is only one legal move because you are in check, or there is only one move that doesn't put you in check), or even ahead of time (knowing that their opponent's move brings them in check) and just play the only possible move straight away.
Of course a beginner would take some time to spot this, but it's unlikely you would confuse a beginner for a cheater, since a beginner will likely make many sub-optimal moves and spend lots of time thinking in general.
Depending on the situation, the chess website/app might show you that there is only 1 legal move left. After all, when in check, you can't play a move that doesn't stop or escape the check.
I don't want to advertise cheating or give out info about how to cheat, but a "good" cheater can also cheat in bullet, and maybe even go more undetected if they were to mix their moves well.
Note: I have not cheated myself, but I can definitely see how it could be done. I'm not sure it would be good of me to describe the process, of course... Just think about what input you have and what output you can get if you did this programmatically, and you can easily see whether it's possible; there are also existing tools for it. You don't have to open a chess engine in another window and manually make the moves, if you know how to script.
In bullet, I think you could technically even use something for "anti-blundering": you would blunder a lot less, because the engine just checks whether a move would be an obvious blunder and blocks it. Maybe you allow a few blunders through, and I imagine it would go undetected. Sorry, again, for brainstorming about this; it is a fascinating topic though. The engine could run at a lower depth, and it could be a more positional engine that doesn't make magical engine moves, trained mostly on human players using a neural network. Maybe it just helps you play theory openings, and you can't accuse anyone of cheating for following opening theory.
At the end of the day it's akin to the "true randomness" problem. How do you prove a random number generator is, or isn't, truly random? Or that chess moves are "truly human" and not computed by an engine? You can only approach this probabilistically.
I also think the cheat detection algorithms can be beaten, and I believe I could do it. So why do they still serve their purpose more often than not?
To me, it's inherently linked to the very nature of online cheating. It's essentially a futile, nonsensical activity. The only gratification is an illusion of intellectual superiority, whose worthlessness is so transparent that it can only attract people who don't get to experience the sense of intellectual superiority pretty much anywhere else. As harsh as it may sound, your average cheater is rather stupid. That's why it isn't really difficult to catch 90% of them.
I don't rule out there are some cheaters who do it out of intellectual curiosity, but that would be a statistical outlier.
Well, obviously it's impossible to know with 100% certainty. Some heuristics include:
* Some cheaters will just 100% match the best engine moves. If a player consistently does exactly what Stockfish would do that's an obvious giveaway.
* Some cheaters will be manually copying moves between the chess website and their engine; in high-speed games ('blitz' and 'bullet' chess) their abilities plummet when there are only a few seconds left on the clock, because they can't copy fast enough.
* Similarly, a player who takes 5 seconds a move whether they're pounding out a basic book opening or making an inspired move in an extremely complicated situation will raise suspicion.
* Some cheaters will just be improbably good for their known background. A few weeks back some billionaire beat five-time world champion Vishy Anand in a charity game (where Anand played a bunch of different games at once) which is the chess equivalent of Mark Zuckerberg outrunning Usain Bolt.
* Chess engines will sometimes make moves that even the top humans fail to see. All the action is happening on the right of the board, and some innocuous move on the left of the board produces a perfectly executed forced mate in 15 moves? Some people will look at that suspiciously.
Of course, a sufficiently careful cheater could cheat without triggering any of these heuristics - a player who only relies on the engine for one or two key moves can easily be undetectable.
I guess a basic implementation might compare the moves performed by the player under evaluation against the move that an engine would suggest?
Presumably an excellent player might often make the same moves as an engine, so this measure alone isn't going to be perfect. But it could be a starting point. You might also look at the player's historical performance and watch for suspicious changes, or perhaps look for patterns in the time taken to play the move?
It's hard because often cheaters will play 'clean' until they get behind or to a particularly ambiguous spot, when they will cheat for one or two moves.
The really hard part of chess is what to do in the midgame, once you're out of your scripted opening, there is still lots of material, and neither player has any significant vulnerabilities. A computer is useful here.
You could even just use a computer to play near-perfect theory openings; no one can accuse you of cheating there, and you can still gain ELO. There's so much variation in what you could do. One does not have to alt-tab to input and mirror engine moves from Stockfish.
You could use only a neural network trained on human players, and have a programmatic way for the AI's suggestions to show up on your screen. The time delay would be pretty much non-existent, and maybe a good script could even pre-move very obvious things for you. Maybe the AI gives you 3 non-blunder moves and you choose the one that makes the most sense, or it just blocks you from obviously blundering, but maybe only 90% of the time.
I don't like giving out ideas here, but I feel like these are obvious ways one could cheat and go undetected.
> You could even just use a computer to play near-perfect theory openings; no one can accuse you of cheating there, and you can still gain ELO.
That would be hard to distinguish from someone memorizing an opening book. So hard to detect.
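And following a book is purely mechanical either way; for instance, python-chess can probe a Polyglot-format opening book (the "book.bin" filename is a placeholder) exactly the way both the memorizer and the script would:

    import chess
    import chess.polyglot

    board = chess.Board()
    board.push_san("e4")
    board.push_san("c5")

    # List the book's candidate replies to 1. e4 c5, then sample one by weight.
    with chess.polyglot.open_reader("book.bin") as reader:
        for entry in reader.find_all(board):
            print(entry.move, entry.weight)  # move and its popularity weight
        print("book move:", reader.weighted_choice(board).move)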
I guess you could compare the outputs of different engines with the moves of a "suspicious" player: if their moves match exactly, flag them; if it happens too many times, ban them. Something like that.
Someone who plays a lot against a computer will often pick up the computer's style.
But sometimes a computer will make odd moves no human would ever make. I've been playing against an iOS Stockfish app to relearn how to play, and when it gets behind it starts throwing material away to delay the inevitable; a human would more likely keep the material and hope the opponent doesn't see the path to victory.
One way to detect use of an engine is to look for moves like this, though if that could be done algorithmically, the same algorithm could be used to make the engine play more like a human.
There is such an open-source system: https://github.com/clarkerubber/irwin. The last commit was some years ago, so it's either very stable or no longer used. At any rate, this is what powers the anticheat on lichess.org.
Good luck. Those of us in other gaming spheres have been running up against this problem for aeons now. We're not winning. I can only imagine that detecting cheating is even harder in chess.
Minor thought, but one of the things I love about high level AI chess, like Stockfish vs AlphaZero, is seeing how their ratings of positions change over the course of the game.
I realized a while ago that, as a human, the computer gives your position a score, and then you make a move, and your score can pretty much only stay the same (if you make a "perfect" move) or go down. Much of the time, it goes down.
Two humans playing each other, it's just a question: whose score goes down less each time they make a move? It became a little sad. It seemed like either you make the right move or you make a sub-optimal move, and the winner is simply the one who makes fewer sub-optimal moves.
But when AlphaZero plays, and you watch Stockfish's score, the reason it wins is that it makes moves Stockfish thinks are poor, so it rates AlphaZero's moves poorly, and then all of a sudden it has an oh shit! moment when it realises that AlphaZero is actually ahead, and its score jumps. It's really a look inside the computer's head while it's being beaten by a better player.
In a game of perfect information and no randomness, there are ultimately only two kinds of moves: those that preserve your current best forcible outcome (win or draw in chess*), and those that blunder that into a worse result given continued perfect play by the opponent.
Everything else, like a positional score or centipawns or even classic material points, is an abstraction we use to summarize because we don't have unbounded or sufficient computing power to solve all possible continuations. That score apparently going down is only an artifact of our limited ability to evaluate it; the only real scores are 0/½/1 for lose/draw/win. If you make mistakes, your score will eventually drop by those quantizations; we just typically don't know exactly when, except in endgame situations pared down enough to be computationally tractable.
And it's impossible to raise your estimated score, because that estimation assumes you continue to play perfectly. There's no such concept as a better-than-perfect move to raise your expectation over what was already calculated, since that calculation already includes all your best possible moves.
* (Other gradations between win/lose/draw are possible in such a game. Chess doesn't have them, but imagine playing Go for a dollar per point, where nuances smaller than swinging a win or draw still matter.)
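To make that concrete, here is a toy exact solver, a sketch with python-chess that ignores the 50-move and repetition rules for brevity. It returns the true value for the side to move (1/0/-1 for win/draw/loss, the 0/½/1 above rescaled) whenever every line provably ends within a given ply budget, which is exactly why only pared-down positions are tractable:

    import chess

    def game_value(board, plies):
        # Exact value for the side to move: 1 win, 0 draw, -1 loss;
        # None when the value can't be proven within `plies`.
        if board.is_checkmate():
            return -1
        if board.is_stalemate() or board.is_insufficient_material():
            return 0
        if plies == 0:
            return None
        best, unknown = -1, False
        for move in board.legal_moves:
            board.push(move)
            v = game_value(board, plies - 1)
            board.pop()
            if v is None:
                unknown = True
            elif -v == 1:
                return 1  # one forced win is enough
            else:
                best = max(best, -v)
        return None if unknown else best

    # White to move and mate in one (e.g. Qh7#, supported by the king on g6):
    print(game_value(chess.Board("7k/Q7/6K1/8/8/8/8/8 w - - 0 1"), 2))  # 1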
> And it's impossible to raise your estimated score,
No, what I was saying is that it's absolutely possible to raise your estimated score, because your estimated score is only an estimate of who has the better position.
If AlphaZero is better than Stockfish, then by definition it will sometimes make moves that raise its estimated score, because Stockfish is only as good as its ability to estimate the score of a position. So Stockfish must occasionally underestimate a position, and then later (after another move or two) be forced to reevaluate (because while it's worse, it's not stupid).
AlphaZero wins because, and precisely because, it believes some positions are more favorable than Stockfish does. You can almost see it as an arbitrage between the two estimations. That's what I was finding cool, and the point of my post.
You're right, of course. What we're really talking about is the fallibility of estimations (and arbitraging between them) - you can't raise your score as projected by an omniscient computing power, but you can as estimated by real engines limited by their fallibility (and AlphaZero is less fallible.)
Mostly I'm pointing out that these estimations represent the best guess of an ultimately limited engine. People tend to treat those engine evaluations as actual numbers, like scores in a sport like baseball or some such, but they're not.
There is a way to objectively distinguish moves within the same category though (drawing ones or losing ones). It's similar to Kolmogorov complexity. Let's say moves A and B both draw, but the shortest algorithm that draws against A is much longer than the one that draws against B. We can say A is objectively the better move.
In practice, instead of the formal definition, we could use a benchmark engine: what are the minimum CPU time/RAM requirements for an engine to hold a draw in the resulting positions (or convert them to a win, in the case of losing moves)?
A move that requires serious hardware to defend against is better than one a 10-year-old laptop can hold a draw against.
In a world with perfect play this is true, but in the real world there could still be moves that are good assuming imperfect play by the opponent. That's where things get really interesting.
For example, I remember the original AlphaZero model, which had been trained specifically against Stockfish, would often make a material sacrifice for some advantage Stockfish couldn't see (e.g. give up a pawn so that the opponent's bishop gets locked out of the game). I don't know if these moves were objectively good given perfect play, but they could be the only way to win now (chess is very drawish at the top computer level).
> Two humans playing each other, it's just a question: whose score goes down less each time they make a move? It became a little sad. It seemed like either you make the right move or you make a sub-optimal move, and the winner is simply the one who makes fewer sub-optimal moves.
What you describe is a perfect chess-playing computer (approximately AlphaZero) observing two humans playing. What humans observe watching two humans play (including the participants) is very similar to what you described for AlphaZero vs Stockfish. The only difference is that we don't ask human players to ascribe a score to their opponent's moves (and wouldn't expect it to be accurate).
This depends on the level of the player observing the game compared to the players' level. It's pretty common to have a strong player at master level point out subtle positional or tactical mistakes that the opponent is not skilled enough to exploit, so the mistakes don't matter at that level.
This is also something to keep in mind so as not to get discouraged: just because every move is terrible to a 3500+ chess engine at some level, it doesn't mean those concerns apply to you at half that rating.
> But when AlphaZero plays, and you watch Stockfish's score, the reason it wins is that it makes moves Stockfish thinks are poor, so it rates AlphaZero's moves poorly, and then all of a sudden it has an oh shit! moment when it realises that AlphaZero is actually ahead, and its score jumps. It's really a look inside the computer's head while it's being beaten by a better player.
That's more or less a description of what happens when two humans are playing over the board.
Right. Think about all the times you would answer “no” to “would you like to switch sides with your opponent right now?”, even as you are about to lose a few moves later.
> Two humans playing each other, it's just a question: whose score goes down less each time they make a move? It became a little sad. It seemed like either you make the right move or you make a sub-optimal move, and the winner is simply the one who makes fewer sub-optimal moves.
It’s also evaluating your own position vs someone else’s - humans and computers are the same in that both will make the move they think is best, and will only have an oh shit moment when their opponent has provided a reply they didn’t expect.
The only difference is computers can see further, so while an oh shit moment for a human might be 4 moves out, with a computer it might be 20.
The "technical" term for what you describe is blundering. The winner between two humans playing each other is mostly decided by who blunders first. Some blunders are so obvious that we have a take-back rule among friends for obvious giveaways. Mind you, it's not "can I take this move back", it's "you want to make another move, this one is too stupid".
What they are describing is that feeling you get when you suddenly realize you are losing even though material is even, because you suddenly see your opponent has superior positioning and board control.
It's not a blunder, because there wasn't one particular move where the game slipped away.
The "oh shit" move doesn't necessarily make your position suddenly way worse; it can just be the moment when you realize how much better your opponent's position has been for a while.
Hence why I talked about the feeling you get when you realise you have misevaluated, rather than trying to define it precisely.
From the OP:
> But when AlphaZero plays, and you watch Stockfish's score, the reason it wins is that it makes moves Stockfish thinks are poor, so it rates AlphaZero's moves poorly, and then all of a sudden it has an oh shit! moment when it realises that AlphaZero is actually ahead, and its score jumps. It's really a look inside the computer's head while it's being beaten by a better player.
If you watch from here in game one you can see the "oh shit" moment, as Stockfish's evaluation drops from +1 to even to -1: https://youtu.be/Q5EPqM8gS7k?t=255
There were blunders in the first round of games. I remember this vividly because I went through all the games with the exact same version of Stockfish with the same time controls and it thought some of its own moves were blunders. This is partly why people were so puzzled by the setup that led to this.
A blunder is when someone makes a wrong move then and there that costs them the game. At AI levels, there are very few, if any, blunders. I think the parent commenter means a strategic mistake, one made about 5 moves in advance.
True. The technical definition of a blunder on lichess and chess.com is a move that puts you in a losing position. You can blunder further while already in the losing position. And your opponent can blunder, putting them in the losing position, only for it to be reversed by your next blunder.
A mistake is like a blunder except you're still winning afterwards, though the same move could be a blunder in a worse position. An inaccuracy is a bad move that doesn't cost you much.
At least on lichess, the blunder/mistake/inaccuracy distinction is based only on how much the evaluation moves. Going from +10 to +6 is still a blunder.
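That swing-based labelling is easy to sketch with python-chess; the engine path, search depth, and centipawn thresholds here are illustrative guesses, not lichess's actual cutoffs:

    import chess
    import chess.engine
    import chess.pgn

    # Label each move by how far the evaluation swings against the mover.
    THRESHOLDS = [(300, "blunder"), (100, "mistake"), (50, "inaccuracy")]

    def annotate(game, engine, depth=16):
        board = game.board()
        prev_cp = 0  # evaluation in centipawns, from White's point of view
        for move in game.mainline_moves():
            mover, num = board.turn, board.fullmove_number
            board.push(move)
            info = engine.analyse(board, chess.engine.Limit(depth=depth))
            cp = info["score"].white().score(mate_score=100000)
            loss = (prev_cp - cp) if mover == chess.WHITE else (cp - prev_cp)
            label = next((name for cut, name in THRESHOLDS if loss >= cut), "ok")
            print(num, move.uci(), label)
            prev_cp = cp

    engine = chess.engine.SimpleEngine.popen_uci("./stockfish")  # path is an assumption
    with open("game.pgn") as f:
        annotate(chess.pgn.read_game(f), engine)
    engine.quit()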
At the highest level, chess becomes an endurance sport. Magnus is so great because he can squeeze water from a stone. That is definitely admirable, but somewhat deflating compared to idea of the one brilliant move or insane positional play. Give a man bionic legs and it is no longer about how fast he can run but rather what new places he can go.
I agree, only because the ideas and brilliance/depth that they demonstrate is so breathtaking to behold. I love the chess.com Game of the Day playlist on YouTube, which occasionally has amazing computer chess games with early sacrifices for long-term positional compensation: https://www.youtube.com/watch?v=A-vNq61KfLs&list=PL-qLOQ-OEl....
It's kind of more interesting because you can find out more about what's happening 'under the hood'. When two people play it's just 'clever brain vs clever brain' with no engineering details. When two AIs play, you can say one is this many racks of servers and teraflops while the other has a model that trained itself from scratch for a few hours or whatever, and so when one of them starts to trample the AI opposition we actually learn something other than 'this person has a brain that's good at chess'.
I like the AI-Augmented humans. Kasparov has said that the AI advances are exciting for training. I heard Magnus Carlsen would observe the AI more than humans and infer how to play the game from them, rather than use more traditional methods. The Play Magnus Group invests a lot in AI-backed training tools.
For those of you with a little CPU time to spare, the fishtest project should be mentioned here. Stockfish has a "CI" which self-plays all PRs before accepting them. The client is easy to install. Go get it!
I see that, like ed(1), stockfish is generous enough to flag errors, yet prudent enough not to overwhelm the novice with verbosity.
    $ ./stockfish_14_x64_bmi2
    Stockfish 14 by the Stockfish developers (see AUTHORS file)
    help
    Unknown command: help
    ?
    Unknown command: ?
    eat flaming death
    Unknown command: eat flaming death
Notice the diagram titled "NNUE derived piece values" underneath the table for "Contributing terms for the classical eval".
Edit: for anyone interested in the hand-tuned evaluator, it might be worth checking out the Stockfish evaluation guide [1]. (The board in the upper-right corner is interactive!)
The command-line interface is designed for interoperability with chess GUIs according to an informal standard called UCI. UCI is simple enough that you can talk to the engine manually if you really want to, but it is much (much) easier to use a chess GUI, for example Tarrasch https://triplehappy.com (shameless plug for my chess GUI).
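If you do want to drive it by hand programmatically, UCI is just newline-delimited text over stdin/stdout. A minimal sketch (the binary name is whatever your Stockfish build is called):

    import subprocess

    proc = subprocess.Popen(["./stockfish"], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)

    def send(cmd):
        proc.stdin.write(cmd + "\n")
        proc.stdin.flush()

    send("uci")                           # handshake: engine lists its options, then "uciok"
    send("position startpos moves e2e4")  # set up the position after 1. e4
    send("go depth 15")                   # search to a fixed depth
    for line in proc.stdout:
        if line.startswith("bestmove"):   # e.g. "bestmove c7c5 ponder ..."
            print(line.strip())
            break
    send("quit")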
I suppose you can interact with stockfish by hand, but it is designed to plug into a chess board app like xboard or others: https://www.gnu.org/software/xboard/
Thanks :-] I was trying to figure out why the stockfish Debian package suggests xboard when they don't work together directly. It's because stockfish suggests xboard and polyglot, and polyglot is required to use stockfish with xboard.
What would be the added value for an average and aspiring chess player like me in playing against Stockfish 14 instead of one of the previous versions? I feel I'll never challenge the maximum capacity of such an engine.
> What would be the added value for an average and aspiring chess player like me in playing against Stockfish 14 instead of one of the previous versions? I feel I'll never challenge the maximum capacity of such an engine.
I do not see chess engines as good sparring partners, but rather as tools for effective chess exploration.
The nice thing about these newer engines is that they are starting to be useful tools to explore more sacrificial and unbalanced positions which are really fun to get over the board against humans.
Finding these ideas with the help of a computer gives a competitive edge against other humans, as well as helping discover interesting corners in this vast game.
Play against? Probably very little. But certainly it'll give you sharper analyses of your other games, and slightly better lines to memorise as part of your preparation. This probably isn't where most people should be spending their time learning though, obv.
My impression from playing against lower levels of the default stockfish on Lichess is that it's really bad at playing badly, so not much fun to play against.
It seems like it plays at full power, interspersed with random bad moves.
Absolutely. You might want to take a look at Maia Chess, an engine which is supposed to mimic low/mid-level human play, rather than just playing perfectly and with a 1/10 chance of a terrible move, like Stockfish at low levels does.
That’s amazing. NNUE has been a boon for Stockfish, as it enables it to find complicated ideas even at low depths, which are often more interesting than simple tactical perfection. Really excited to see Stockfish 14.
So Stockfish has long been of superhuman strength, but there are two separate reasons for it:
1. Tactics: in the middlegame there are a lot of possible variations, and humans are bad at looking at all the possibilities, so they blunder into some tactic (e.g. a move-order combination of under 10 moves that leads to a mating attack or a significant material advantage). This alone is close to enough to be superhuman.
2. Long-term advantage building: a move order that has no clear tactical payoff but puts the player in a better position many moves into the future. This is something old-fashioned chess engines used to need very high depth for, but with NNUE Stockfish started selecting these kinds of move orders even at low depths.
Variations of type 2 are more interesting to people because this is something we can try to learn from and infer general rules (e.g. pushing flank pawns without immediately converting the attack became more common after AlphaZero used it in many games to gain a long-term advantage).
At the same time, most people analyze their games in a browser running on their laptop/mobile device (since most games happen on chess.com or lichess.org), so really they only get low-depth Stockfish variations.
Without strong ability in (2), it can be hard to interpret SF-generated variations: the engine is still superhuman (e.g. it will beat a human from a given position), but it sometimes makes moves that are probably suboptimal (because there are no immediate tactics available), and you can't really tell whether a variation exists because SF doesn't know what to do or because there is some hidden tactic. The normal "solution", if you suspect the position is pivotal, is to just calculate to a very high depth. But that takes time, and it's not something an average player will do.
I think the most anticlimactic, but most effective, way to make such an AI would be to run an engine like Stockfish at maximum strength to steer the game into an endgame, and then use the endgame tablebases to convert the game into a forced draw.
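The tablebase half of that is an off-the-shelf building block already; for example, python-chess can probe local Syzygy files (the "./syzygy" directory is a placeholder for wherever the downloaded .rtbw/.rtbz files live) for the exact game-theoretic result once few enough pieces remain:

    import chess
    import chess.syzygy

    # A textbook KPvK win: the white king stands in front of its pawn.
    board = chess.Board("4k3/8/4K3/4P3/8/8/8/8 w - - 0 1")

    with chess.syzygy.open_tablebase("./syzygy") as tb:
        print(tb.probe_wdl(board))  # 2 win, 0 draw, -2 loss, for the side to move
        print(tb.probe_dtz(board))  # distance to a zeroing move under the 50-move rule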
Just train it to win. Then, any time it is a bit ahead, make suboptimal moves until things are even again (those engines typically have a score saying whether they are ahead or not).
Not sure that would work in an endgame though. If you are behind and make a good move, you accidentally win. If you are technically ahead and make a bad move, you probably lose.
If you can reliably get to an endgame with only a king and one or maybe two pieces left on each side, then maybe that part can just be hand-scripted and bolted onto what you're proposing, but I'm not sure your proposal would consistently get it there.
Chess.com is brilliant, and I use it as well as the less popular (but also brilliant) Lichess.
Lichess is "cooler" because it's non-profit but honestly, comments like that are reminiscent of the childish anti-Microsoft barbs from Linux ideologues that have thankfully declined in recent years.
They pay streamers to make it look big, and then their viewers have to use chess.com if they want to play them.
I have fallen in love with chess again over the past few months, and lichess is an absolutely amazing experience. Their mobile clients (iOS/Android) are both fantastic and BS free.
If anyone is looking for a way to merge games from lichess.org and chess.com: I'm developing a website where you can link your accounts to view stats for all of your games. It's free and currently in Beta: https://www.chessmonitor.com/
I've been wishing someone would make something like this! It always frustrated me that tools like aimchess and openingtree only let you look at the two sources separately; supposedly, at least in aimchess's case, the reason is that the rating systems are different.
How do you unify ratings?
Edit: ah I see you don't, makes sense I guess. Might be interesting to have them both on the same graph even if the y axis is different..
As you say, I currently don't. In the future I might check the correlation between chess.com/lichess (when enough users register). Then I should be able to even calculate the chess.com rating from a lichess rating and vice versa.
But there are many other features I want to implement first. I'm currently more focused on the statistics part than on the chess.com/lichess relation.
Yeah definitely should be the focus! Would be excited to see some of the stats aimchess has implemented.. performance in various phases of the game.. tactics.. advantage capitalization etc.
I also found that when clicking through to an opening on your site, it would always say "no games find at this position". Bug?
Thanks for the feedback. It's not really a bug. More of a communication problem... ;)
The openings page list your openings for white and black. If you click on an opening it takes you to the explorer which shows the stats for only one color (white by default). Therefore, if you play an opening for black a lot it will appear in the list of your openings. But when you click on the opening, the explorer will show your stats for white.
There is also another problem, that the detection of openings (on the openings page) respects transpositions [1] while the explorer does not.
Maybe I'll remove the link to the explorer as this seems to cause a lot of confusion...
Incorrect. Chess.com has a superior lessons system (it isn't just text and a predefined board) and a superior computer to play against. On Lichess you have to use one of 7 levels; on chess.com you have a sliding scale from 250 to 3200, and they have a lot more tools to analyze the game, both during play (when playing bots) and after. I use both, but I swear the comments I see dissing chess.com seem to be from people who are hardly familiar with it.
Seriously, I am a paying customer of chess.com, and for the life of me I could not get a new user signed up to play chess with. We wanted to play and selected the wrong level by mistake (beginner), and it won't allow you to play even if you are friends on the site. Horrible user experience. You want to play variants on your phone? You can't; go to the website. Lichess is just awesome! Play anyone you want, variants in the app. Love it.
I basically only play Lichess now (>3k games over the last few years), but still use chess.com once in a while to play with friends who have an account there. I've never had the problem you describe; sounds like PEBCAK. The constant chess.com hating here and on reddit is kinda off-putting, though.
It is. I also use both platforms, and you can't even bring up preferring chess.com here or on Reddit's chess subs without being downvoted to oblivion and dogpiled. It's super cult-like, similar to the way Musk fanboys come out in droves if you even suggest he isn't some great man of history.
I’m not the parent commenter, but I use Lichess. It’s free, supported by donations (so their incentives align with mine more than a business’), open source (which I like ideologically), and in my opinion just has a nicer user interface.
I started playing chess online less than a month ago. I started with chess.com, because I knew about it beforehand for some reason. After about a week there I couldn't take their constant nagging about getting a premium account: “Get a free 7-day trial”, “Get a premium account to unlock more analysis”, “You've reached your maximum puzzles for the day, get a premium subscription to unlock more”, etc.
I did some research (mostly on the online-go.com forums) and joined lichess.org and haven’t looked back since. Superior in every way.
Perhaps I should thank chess.com for being so annoying, if it weren’t for their constant nagging I would probably have stayed on the platform and never discovered lichess.org.
Humans can gain hundreds or thousands of points per year at the start, but then tend to plateau AFAIK; it's actually easy to check this on people's public lichess accounts.
Nitpick: certainly not thousands. When you're first starting out, even if your elo is around 400, you'll be lucky to break 1000 after a year unless you study very diligently. The gap from 1000 to 2000 then takes multiple years (probably 4-5 of diligent study).
I think reaching 1400 elo after a year of decent study after learning the rules is not too unreasonable. To reach 1400 in a year you would need to be either talented or hard-working. Climbing the elo ladder does get much harder though, as you mentioned. To reach 2000 in two years you would need to be both hard-working and talented. To reach 2850 in a lifetime you have to be hard-working, talented, and a particular Norwegian GM.
It's generally thought that beginners improve most quickly by playing slower chess (e.g. 15+10 or even 30+20). If you're a 1000 and you only play bullet, you may end up cementing some of your bad habits. Whereas with slower time controls, one has time to calculate some deeper lines and really try to understand the position. And if one spends a year getting to 1400 in rapid, one will find that the gains from slower time controls quickly bear fruit in the faster time controls (once you learn time management).
Generally speaking, players playing bullet are more interested in having fun and less interested in gaining elo (which of course is totally fine).
As with all things, it depends on how intense you are. I've seen a guy go from 900 to 2k in 6 months, but he played over 1k hours during that time (yeah). It really depends on how much you are willing to play!
Edit: Just to be clear, I'm talking about online rating
I went from beginner to about 1800 on lichess in about a year of casual play and watching some grandmasters on YouTube. I am 30 years old; I think younger kids can do better, and we know many prodigies do.
I guess this is different from official rankings and/or elo though, so that was my mistake.
Top players plateau in strength because human memory and processing power is limited, so they are limited in how much additional knowledge can improve their actual playing strength. For Stockfish, OTOH, algorithmic improvements (especially NNs) have steadily increased its strength.
Stockfish would beat AlphaZero handily. Even the original paper's test conditions were not fair, so Stockfish may have been much better than the results of the original matches made it seem.
AlphaZero would not be competitive in its original implementation. Its neural net is too small (this is why it only took 4 hours to train: it stopped learning after 4 hours). Stockfish 14 just doubled its network size from Stockfish 13.
Stockfish 8 actually won games against AlphaZero. But Stockfish 11 (still a classical-evaluation engine with no neural net support) totally decimates Stockfish 8, and Stockfish 13 (which uses a neural net) totally decimates Stockfish 11. Stockfish 14 just got another 30 elo points stronger than 13.
As it stands now, Stockfish 14 is the strongest chess entity humanity has ever seen.
I think this is a good idea, but in no way should it be the only restraint. Others could be time restraints (total time per game and/or per move), depth restraints (if they both work off of BFS or similar), and probably many other restraints that those more familiar with the engines can come up with.
Most chess games are already time-constrained (though a ref can call it if someone in a non-winning position is trying to run out the opponent's clock).
[edit]
I was imperfectly remembering the rules. If it is theoretically impossible (even with blunders on the player who is out of time's part) for the player with time left to win, then it is a draw.
In addition, a player with less than 2 minutes on the clock may request a draw; see Article 10.2 which includes this subsection:
> a. If the arbiter agrees the opponent is making no effort to win the game by normal means, or that it is not possible to win by normal means, then he shall declare the game drawn. Otherwise he shall postpone his decision or reject the claim.
I'm not super into chess, but I was under the impression that running out the clock was a totally legitimate tactic. I'm surprised to hear that a referee has discretion to end the game based on it.
The issue is when positions are completely equal and there is no reasonable way to progress. It might still be technically possible to win in such positions, but it would require someone to make extremely bad moves and almost certainly lose. If there is no rule that forces draws in such positions, then players will just keep moving pieces without purpose until either someone's time runs out or 50 moves without capture / threefold repetition happens.
An arbiter 100% cannot stop a game because of that. Time management is part of the game in speed chess anyway, and for longer time controls there is usually a delay/increment so running out the clock in a clearly lost position isn't viable.
I slightly misremembered; the player stops the clock and calls the arbiter; if the arbiter agrees that their opponent is not attempting to win by normal means, the arbiter may award a draw. There is a 2 minute bonus to the opponent if the arbiter disagrees with the player making this claim.
I am not a chess player but I have never heard of referees stopping the game if you don't "give up" in a losing position and have extra time. That sounds ridiculous but I would love to know if it applies in certain tournaments and the reasoning behind it.
You are allowed to keep thinking as long as you have time on your clock. Isn't really considered good sportsmanship but is legal.
A recent instance I saw: an adult was in an almost-lost position with over an hour on the clock while his opponent had 10 minutes. He let his clock run down to nothing, then played quickly, and finally let the clock run to zero in a lost (mate in 2) position. He got mocked for this in the local forums.
It's also common for a kid to blunder and then sit there sad/crying for an hour. You try to encourage them to resign, though.
All fun ideas, but power draw and/or hardware cost limits seem essential for a fair game whereas depth limits and no castling are rather just interesting experiments.
Also, time limit doesn't seem that different from a power draw limit if you're a computer.
Given that AlphaZero will presumably never be publicly available, I think you might be interested in TCEC which has fair fights between Stockfish and LeelaChessZero (which Stockfish has won recently).
> I think you might be interested in TCEC which has fair fights between Stockfish and LeelaChessZero
This is pretty questionable in my judgment, actually. TCEC's GPU hardware is 4x Nvidia V100 data center class GPUs, with a pretty powerful processor to boot. A quick search suggests that ONE of these will run you close to $10k, so we're talking about an all-in system worth mid five figures.
Meanwhile, the CPU hardware is pretty dated at this point. They have 4x Intel E5-4669V4, which is from early 2016. It's not easy to find this processor for sale any more (because, again, it's old), but prices seem to run in the $750 - $1500 range if you look on places like Ebay. Meanwhile even on Ebay a V100 is likely to run you $7K+.
I don't know that it's possible to compare "performance" between GPUs and CPUs in a one to one way, but looking at cost, it seems pretty clear that you'd have to spend a lot more to get a system that allows Leela to play at the kind of level you see on TCEC.
Looking at power consumption tells a similar story. Nvidia's data sheet for the V100 shows a maximum power consumption of 250 watts per GPU, so 1000W when running at maximum load (as a chess engine is presumably likely to do). Meanwhile, Intel places the TDP of the E5-4669v4 CPU at 135 watts. Even assuming they're undershooting that by a bit, we're probably talking 600 watts for that system ... on a rather old CPU model.
I'd say it's not a fair comparison. I'm not mad about it, because at the end of the day computer chess tournaments are for entertainment. It's much better if the best neural net programs are competitive with more traditional chess engines, even if by "objective" standards they are weaker.
TCEC's goal is to keep the ratios similar to the AlphaZero paper, not price or power or any other benchmark. Increasing the hardware of one would require an increase in the hardware of the other. But the hardware is donated, so it's hard to be too critical.
But why are we comparing used 2021 prices when this hardware wasn't purchased in today's market? Especially when GPUs are going for 1.5-2x MSRP right now; even very old GPU prices are insane. I recently sold a 980 Ti for near what I paid for it new. The 4669v4's MSRP was $7k, so they are not far off. The V100 is pretty dated too, as it is from 2017, and doesn't have FP16, which is heavily used by Leela. For this and several other reasons, a single 3090 is actually faster than 4x V100s according to their own benchmarks[1]. A single V100 is approximately equal to a 3080 or 2080 Ti in performance.
Maybe you should also checkout the CCCC[2] hardware which is even stronger for both: 2x A100 vs 2x AMD EPYC 7H12
I'm inclined to agree. I am more impressed by an engine that plays well on limited hardware than one that plays well on faster/more expensive hardware.
This is one of the reasons Core War was so intriguing; all the programs battling it out were running on the same hardware, each given an even slice of compute time. To win, you must then find ways to do the same amount of work in less time, while keeping your footprint small.
When the day comes (and I think it will, if our civilization lasts long enough) that a computer finally "solves" chess, it will be a momentous achievement, but ultimately boring.
I think TCEC is as close to a fair fight as you're going to get. Obviously, it can't be perfect. A few caveats to your comments (which I mostly agree with): one is that it's easier to add GPUs to a high-end machine than CPUs; I'm surprised a 4-socket motherboard (evidently) exists. Secondly, at the consumer level, you can get a consumer GPU with processing power similar to the datacenter GPUs' for much cheaper.
No, NNUE doesn't necessarily need tablebases, and Leela uses tablebases too. The difference is that their network architectures are different and their search algorithms are different.
AlphaZero and co. only trained via self-play because that was their research goal, and they looked unbeatable because they were the first to get this kind of neural net working, but it doesn't seem like it's the best option.
AZ notably got completely tilted once it started losing, doesn't necessarily recognize strange positions you can't normally get into, and doesn't care about its win margin at all.
It makes sense. They're both open source programs working towards better understanding of chess. There are actually a few people who develop for both. They are obviously very different types of engines, but they are much closer to friendly rivals than enemies.
But... they're using a pretty big neural net themselves (NNUE) as far as I can tell? With datasets of hundreds of gigs.
Doesn't this remove the significance of the matchup, which was supposed to be about deep learning vs more traditional chess-engine methods?
Claiming that something that uses a neural net trained on hundreds of gigs of data isn't deep learning... I mean, it's possible, I don't know the details.
What is it about now, open vs closed source? Different methods of deep learning and big data fighting? (Both of these are also interesting, of course.)
NNUE is much smaller than Leela's net, and has a much different architecture that's optimized more for CPU. Additionally, Leela uses Monte Carlo Tree Search and Stockfish uses Alpha/Beta pruning.
> Though definitely not directly comparable, dataset of GPT2-xl is 8 million web-pages.
This is irrelevant. You can train GPT3 on a smaller dataset, or a smaller model on the same dataset as GPT3.
> What I mean to say is that this is clearly deep learning.
It's been clear that neural network models are superior since AlphaGo. There's no "deep learning vs <something else>" anymore, because the <something else> isn't competitive and no one is really working on it.
It's actually really small, mostly because bigger networks take longer to evaluate, which slows down the search, making it shallower and resulting in a less clever algorithm.
NNUE is a 4 layer (1 input + 3 dense) integer only neural network.
It's just over 82,000 parameters.[1]
That's a very shallow, small NN; by comparison, something like EfficientNet-B1 [2] is 7.8M parameters, and that's considered a small network.
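The "efficiently updatable" part is the clever bit: the first layer's output is just a sum of weight columns for the currently active board features, so making a move only adds and subtracts a few columns instead of recomputing the layer. A toy numpy sketch of the idea (sizes are invented, and the real thing is integer-quantized):

    import numpy as np

    N_FEATURES, ACC = 1000, 256  # made-up sizes for illustration
    W1 = np.random.randn(N_FEATURES, ACC).astype(np.float32)

    def full_accumulator(active):
        # First-layer output: sum of the weight columns of all active features.
        return W1[sorted(active)].sum(axis=0)

    def incremental_update(acc, removed, added):
        # A move flips only a few features, so update instead of recomputing.
        return acc - W1[sorted(removed)].sum(axis=0) + W1[sorted(added)].sum(axis=0)

    acc = full_accumulator({3, 42, 500})
    acc = incremental_update(acc, {42}, {43})  # a "move" swaps feature 42 for 43
    assert np.allclose(acc, full_accumulator({3, 43, 500}), atol=1e-4)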