> This would be akin to a robot being given access to thousands of metal bits and parts, but no knowledge of a combustion engine, then it experiments numerous times with every combination possible until it builds a Ferrari.
No, not even close. First, AlphaZero started with the rules of chess, chess pieces, and a chess board. Second, the possible moves are several orders of magnitude fewer than the steps needed to build a working car out of parts.
Closer would be: Here's a car. Here are all the tuneable parameters. Make it as fast or as efficient as possible. But that would still be inordinately more complex than grokking chess.
I'm surprised how slow the press has been to pick this up. This seems like an amazing step forward to me. AlphaZero played only against itself as training and it beat one of the best chess AIs in the world that has been finely tuned with decades worth of human knowledge.
Now that AI has effectively mastered Go and chess... what's next? Are there any other interesting perfect-information games remaining? What's the next milestone for imperfect-information games?
I have to say it’s a completely inappropriate comparison to make since Stockfish was forbidden from using its opening book. I would like to see the results when both are at their best.
Looks like a residual network (ResNet) feeding a Monte Carlo tree search (MCTS) solves the strategy optimization problem.
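Roughly, the network's policy and value heads steer the tree search instead of a handcrafted evaluation. A toy sketch of the two pieces (PyTorch, with made-up sizes and constants; an illustration of the idea, not DeepMind's code):

```python
# Illustrative only: a residual block, plus the PUCT selection rule
# that couples the network's policy/value outputs to MCTS. Channel
# count and c_puct are invented for the sketch.
import math
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # the skip connection makes it "residual"

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    # Tree search picks the child maximizing this: exploit the current
    # value estimate q, but explore moves the policy head rates highly
    # and that have rarely been visited.
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
```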
A critique is that the game model (rules, pieces, movements, legal moves) is still bespoke and painstakingly created by a human being. One next step would be for an algorithm that develops the game model as well as the strategy and the I/O translation. E.g., use Atari 2600 Video Chess frame grabs as the input and the Atari controller as the output. After experimentation the algorithm creates everything: the game model (chess, checkers, shogi, go), the strategy for the game, and the I/O processing needed to effect the strategy with the available inputs and outputs.
Some in the chess community seem to still be in the denial phase.
The Go/Baduk community went through something similar starting early last year:
- Jan 2016 (AlphaGo beats Fan Hui): Fan Hui was only a 2p and European champion; he's nowhere near the top
- Mar 2016 (AlphaGo beats Lee Sedol): AlphaGo still lost 1 game. The #1 ranked player can probably still beat it
- Jan 2017 (AlphaGo Master beats 60 pros): Ok, AlphaGo is strong, but those are only online games with short time control.
- May 2017 (AlphaGo Master beats #1 ranking, Ke Jie): ...
- Oct 2017 (AlphaGo Zero beats AlphaGo Master): Ok, nothing we have right now can probably beat it.
Can any computer chess/stockfish experts comment on the choice of 1 GB for hash size? I have no chess or computer chess domain expertise whatsoever, but it strikes me as a suspiciously low setting for a memory-related parameter on what was probably at least a 32 core machine. It makes me wonder if they simply dialed down the hash size until they got a nice convincing win.
Update: took a look at the settings used for TCEC. Looks like they used 16 GB in season 7, 64 GB in season 8, 32 GB in season 9, and 16 GB in season 10. Two observations: (1) interesting that they've decreased hash sizes in recent years; (2) definitely seems like 1 GB is not reflective of how an engine would be configured for TCEC.
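For anyone who hasn't touched engine settings: hash is an ordinary UCI option, so a TCEC-style configuration is a one-liner. A minimal sketch with python-chess, assuming a Stockfish binary on PATH (the numbers are illustrative):

```python
# "Hash" is the standard UCI option, in MB; 16384 mirrors a TCEC-style
# setting rather than the 1 GB used in the match.
import chess
import chess.engine

engine = chess.engine.SimpleEngine.popen_uci("stockfish")
engine.configure({"Hash": 16384, "Threads": 32})
result = engine.play(chess.Board(), chess.engine.Limit(time=1.0))
print(result.move)
engine.quit()
```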
Question for the AGI believers out there:
We have this state-of-the-art AI which can turn the screws and home in on some underlying reality about “how to win at chess”, a formal game. Great.
How does this then extend into the social domain, where AGI would be operating? Like, how does AlphaZero optimize for “how to slow Climate Change”?
I can’t even fathom how it would even understand climate change without an army of scientists publishing new work for it to consume. And then on top of that it will need to understand how its adversary, Putin, will try to optimize for the opposite: “ensure global warming to open up our shipping routes and arable land”.
It just seems like a non-starter to me. Saying you could win at chess, so winning at geopolitics is just a scaling problem to me is like saying I can drink a bowl of miso soup so drinking the ocean is just a scaling problem.
It would seem to me that intelligence at the highest levels isn’t constrained by foreknowledge, it’s constrained by the consequences of past decisions made during the (inevitably ongoing) interactive learning phase.
AlphaZero and Stockfish did not run on the same hardware.
So it's not clear whether the algorithm is better, or it was just run on faster hardware.
You will see here how Stockfish improvements are tested.
Some code or weight is changed, then they have it play to see if it leads to better performance.
Exhaustive, largely manual work. DeepMind's bot, on the other hand, is fully automated: they just have it running day and night, improving itself on a large hardware configuration.
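A toy version of that manual loop, for the curious (python-chess; the engine paths are placeholders, and the real Stockfish testing framework adds SPRT stopping rules and vastly more games):

```python
# Pit a patched build against the baseline and score it.
import chess
import chess.engine

def head_to_head(base_path, patch_path, games=10, movetime=0.1):
    """Return the patched build's score (win=1, draw=0.5) out of `games`."""
    score = 0.0
    for g in range(games):
        patched_is_white = (g % 2 == 0)  # alternate colors each game
        white, black = ((patch_path, base_path) if patched_is_white
                        else (base_path, patch_path))
        engines = {chess.WHITE: chess.engine.SimpleEngine.popen_uci(white),
                   chess.BLACK: chess.engine.SimpleEngine.popen_uci(black)}
        board = chess.Board()
        while not board.is_game_over():
            move = engines[board.turn].play(
                board, chess.engine.Limit(time=movetime)).move
            board.push(move)
        winner = board.outcome().winner  # None on a draw
        if winner is None:
            score += 0.5
        elif (winner == chess.WHITE) == patched_is_white:
            score += 1.0
        for e in engines.values():
            e.quit()
    return score
```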
Yesterday's discussion here: item?id=15858197
Amazing. I assume AlphaZero knew the basic moves of the pieces, but had to figure out defensive moves etc. against the other computer 'on the fly'? Are those learning games included in the statistics (which include ZERO losses)?? If so, it is a remarkable learning engine.
Shades of WOPR. "This is a game nobody can win..."
> Nielsen is eager to see what other disciplines will be refined or mastered by this type of learning
> of course it goes so much further
> The ramifications for such an inventive way of learning are of course not limited to games.
> But obviously the implications are wonderful far beyond chess and other games. The ability of a machine to replicate and surpass centuries of human knowledge in complex closed systems is a world-changing tool
Okay then. Let's go beyond games already!
While interesting, it's comparing 5000+ Tensor Processing Units against 64 CPU threads. I suspect this isn't a fair comparison by watts spent.
> This would be akin to a robot being given access to thousands of metal bits and parts, but no knowledge of a combustion engine, then it experiments numerous times with every combination possible until it builds a Ferrari. That's all in less time than it takes to watch the "Lord of the Rings" trilogy. The program had four hours to play itself many, many times, thereby becoming its own teacher.
This absurd comparison would raise my eyebrows coming from an English tabloid.
Having said that, I looked at the author's profile and was appalled to learn he's a chess prodigy. Then I also saw he's a chess journalist. Apparently he has become much more of a journalist than a chess master...
What I'm reading: an existing chess engine that runs on much poorer hardware, and for some weird reason was deprived of its usual initialization data, achieved a 73% draw rate against a ridiculously hyped "deep" neural network/MCTS algorithm.
It's interesting that AlphaZero was finally applied to a different game, though. I wonder what architectural changes they had to make. I've read that pure MCTS isn't that good at playing Chess. How true is that?
I'd be curious how its strength relative to Stockfish might change as the amount of time per move is varied.
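Easy enough to probe on the Stockfish side, at least. A minimal sketch with python-chess, assuming a local Stockfish binary, that shows how search depth grows with the per-move budget:

```python
# Vary the per-move time budget and watch reported search depth.
import chess
import chess.engine

engine = chess.engine.SimpleEngine.popen_uci("stockfish")
board = chess.Board()
for t in (0.1, 1.0, 10.0):
    info = engine.analyse(board, chess.engine.Limit(time=t))
    print(f"{t:5.1f}s -> depth {info['depth']}, score {info['score']}")
engine.quit()
```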
Personally, I've always been interested in whether it's now possible to use these techniques (NNs, DL, etc.) to infer theorems on their own. If the gap between such a system and a human were as big as what we see in these expert systems (trained to do only one thing), it could provide amazing results.
Did DeepMind publish anything about this? Is this literally a straightforward plug of the AlphaGo Zero techniques into a chessboard with no novelty? Don't get me wrong, I'm impressed, I'm just looking for a more primary source.
This is of course all very, very impressive, but it would be great to see more details. We are told AZ started with only the basic rules. What was included in the "basic rules"? How were they codified? The engine looks at 80,000 positions per second, so obviously it has some evaluation function. What is the position evaluation function? Presumably it was codified in some way in the beginning, and then got improved by the training period? It would be very interesting to see the first 100 or so games the engine played against itself.
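For contrast, this is what "evaluation function" traditionally means in a classical engine: a handcrafted score. A toy version (plain material counting with conventional centipawn values; AlphaZero has nothing like this built in, since its evaluation is the learned value head):

```python
# Toy handcrafted evaluation: pure material counting in centipawns.
import chess

PIECE_VALUES = {chess.PAWN: 100, chess.KNIGHT: 320, chess.BISHOP: 330,
                chess.ROOK: 500, chess.QUEEN: 900}

def material_eval(board: chess.Board) -> int:
    """Positive means White is ahead, in centipawns."""
    score = 0
    for piece_type, value in PIECE_VALUES.items():
        score += value * len(board.pieces(piece_type, chess.WHITE))
        score -= value * len(board.pieces(piece_type, chess.BLACK))
    return score

print(material_eval(chess.Board()))  # 0 in the starting position
```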
I think the real measure of an AI's success in this field is the absence of pathological boards like this one.
Is it still easy to find positions that AlphaZero totally misunderstands?
Does anyone know how well a system like AlphaZero can be applied to a field like materials science? It would seem that you could make a scoring function based on how well the material meets the desired criteria.
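Something like this, maybe: score candidates by how close their predicted properties come to the targets. A hand-wavy sketch where every property name, target, and weight is invented for illustration:

```python
# Rank candidate materials by closeness to target criteria.
def material_score(props, targets, weights):
    """Higher is better; penalize squared deviation from each target."""
    return -sum(weights[k] * (props[k] - targets[k]) ** 2 for k in targets)

candidate = {"band_gap_eV": 1.4, "density_g_cm3": 2.3}
targets   = {"band_gap_eV": 1.5, "density_g_cm3": 2.0}
weights   = {"band_gap_eV": 10.0, "density_g_cm3": 1.0}
print(material_score(candidate, targets, weights))  # about -0.19
```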
If AZ is truly superior to Stockfish, why was Stockfish given an amount of RAM that could only be considered standard for the 1990s?
Completely dishonest title and first 1,000 words of the write-up. That's when you get to the words:
> pointed out that Stockfish's methodology requires it to have an openings book for optimal performance.
I went from amazed, utter shock like "What!! No way. This is unreal. This is absolutely unreal. What? What? What?" to a total feeling that I've been reading 1,000 words of fake news.
I feel cheated by this write-up and flagged it for this reason. They need to mention it in the first couple of sentences, not after selling the idea that it deduced 1,600 years of human chess knowledge in 4 hours.
It's interesting to consider what it means when the AI can succeed without using brute force.
Suppose at every turn there are n possible future states of the game based on the rules. To avoid "brute force" the AI must be able to ignore many of those states as irrelevant. In effect, the AI is learning what to pay attention to, not just considering what might happen, thereby conserving computational resources.
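Back-of-the-envelope, this is a huge win. With branching factor b and depth d the full tree has about b^d leaves; a policy that keeps only k candidate moves per node cuts that to about k^d (b=35 is the usual chess estimate, k=3 is purely illustrative):

```python
# Full game tree vs. a hypothetical policy-pruned tree.
b, k, d = 35, 3, 10
print(f"full tree:   {b**d:.2e}")   # ~2.76e+15
print(f"pruned tree: {k**d:.2e}")   # ~5.90e+04
```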
Chess and Go are interesting for two nearly opposite reasons: 1) because they are too large for humans to consider the reasoning obvious, and 2) because the input to the reasoning is simply a small (and easily perceived by humans) grid of rule-constrained pieces.
But when you think of AI in an information theoretic way, so that given representative training data the system (if large enough) will always "learn" perfectly, it's not really all that remarkable. It's just a different computational way of doing the same transformation from input states to moves. Given a problem (chess, go, etc.) the researchers must simply learn what network structure and training regimen will do the job with the least computational cost.
To see why this is relevant, consider a deep learning model that could continually generate successive digits of pi (or primes) without having the concept baked in already. Would the result be computationally cheaper than a highly optimized brute-force algorithm? No, because what it would "learn" would be something already known by humans. Perfect chess is simply a function from input states to moves that humans do not already know the definition of. Most humans do know the definition of this function for tic-tac-toe by the time they reach middle school.
I'd argue that while this is useful it's ultimately not hard. Comparing it with Stockfish mainly demonstrates how chess is hard for humans to reason about and hence hard for humans to write non-brute-force algorithms to solve.
Thus, I think this is an example of "weak AI" even though humans associate chess with high degrees of exceptional human cognition. Chess data contains no noise, so the algorithm is dealing only with signals of varying degrees of utility.
I'm looking forward to AI that can be useful in the midst of lots of noise: AI that analyzes people's body language to predict interesting things about them, speech in real time for deception, roulette wheels for biases, and office environments for emotional toxicity.
Chess is interesting because we can't introspect to understand what makes humans good at chess (other than practice). So many human insights and intuitions are similarly opaque yet the data is noisy enough that it will take significantly better AI to be able to do anything that truly seems super-human.
Number of humans that could put together a Ferrari from parts: ~10000?
Number of humans that can beat Stockfish: 0
But now please try Houdini and the rest. Stockfish doesn't play interestingly, and has a lower Elo rating than Houdini. Though I guess that new search will beat Houdini as well.
Seems like the real untapped goldmine here is to carefully observe millions of workers, notice when they are working and why, and when they are slacking off, and enforce penalties and rewards relative to their peak performance to motivate increased productivity. Measure average key presses, response times, eye engagement, fidgeting and body motion, facial expressions. Do you really need robots if you can train the human network to be more robot-like? The 21st-century assembly line is so delicious; think of the possibilities.
You're going to need stimulants to work at 100%, or your company AI will cut your pay.
Four hours of learning on a high-speed computer is what, millions upon millions of games? Thousands of human lifetimes lived out in four hours. The "four hours to learn" framing is fake news and distorts the reality of how many games and trials the machine actually went through.
Another really impressive feat is Elon Musk's OpenAI bot, which defeated a number of world-class Dota 2 players in 1v1 matches.
Dota 2 is a real-time strategy video game. The number of decisions you need to make is mind-boggling. It takes most people many months, if not longer, just to get to the point where you're not clueless.
A recap video is at: https://www.youtube.com/watch?v=jAu1ZsTCA64