
They say that the dataset is hundreds of gigs' worth of games, so the net must still be pretty big.

Though definitely not directly comparable, the GPT-2 XL dataset is 8 million web pages. What I mean to say is that this is clearly deep learning.



> They say that the dataset is hundreds of gigs' worth of games, so the net must still be pretty big.

This isn't true. The size of the training data doesn't imply anything about the size of the neural network.

In the case of Stockfish, the NN is quite shallow, and implemented using a custom framework designed to run fast on CPUs.

See https://news.ycombinator.com/item?id=26746160 for previous commentary on this.

> Though definitely not directly comparable, the GPT-2 XL dataset is 8 million web pages.

This is irrelevant. You can train GPT-3 on a smaller dataset, or a smaller model on the same dataset as GPT-3.

> What I mean to say is that this is clearly deep learning.

It's been clear that neural network models are superior since AlphaGo. There's no "deep learning vs. <something else>" debate anymore, because the <something else> isn't competitive and no one is really working on it.


It's actually really small, mostly because bigger networks take longer to evaluate, which slows down the search, makes it shallower, and ends up with a weaker engine overall.
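
A rough back-of-the-envelope sketch of that tradeoff (the branching factor, node rate, and time budget below are made-up illustrative numbers, not Stockfish's actual figures):

    # How much search depth a slower evaluation function costs, roughly.
    # All constants here are illustrative assumptions.
    import math

    effective_branching = 2.0      # assumed effective branching factor after pruning
    nodes_per_second    = 50e6     # assumed speed with a cheap handcrafted eval
    time_budget_s       = 1.0

    for slowdown in (1, 10, 100):  # eval 1x, 10x, 100x slower
        nodes = nodes_per_second * time_budget_s / slowdown
        depth = math.log(nodes, effective_branching)
        print(f"{slowdown:>3}x slower eval -> roughly {depth:.1f} plies")

With these assumed numbers, a 10x slower eval costs a few plies of depth, which is why engine developers keep the net small enough that the search stays deep.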


Are you involved in the project? Can I ask what your source is? Great if that's the case.


NNUE is a 4-layer (1 input + 3 dense) integer-only neural network.

It has just over 82,000 parameters. [1] That's a very shallow, small NN; by comparison, something like EfficientNet-B1 [2] has 7.8M parameters, and that's considered a small network.

[1] https://www.chessprogramming.org/Stockfish_NNUE#NNUE_Structu...

[2] https://proceedings.mlr.press/v97/tan19a/tan19a.pdf
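
For a sense of scale, here's a minimal sketch of a small, dense, integer-only net in the same spirit. The layer sizes are illustrative assumptions chosen to land in roughly that parameter ballpark, not the actual Stockfish NNUE layout:

    # Sketch of a tiny dense, integer-only net (layer sizes are assumptions).
    import numpy as np

    rng = np.random.default_rng(0)
    sizes = [256, 256, 32, 1]          # input width + 3 dense layers (assumed)

    # int8 weights with int32 accumulators, as integer CPU inference typically does
    weights = [rng.integers(-127, 128, (a, b), dtype=np.int8)
               for a, b in zip(sizes, sizes[1:])]
    biases  = [rng.integers(-127, 128, b, dtype=np.int32) for b in sizes[1:]]

    def forward(x):
        for w, b in zip(weights, biases):
            x = x.astype(np.int32) @ w.astype(np.int32) + b   # integer matmul
            x = np.clip(x >> 6, 0, 127)                       # clipped ReLU + rescale
        return x

    n_params = sum(w.size for w in weights) + sum(b.size for b in biases)
    print(n_params)                         # ~74k with these assumed sizes
    x = rng.integers(0, 2, sizes[0])        # toy binary feature vector
    print(forward(x))

A net this size evaluates in microseconds on a CPU, which is the whole point: the search can still visit tens of millions of positions per second.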


I am involved in lc0 development and fairly aware of SF dev. NNUE is a very small (3-layer dense) CPU-only net.


Size of training set is not enough to make it deep learning, right? Doesn't deep learning imply at least one hidden layer?


Are you saying you read that it didn't have a hidden layer?

My point is that such a huge dataset would not be very useful without a deep neural net (one with at least one hidden layer).


NNs without at least one hidden layer are rarely used.


They're used all the time, we just call it logistic regression.
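
For example, a net with no hidden layer and a sigmoid output trained with cross-entropy is exactly logistic regression. A toy sketch (data and hyperparameters are made up):

    # A "neural net" with zero hidden layers: sigmoid output, cross-entropy loss,
    # gradient descent. This is just logistic regression.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))                      # made-up features
    y = (X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) > 0).astype(float)

    w, b, lr = np.zeros(5), 0.0, 0.1
    for _ in range(500):
        p = 1 / (1 + np.exp(-(X @ w + b)))              # sigmoid "output layer"
        grad_w = X.T @ (p - y) / len(y)                  # cross-entropy gradient
        grad_b = (p - y).mean()
        w -= lr * grad_w
        b -= lr * grad_b

    p = 1 / (1 + np.exp(-(X @ w + b)))
    print(((p > 0.5) == y).mean())                       # training accuracy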


You can have a relatively small model and still benefit from using a gigantic training set to train the model.



