Fwiw, I tried multiple global tokens in my chess neural net and didn't see any uplift compared to my baseline of just having one.