The nature of the basic-research beast. There are grad-student-written astrophysics/computational-chemistry spaghetti codes that continue to get big funding for the sole reason (it feels like) that they scale huge and eat up DOE supercomputing time: "look how fast (we burn money)". Maybe a hot take.
Indeed, and I think natural language and reasoning will turn out to have some kind of geometric structure as well. Attention is just a sledgehammer that lets us brute-force our way around not understanding that structure. I think the next step change in AI/LLM abilities will come from exploiting this geometry somehow [1,2].
QM would tell us that the order of your Hamiltonian (attention operator) doesn't limit the complexity of the wave function (hidden state). It might be more efficient to explicitly correlate certain many-body interactions, but pair-wise interactions, depth, and a basis (hidden-state dimension) approaching completeness "are all you need".
The terminology is overloaded. Tensors in QM are objects obeying transformation laws; in ML, tensors are just data arranged in multidimensional arrays. There are no constraints on how the data transforms.
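A toy numpy sketch of the distinction (the rotation angle and array shapes are arbitrary, just for illustration): a physics rank-1 tensor comes with a transformation law under basis changes, while an ML "tensor" can be reshaped and permuted freely with no such law.

```python
import numpy as np

# Physics sense: under a basis rotation R, a rank-1 tensor (vector)
# must transform covariantly as v' = R @ v.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
v = np.array([1.0, 0.0])
v_rotated = R @ v  # components change according to the transformation law

# ML sense: just an n-dimensional array. Reshaping a (2, 6) array to
# (3, 4) or permuting its axes is routine, but has no analogue for a
# physics tensor -- nothing constrains how the entries move around.
x = np.arange(12, dtype=np.float64).reshape(2, 6)
y = x.reshape(3, 4)   # same 12 numbers, arbitrary new "shape"
z = np.transpose(y)   # axes swapped freely, no metric in sight

print(v_rotated, y.shape, z.shape)
```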
Intended as analogy - but it is essentially a description of the DMRG algorithm (quantum chemistry). Only pair-wise operators there, but the theory approaches exactness when there are enough terms in your tensor product (iterations ~ depth) and a large enough embedding dimension.
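A toy numpy sketch of that idea (not production DMRG, and the site count and bond dimensions here are arbitrary): compress a random state vector into a matrix-product state via repeated truncated SVDs, and watch reconstruction accuracy improve as the bond dimension (the "embedding dimension" analogue) grows.

```python
import numpy as np

def to_mps(psi, n_sites, chi_max):
    """Split a 2^n state vector into a matrix-product state by
    repeated SVD, truncating each bond to chi_max singular values."""
    tensors = []
    rest = psi.reshape(1, -1)
    chi = 1
    for _ in range(n_sites - 1):
        mat = rest.reshape(chi * 2, -1)
        u, s, vh = np.linalg.svd(mat, full_matrices=False)
        keep = min(chi_max, len(s))
        tensors.append(u[:, :keep].reshape(chi, 2, keep))
        rest = s[:keep, None] * vh[:keep, :]  # absorb weights rightward
        chi = keep
    tensors.append(rest.reshape(chi, 2, 1))
    return tensors

def from_mps(tensors):
    """Contract the chain of pair-wise-linked tensors back together."""
    out = tensors[0]
    for t in tensors[1:]:
        out = np.tensordot(out, t, axes=([-1], [0]))
    return out.reshape(-1)

rng = np.random.default_rng(0)
n = 8
psi = rng.normal(size=2**n)
psi /= np.linalg.norm(psi)

# Larger bond dimension -> smaller truncation error; at chi = 16 the
# decomposition of an 8-site state is exact.
for chi in (1, 2, 4, 16):
    err = np.linalg.norm(psi - from_mps(to_mps(psi, n, chi)))
    print(chi, err)
```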
> There are no constraints on how the data transforms.
Except those implicit in your learned representation. And that representation could be the many-body wave function.
How boxed in is cosmology by the cosmological principle? If we didn't assume, for example, that the cosmological constant was constant, and instead expected it to vary over large distances, could we still arrive at a working model of the universe? Maybe high-density dark matter/energy regions are the same as regions of high/low values of the CC. It's late.