Specifically, all Transformer-based models. Older models used pretrained embeddings such as word2vec or ELMo, but all current LLMs train their embeddings from scratch.
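A minimal sketch of what "trained from scratch" means here, assuming PyTorch and made-up vocab/dimension sizes: the token embedding table is just another randomly initialized parameter of the model, learned end-to-end with the Transformer rather than loaded from a separate system like word2vec.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration
vocab_size, d_model = 32_000, 512

# The embedding table is an ordinary learnable weight matrix,
# randomly initialized -- not loaded from pretrained vectors.
embedding = nn.Embedding(vocab_size, d_model)

# Token IDs as a tokenizer might produce them (made-up values)
token_ids = torch.tensor([[101, 2057, 2293, 17953]])
x = embedding(token_ids)  # shape: (batch, seq_len, d_model)

# Gradients flow into the embedding table like any other weight,
# so it is trained jointly with the rest of the network.
loss = x.sum()
loss.backward()
print(embedding.weight.grad.shape)  # torch.Size([32000, 512])
```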
https://ai.meta.com/research/publications/byte-latent-transf...