whimsicalism on Sept 17, 2023 | on: Run LLMs at home, BitTorrent-style
By the “delta in the LLM weights”, I assume you mean the gradients. You are effectively describing large-batch training (data parallelism), which is part of how you can scale up, but there are quickly diminishing returns to large batch sizes.
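A minimal sketch (not from the comment itself, using a made-up linear least-squares loss) of the equivalence the comment is pointing at: averaging per-worker gradient "deltas" across data-parallel shards gives the same update as computing the gradient on one large batch, which is why adding workers eventually just means a bigger effective batch with diminishing returns.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)        # shared model weights
X = rng.normal(size=(8, 3))   # global batch of 8 examples
y = rng.normal(size=8)

def grad(w, Xb, yb):
    # gradient of the mean squared error 0.5 * mean((Xb @ w - yb)^2)
    return Xb.T @ (Xb @ w - yb) / len(yb)

# Data parallelism: each of 4 workers computes a gradient on its own shard
# (the "delta"), and the shards' gradients are averaged before the update.
shard_grads = [grad(w, Xs, ys)
               for Xs, ys in zip(np.split(X, 4), np.split(y, 4))]
avg_grad = np.mean(shard_grads, axis=0)

# Identical to computing the gradient on the full (large) batch at once.
assert np.allclose(avg_grad, grad(w, X, y))
```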