Hacker News | cscurmudgeon's favorites
1. From multi-head to latent attention: The evolution of attention mechanisms (vinithavn.medium.com)
174 points by mgninad 3 months ago | 41 comments
2. We Found Zero Low-Severity Bugs in 165 AI Code Reports. Zero (shamans.dev)
15 points by dmonroy 3 months ago | 14 comments
3. Shamelessness as a strategy (2019) (nadia.xyz)
233 points by wdaher 4 months ago | 165 comments
4. How attention sinks keep language models stable (hanlab.mit.edu)
219 points by pr337h4m 4 months ago | 36 comments
5. Achieving 10,000x training data reduction with high-fidelity labels (research.google)
154 points by badmonster 4 months ago | 29 comments
6. OpenAI's new open-source model is basically Phi-5 (seangoedecke.com)
403 points by emschwartz 4 months ago | 222 comments
7. Open models by OpenAI (openai.com)
2124 points by lackoftactics 4 months ago | 876 comments
8. LLM architecture comparison (sebastianraschka.com)
418 points by mdp2021 5 months ago | 24 comments
9. ETH Zurich and EPFL to release a LLM developed on public infrastructure (ethz.ch)
716 points by andy99 5 months ago | 101 comments
10. Graphical Linear Algebra (graphicallinearalgebra.net)
304 points by hyperbrainer 5 months ago | 26 comments
11. The Moat of Low Status (usefulfictions.substack.com)
397 points by jger15 5 months ago | 167 comments
12. Being too ambitious is a clever form of self-sabotage (maalvika.substack.com)
775 points by alihm 5 months ago | 210 comments
13. Optimizing Tool Selection for LLM Workflows with Differentiable Programming (viksit.substack.com)
122 points by viksit 5 months ago | 42 comments
14. How to not pay your taxes legally, apparently (mrsteinberg.com)
116 points by jimhi 5 months ago | 135 comments
15. How large are large language models? (gist.github.com)
263 points by rain1 5 months ago | 150 comments
16. Q-learning is not yet scalable (seohong.me)
220 points by jxmorris12 6 months ago | 48 comments
17. Large language models often know when they are being evaluated (arxiv.org)
89 points by jonbaer 6 months ago | 130 comments
18. Seven replies to the viral Apple reasoning paper and why they fall short (garymarcus.substack.com)
343 points by spwestwood 6 months ago | 312 comments
19. What was Radiant AI, anyway? (paavo.me)
234 points by paavohtl 6 months ago | 118 comments
20. Focus and Context and LLMs (glek.net)
95 points by tarasglek 6 months ago | 48 comments
21. The last six months in LLMs, illustrated by pelicans on bicycles (simonwillison.net)
962 points by swyx 6 months ago | 234 comments
22. The Illusion of Thinking: Strengths and limitations of reasoning models [pdf] (cdn-apple.com)
488 points by amrrs 6 months ago | 270 comments
23. How much do language models memorize? (arxiv.org)
58 points by mhmmmmmm 6 months ago | 1 comment
24. Deep learning gets the glory, deep fact checking gets ignored (fast.ai)
609 points by chmaynard 6 months ago | 158 comments
25. The Small World of English (inotherwords.app)
157 points by michaeld123 6 months ago | 70 comments
26. A visual exploration of vector embeddings (pamelafox.org)
185 points by pamelafox 6 months ago | 39 comments
27. The Level Design Book (leveldesignbook.com)
324 points by keiferski 6 months ago | 58 comments
28. AI Hallucination Legal Cases Database (damiencharlotin.com)
86 points by Tomte 6 months ago | 46 comments
29. Attention Wasn't All We Needed (stephendiehl.com)
130 points by mooreds 6 months ago | 24 comments
30. Beyond Semantics: Unreasonable Effectiveness of Reasonless Intermediate Tokens (arxiv.org)
138 points by nyrikki 6 months ago | 66 comments
