cscurmudgeon's favorites

1.		From multi-head to latent attention: The evolution of attention mechanisms (vinithavn.medium.com)
		174 points by mgninad 3 months ago \| 41 comments
2.		We Found Zero Low-Severity Bugs in 165 AI Code Reports. Zero (shamans.dev)
		15 points by dmonroy 3 months ago \| 14 comments
3.		Shamelessness as a strategy (2019) (nadia.xyz)
		233 points by wdaher 4 months ago \| 165 comments
4.		How attention sinks keep language models stable (hanlab.mit.edu)
		219 points by pr337h4m 4 months ago \| 36 comments
5.		Achieving 10,000x training data reduction with high-fidelity labels (research.google)
		154 points by badmonster 4 months ago \| 29 comments
6.		OpenAI's new open-source model is basically Phi-5 (seangoedecke.com)
		403 points by emschwartz 4 months ago \| 222 comments
7.		Open models by OpenAI (openai.com)
		2124 points by lackoftactics 4 months ago \| 876 comments
8.		LLM architecture comparison (sebastianraschka.com)
		418 points by mdp2021 5 months ago \| 24 comments
9.		ETH Zurich and EPFL to release a LLM developed on public infrastructure (ethz.ch)
		716 points by andy99 5 months ago \| 101 comments
10.		Graphical Linear Algebra (graphicallinearalgebra.net)
		304 points by hyperbrainer 5 months ago \| 26 comments
11.		The Moat of Low Status (usefulfictions.substack.com)
		397 points by jger15 5 months ago \| 167 comments
12.		Being too ambitious is a clever form of self-sabotage (maalvika.substack.com)
		775 points by alihm 5 months ago \| 210 comments
13.		Optimizing Tool Selection for LLM Workflows with Differentiable Programming (viksit.substack.com)
		122 points by viksit 5 months ago \| 42 comments
14.		How to not pay your taxes legally, apparently (mrsteinberg.com)
		116 points by jimhi 5 months ago \| 135 comments
15.		How large are large language models? (gist.github.com)
		263 points by rain1 5 months ago \| 150 comments
16.		Q-learning is not yet scalable (seohong.me)
		220 points by jxmorris12 6 months ago \| 48 comments
17.		Large language models often know when they are being evaluated (arxiv.org)
		89 points by jonbaer 6 months ago \| 130 comments
18.		Seven replies to the viral Apple reasoning paper and why they fall short (garymarcus.substack.com)
		343 points by spwestwood 6 months ago \| 312 comments
19.		What was Radiant AI, anyway? (paavo.me)
		234 points by paavohtl 6 months ago \| 118 comments
20.		Focus and Context and LLMs (glek.net)
		95 points by tarasglek 6 months ago \| 48 comments
21.		The last six months in LLMs, illustrated by pelicans on bicycles (simonwillison.net)
		962 points by swyx 6 months ago \| 234 comments
22.		The Illusion of Thinking: Strengths and limitations of reasoning models [pdf] (cdn-apple.com)
		488 points by amrrs 6 months ago \| 270 comments
23.		How much do language models memorize? (arxiv.org)
		58 points by mhmmmmmm 6 months ago \| 1 comment
24.		Deep learning gets the glory, deep fact checking gets ignored (fast.ai)
		609 points by chmaynard 6 months ago \| 158 comments
25.		The Small World of English (inotherwords.app)
		157 points by michaeld123 6 months ago \| 70 comments
26.		A visual exploration of vector embeddings (pamelafox.org)
		185 points by pamelafox 6 months ago \| 39 comments
27.		The Level Design Book (leveldesignbook.com)
		324 points by keiferski 6 months ago \| 58 comments
28.		AI Hallucination Legal Cases Database (damiencharlotin.com)
		86 points by Tomte 6 months ago \| 46 comments
29.		Attention Wasn't All We Needed (stephendiehl.com)
		130 points by mooreds 6 months ago \| 24 comments
30.		Beyond Semantics: Unreasonable Effectiveness of Reasonless Intermediate Tokens (arxiv.org)
		138 points by nyrikki 6 months ago \| 66 comments