Hacker News
cscurmudgeon's favorites
1. From multi-head to latent attention: The evolution of attention mechanisms (vinithavn.medium.com)
174 points by mgninad 3 months ago | 41 comments
2. We Found Zero Low-Severity Bugs in 165 AI Code Reports. Zero (shamans.dev)
15 points by dmonroy 3 months ago | 14 comments
3. Shamelessness as a strategy (2019) (nadia.xyz)
233 points by wdaher 4 months ago | 165 comments
4. How attention sinks keep language models stable (hanlab.mit.edu)
219 points by pr337h4m 4 months ago | 36 comments
5. Achieving 10,000x training data reduction with high-fidelity labels (research.google)
154 points by badmonster 4 months ago | 29 comments
6. OpenAI's new open-source model is basically Phi-5 (seangoedecke.com)
403 points by emschwartz 4 months ago | 222 comments
7. Open models by OpenAI (openai.com)
2124 points by lackoftactics 4 months ago | 876 comments
8. LLM architecture comparison (sebastianraschka.com)
418 points by mdp2021 5 months ago | 24 comments
9. ETH Zurich and EPFL to release a LLM developed on public infrastructure (ethz.ch)
716 points by andy99 5 months ago | 101 comments
10. Graphical Linear Algebra (graphicallinearalgebra.net)
304 points by hyperbrainer 5 months ago | 26 comments
11. The Moat of Low Status (usefulfictions.substack.com)
397 points by jger15 5 months ago | 167 comments
12. Being too ambitious is a clever form of self-sabotage (maalvika.substack.com)
775 points by alihm 5 months ago | 210 comments
13. Optimizing Tool Selection for LLM Workflows with Differentiable Programming (viksit.substack.com)
122 points by viksit 5 months ago | 42 comments
14. How to not pay your taxes legally, apparently (mrsteinberg.com)
116 points by jimhi 5 months ago | 135 comments
15. How large are large language models? (gist.github.com)
263 points by rain1 5 months ago | 150 comments
16. Q-learning is not yet scalable (seohong.me)
220 points by jxmorris12 6 months ago | 48 comments
17. Large language models often know when they are being evaluated (arxiv.org)
89 points by jonbaer 6 months ago | 130 comments
18. Seven replies to the viral Apple reasoning paper and why they fall short (garymarcus.substack.com)
343 points by spwestwood 6 months ago | 312 comments
19. What was Radiant AI, anyway? (paavo.me)
234 points by paavohtl 6 months ago | 118 comments
20. Focus and Context and LLMs (glek.net)
95 points by tarasglek 6 months ago | 48 comments
21. The last six months in LLMs, illustrated by pelicans on bicycles (simonwillison.net)
962 points by swyx 6 months ago | 234 comments
22. The Illusion of Thinking: Strengths and limitations of reasoning models [pdf] (cdn-apple.com)
488 points by amrrs 6 months ago | 270 comments
23. How much do language models memorize? (arxiv.org)
58 points by mhmmmmmm 6 months ago | 1 comment
24. Deep learning gets the glory, deep fact checking gets ignored (fast.ai)
609 points by chmaynard 6 months ago | 158 comments
25. The Small World of English (inotherwords.app)
157 points by michaeld123 6 months ago | 70 comments
26. A visual exploration of vector embeddings (pamelafox.org)
185 points by pamelafox 6 months ago | 39 comments
27. The Level Design Book (leveldesignbook.com)
324 points by keiferski 6 months ago | 58 comments
28. AI Hallucination Legal Cases Database (damiencharlotin.com)
86 points by Tomte 6 months ago | 46 comments
29. Attention Wasn't All We Needed (stephendiehl.com)
130 points by mooreds 6 months ago | 24 comments
30. Beyond Semantics: Unreasonable Effectiveness of Reasonless Intermediate Tokens (arxiv.org)
138 points by nyrikki 6 months ago | 66 comments