Hacker News
segmondy
7 months ago
on:
Qwen3-Coder: Agentic coding in the world
Beg to differ, I'm living fine with 1.5 tokens/sec.
danielhanchen
7 months ago
Speculative decoding with a small draft model could help increase it by, say, 30 to 50%!
segmondy
7 months ago
I'm not willing to trade any more quality for performance. No draft, no cache for KV either. I'll take the performance cost; it just makes me think carefully about my prompt. I rarely ever need more than one prompt to get my answers. :D
jychang
7 months ago
Speculative decoding doesn't change output tokens.
zackangelo
7 months ago
Draft model doesn’t degrade quality!
segmondy
7 months ago
I beg to differ, especially when it comes to code.
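For reference, the claim upthread that speculative decoding doesn't change output tokens can be sketched in the idealized greedy case. This is a toy under stated assumptions, not any real runtime's implementation: `target_next` and `draft_next` are hypothetical stand-ins for the large and small models, decoding is greedy (argmax), and the target returns the same token for the same context whether it runs alone or in a batched verification pass. Real inference stacks may differ slightly on that last point because of floating-point effects, and with sampling the guarantee is about the output distribution rather than exact tokens.

```python
# Toy sketch: greedy speculative decoding vs. plain greedy decoding.
# Both "models" are hypothetical deterministic functions, not real LLMs.

VOCAB = 50


def target_next(ctx):
    """Hypothetical target model: greedy next token for a context."""
    return (sum(ctx) * 31 + len(ctx) * 7 + 3) % VOCAB


def draft_next(ctx):
    """Hypothetical draft model: matches the target on most contexts."""
    guess = target_next(ctx)
    # Deliberately wrong when the context length is a multiple of 4,
    # so the rejection path is exercised.
    return guess if len(ctx) % 4 else (guess + 1) % VOCAB


def greedy_decode(prompt, n_new):
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target_next(out))
    return out


def speculative_decode(prompt, n_new, k=4):
    """Draft proposes k tokens; target keeps the longest matching prefix."""
    out = list(prompt)
    goal = len(prompt) + n_new
    accepted = proposed = 0
    while len(out) < goal:
        # 1. Draft proposes k tokens autoregressively.
        draft = []
        for _ in range(k):
            draft.append(draft_next(out + draft))
        proposed += k
        # 2. Target checks every position (a single batched pass in practice).
        for i, tok in enumerate(draft):
            want = target_next(out + draft[:i])
            if tok != want:
                # Keep the agreed prefix, then use the target's own token.
                out.extend(draft[:i])
                out.append(want)
                accepted += i
                break
        else:
            # Every draft token matched; the target adds one bonus token.
            out.extend(draft)
            out.append(target_next(out))
            accepted += k
    return out[:goal], accepted, proposed


if __name__ == "__main__":
    spec, accepted, proposed = speculative_decode([1, 2, 3], 40)
    print("identical to greedy:", spec == greedy_decode([1, 2, 3], 40))
    print("draft tokens accepted:", accepted, "of", proposed)
```

Every token that ends up in the output is either a draft token the target would have picked anyway or the target's own pick, so in this setting the draft model only affects speed (how many tokens are accepted per round), not content.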