It absolutely can in theory, but afaik neither V8 nor the JVM can actually do it to a level where they outperform the static optimisations of GCC or LLVM.
Is this still the case, or am I going off outdated info here?
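The classic example of where a JIT theoretically wins is speculative devirtualization: a call site that is polymorphic as far as any static compiler can prove, but monomorphic in practice at runtime. A minimal sketch in TypeScript (illustrative only, not a benchmark; the names are made up):

  // To an AOT compiler, s.area() is a virtual call through an interface:
  // any Shape implementation could arrive here, so it can't inline.
  // V8 profiles the site, sees only one concrete class ever shows up,
  // speculatively inlines that path, and deopts if the guess is violated.
  interface Shape { area(): number }
  class Circle implements Shape {
    constructor(private r: number) {}
    area() { return Math.PI * this.r * this.r; }
  }
  function totalArea(shapes: Shape[]): number {
    let sum = 0;
    for (const s of shapes) sum += s.area(); // monomorphic at runtime
    return sum;
  }
  // In practice the program only ever passes Circles:
  console.log(totalArea(Array.from({ length: 1_000_000 }, (_, i) => new Circle(i % 10))));

Whether that profile-guided inlining ever beats what LLVM does with whole-program devirtualization plus PGO is exactly the question above.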
The reports I remember showed that each model is profitable over its lifetime, but R&D for the next model overlaps with it, so the company runs at a loss overall. Which implies they'd turn a massive profit if they just stopped making new models.
You can absolutely fit the weights plus a tiny context window into 24GB. But you can't fit a context window of any reasonable size. Either that or Ollama's implementation is broken: when I last tried it, the context had to be restricted beyond usability to keep it from freezing up the entire machine.
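For anyone who wants the arithmetic: the squeeze mostly comes from the KV cache, which grows linearly with context length. A back-of-the-envelope sketch, assuming a hypothetical ~32B Llama-style model at Q4 with an fp16 cache (every number here is an assumption, nothing measured from Ollama):

  const GiB = 2 ** 30;
  const weights = 32e9 * 0.5;                        // Q4 ≈ 0.5 bytes/param ≈ 14.9 GiB
  const nLayers = 60, nKvHeads = 8, headDim = 128;   // assumed architecture
  const kvPerToken = 2 * nLayers * nKvHeads * headDim * 2; // K+V, fp16 ≈ 240 KiB/token
  const budget = (tokens: number) =>
    ((weights + tokens * kvPerToken) / GiB).toFixed(1) + " GiB";
  console.log(budget(2_048));   // ~15.4 GiB: fits in 24GB with room to spare
  console.log(budget(32_768));  // ~22.4 GiB: at the edge before activations/overhead

So the weights alone fit fine; it's the cache plus runtime overhead that pushes you over, which matches the "tiny context or it freezes" behaviour.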
LLMs do typically encode a confidence level in their internal representations, they just don't surface it when asked. There were multiple papers on this a few years back that got reasonable results out of probing for it. I think that was in the GPT-3.5 era, though.
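IIRC the technique in those papers was a linear probe: take hidden-state vectors, train a logistic regression to predict whether the model's answer was actually correct. A minimal sketch with synthetic stand-in data (the real versions use actual model activations; everything below is illustrative):

  type Vec = number[];
  const sigmoid = (x: number) => 1 / (1 + Math.exp(-x));
  const dot = (a: Vec, b: Vec) => a.reduce((s, v, i) => s + v * b[i], 0);

  // Plain SGD logistic regression: learn a direction w in activation space
  // that separates "model was right" from "model was wrong".
  function trainProbe(xs: Vec[], ys: number[], dim: number, epochs = 200, lr = 0.1): Vec {
    const w: Vec = new Array(dim).fill(0);
    for (let e = 0; e < epochs; e++) {
      for (let i = 0; i < xs.length; i++) {
        const err = sigmoid(dot(w, xs[i])) - ys[i]; // gradient of log loss
        for (let j = 0; j < dim; j++) w[j] -= lr * err * xs[i][j];
      }
    }
    return w;
  }

  // Synthetic stand-in for hidden states: "correct" runs point one way,
  // "incorrect" runs the other, plus noise.
  const dim = 8;
  const sample = (correct: number): Vec =>
    Array.from({ length: dim }, () => (correct ? 1 : -1) * 0.5 + (Math.random() - 0.5));
  const xs = Array.from({ length: 200 }, (_, i) => sample(i % 2));
  const ys = Array.from({ length: 200 }, (_, i) => i % 2);
  const w = trainProbe(xs, ys, dim);
  console.log("p(correct) for a confident-looking state:", sigmoid(dot(w, sample(1))));

The interesting finding was that such a probe works at all, i.e. the information is linearly decodable even though the model's verbal self-reports don't use it.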