Hacker News | veunes's comments

This is the inevitable evil of the man-in-the-middle. OpenRouter by definition decrypts your traffic to route it to the provider (OpenAI, Anthropic); technically, they can read everything. The problem is that for the enterprise segment, this is a showstopper. No bank or hospital will route data through an aggregator that openly states it classifies prompts via the Google API (even sampled ones). This confirms that OpenRouter remains a tool for indie hackers and researchers, not for serious B2B

I'm not surprised. Roleplay means endless sessions with huge context (character history, world, previous dialogues). On commercial APIs (OpenAI/Anthropic), that long context costs a fortune. On OpenRouter, many OSS models, especially via providers like DeepInfra or Fireworks, cost pennies or are even free, like some free-tier models. The RP community is very price-sensitive, so it migrates en masse to cheap OSS models via aggregators. It skews the stats but highlights a real niche for cheap inference
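A rough back-of-the-envelope sketch of that cost gap, in Python. The per-token prices below are illustrative assumptions, not anyone's actual rate card:

```python
# Back-of-the-envelope cost of one long roleplay session.
# Prices are illustrative assumptions, not real rate cards.
PRICE_PER_M_INPUT = {
    "commercial_frontier": 3.00,   # assumed $/1M input tokens
    "oss_via_aggregator": 0.10,    # assumed $/1M input tokens
}

context_tokens = 30_000   # character sheets, world lore, prior dialogue
turns = 200               # messages in one long session

for name, price in PRICE_PER_M_INPUT.items():
    # Each turn re-sends (roughly) the whole accumulated context as input.
    total_input = context_tokens * turns
    cost = total_input / 1_000_000 * price
    print(f"{name}: ~${cost:.2f} just for the session's input tokens")
```

Even with these made-up numbers the shape is clear: a single long session can run tens of dollars on a frontier API versus well under a dollar on a budget OSS provider.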

Yeah, using an API aggregator to run a 7B model is economically strange if you have even a consumer GPU. OpenRouter captures the cream of complex requests (Claude 3.5, o1) that you can't run at home. But even for local hosting, medium models are starting to displace small ones because quantization lets you run them on accessible hardware, and the quality boost there is massive. So the "Medium is the new Small" trend likely holds true for the self-hosted segment as well.
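A minimal sketch of the memory math behind that claim: approximate weight footprint at different quantization levels, ignoring KV cache and runtime overhead. The parameter counts and bit widths below are just illustrative inputs:

```python
# Rough VRAM needed for model weights alone at various quantization levels.
# Ignores KV cache, activations, and framework overhead.
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9  # decimal GB, close enough for a sketch

for params in (7, 14, 32):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit: ~{weight_gb(params, bits):.1f} GB")

# A 14B model at 4-bit is ~7 GB of weights, i.e. within reach of a 12 GB
# consumer GPU, which is why "medium" models are displacing small ones locally.
```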

The 4x growth in prompt length is a fundamental shift. We've quickly moved from "Q&A" mode to "upload full context and analyze" mode.

This completely changes infrastructure requirements: KV-caching becomes a necessity, and prefill time becomes a critical metric, often more important than generation speed. That's exactly why models with cheap long context (Gemini, DeepSeek) are winning the race against "smarter" but expensive models. Inference economics are now dictated by context length
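A quick sketch of why long prompts dominate the economics: KV cache memory grows linearly with context length. The model shape below is roughly Llama-7B-like and used only for illustration:

```python
# Approximate KV cache size for a decoder-only transformer.
# Model shape is an assumption (roughly Llama-7B-like), for illustration only.
def kv_cache_gb(seq_len: int, n_layers: int = 32, n_kv_heads: int = 32,
                head_dim: int = 128, bytes_per_elem: int = 2) -> float:
    # 2x for keys and values, stored per layer, per head, per token.
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * seq_len
    return total_bytes / 1e9

for ctx in (2_000, 8_000, 32_000, 128_000):
    print(f"{ctx:>7} tokens -> ~{kv_cache_gb(ctx):.1f} GB of KV cache per sequence")

# At 128k tokens that's tens of GB per request before a single token is
# generated, which is why prefill cost and KV-cache handling dominate
# long-context serving.
```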


That engine had character

What's tricky is that even tiny improvements in fuel economy or emissions can justify a redesign when you're building at scale

Plenty of engines hit the 20-year mark, but not many do it while still powering everything from family sedans to track-ready Type Rs

The V6 2GR is about to do it for Toyota. Subaru had the EJ going strong for 20 years. Honda has this one. Nissan had a few contenders... there's a list: https://en.wikipedia.org/wiki/Ward%27s_10_Best_Engines

I think the article's broader point still stands: Honda built a platform with enough foresight and flexibility that it could be continuously refined rather than scrapped and replaced every few years

True, but this is actually pretty common in the industry. Most manufacturers have an engine series that spans multiple decades.

> rather than scrapped and replaced every few years

This doesn’t really happen for mainstream engines. Maybe for specialty and exotic engines, but not the engines you see powering the commuter cars and trucks on the road. Engine development is expensive. Nobody is scrapping and replacing their bread and butter engine design every few years.


What's impressive isn't just the longevity, but how gracefully it's evolved over two decades while still feeling relevant in today's turbocharged, emissions-strangled landscape

The "own shovels for your own mines" strategy has a hidden downside: isolation. NVIDIA sells shovels to everyone - OpenAI, Meta, xAI, Microsoft - and gets feedback from the entire market. They see where the industry is heading faster than Google, which is stewing in its own juices. While Google optimizes TPUs for current Google tasks (Gemini, Search), NVIDIA optimizes GPUs for all possible future tasks. In an era of rapid change, the market's hive mind usually beats closed vertical integration.
