That’s what this is. It’s caching the model’s state after the prompt tokens have been processed, which reduces latency and cost dramatically. The cache usually has a ~5 minute TTL.
Interesting! I’m wondering, does caching the model state mean the tokens are no longer directly visible to the model? i.e. if you asked it to print out the input tokens perfectly (assuming there’s no security layer blocking this, and assuming it has no ‘tool’ available to pull in the input tokens), could it do it?
The model state encodes the past tokens (in some lossy way that the model has chosen for itself). You can ask it to try and, assuming its attention is well-trained, it will probably do a pretty good job. Being able to refer to what is in its context window is an important part of being able to predict the next token, after all.
There’s no difference between feeding an LLM a prompt and feeding it half the prompt, saving the state, restoring the state, and feeding it the other half of the prompt.
i.e., the data processed by the LLM is prompt P.
P can be composed of any number of segments.
Any number of segments can be cached, as long as all preceding segments are cached.
The final input is P, regardless.
So, tl;dr: yes. Anything you can do with a prompt you can do, because it’s just a prompt.
When the prompt is processed, there is an internal key-value cache that gets updated with each token processed and is ultimately used for inference of the next token. If you process the prompt first and then dump that internal cache, you can effectively resume prompt processing (and thus inference) from that point more or less for free.
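The equivalence can be sketched with a toy "model" whose per-token state depends only on the tokens seen so far, the way a transformer's KV cache does. Everything here (the `process` function, the state representation) is made up for illustration; a real cache stores per-layer key/value tensors, not hashes.

```python
# Toy illustration of prefix caching: the "state" after processing some
# tokens depends only on those tokens, so it can be saved and resumed.

def process(tokens, cache=None):
    """Feed tokens through the toy model, extending a saved state."""
    state = list(cache) if cache else []
    for t in tokens:
        # A real model appends per-layer key/value tensors here; we just
        # record a value derived from the token and its position.
        state.append(hash((t, len(state))))
    return state

prompt = ["system:", "you", "are", "helpful", "user:", "hi"]

full = process(prompt)                     # one pass over the whole prompt
prefix = process(prompt[:4])               # process the shared prefix once...
resumed = process(prompt[4:], cache=prefix)  # ...then resume from the saved state

assert full == resumed  # same final state either way
```

This is why caching only works for a *prefix*: each saved entry encodes its position relative to everything before it, so you can't splice in a cached middle segment without its predecessors.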
Not just out; both directions can be tricky to measure. It is hard to say for certain how many of the potential kcal you consume are actually absorbed by the body. If you see whole corn kernels in the toilet, those kcal didn't count :)
But yes. CICO is and always has been absolutely true. People are just overly reductive in how they measure both sides, and then claim that CICO is garbage.
In my experience, the most reliable way to understand your body's calorie needs is through consistent measurement:
1. Log everything you eat each day.
2. Weigh yourself first thing the next morning, before eating.
3. Track the trend (did you gain, lose, or maintain?)
Over time, clear patterns emerge. You start to see exactly how your intake maps to weight changes, and you can fine-tune accordingly. It’s not guesswork, it’s feedback.
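The feedback loop above can be reduced to simple arithmetic. This is a hedged sketch, not a prescription: the ~7700 kcal per kg figure is a common rule of thumb for body-fat energy density, and all the numbers in the example are invented.

```python
# Estimate maintenance calories from logged intake and the weight trend,
# using the rough ~7700 kcal per kg of body mass rule of thumb.

KCAL_PER_KG = 7700  # rough conversion; individual results vary

def estimated_maintenance(daily_intakes_kcal, start_kg, end_kg):
    """Average intake adjusted by the energy implied by the weight change."""
    days = len(daily_intakes_kcal)
    avg_intake = sum(daily_intakes_kcal) / days
    surplus_per_day = (end_kg - start_kg) * KCAL_PER_KG / days
    return avg_intake - surplus_per_day

# Two weeks of logging: ate ~2500 kcal/day, gained 0.4 kg.
intakes = [2500] * 14
maintenance = estimated_maintenance(intakes, 80.0, 80.4)
print(round(maintenance))  # 2280 kcal/day
```

The longer you log, the less daily water-weight noise matters and the tighter this estimate gets.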
What surprised me most was how little food I actually needed. Even with regular strength training, a modest surplus was enough to support muscle growth.
Aren't calorie numbers on foods just made-up numbers anyway? I'm no expert, but I'm pretty sure the body's method of metabolizing food is not the same as burning it with oxygen. They might offer a standardized number and a basis for comparison, but beyond that it's not reflective of anybody's reality.
You need to get the orders of magnitude right at first. I find keeping a running tab with any AI works pretty well. Drop 5 words with every meal or snack and that's it.
It’s ok to assume that you absorb 100% of what you eat, unless you see evidence to the contrary; and no, corn kernels in your poop don’t count. Frequent diarrhea, weight loss, skin rash, and basically any symptom of vitamin or mineral deficiency do.
>It’s ok to assume that you absorb 100% of what you eat
That's not really true. If you've ever done the keto diet, you know that your body expels unburned ketones through your breath, sweat, and urine. Protein can be used to repair structures rather than burned or stored for energy.
There's also something called the "thermic effect of food". Your body requires more energy to process protein (20-30% of the calories consumed) than carbs (5-10%), and more for carbs than for fats (0-3%).
There are many ways for food to not be 100% absorbed, which I think can most easily be demonstrated by eating a bag of nuts and waiting a day or two
I don't think it's unreasonable to think that different bodies absorb food in different ways (or proportions), particularly given what we've seen about the gut microbiome
In response to "but those are rats": I think it's a lot easier to cast doubt on "100% of food is always absorbed" than on "I don't think that always holds true".
I mean, heck: if there are no residual calories in human waste, how can it burn?
My original point: it's ok to assume you absorb 100%.
About the rat thing: the CICO-hypothesis point of view might look first at whether meal timing affects energy expenditure, rather than assuming meal timing changes digestive absorption.
There is not much point in getting in the weeds about how much you absorb, unless you're running trials on yourself like changing when you eat, or what you eat, and leaving all other things equal like calorie intake and expenditure.
The best dieting strategies I've seen track calories in and weight change. From there you derive calorie expenditure, and it really doesn't matter if you burned it or pooped it out, does it?
It’s not like almonds have x calories for a certain group and y calories for another.
Being wrong about the number of calories in almonds doesn’t count as evidence that skinny people are skinny because they poop out undigested calories.
Also, I’m not saying digestive malabsorption is impossible, just that you shouldn’t assume it unless you have strong evidence to the contrary that doesn’t have another simpler explanation.
CICO is more of an upper bound, but people like to incorrectly use it as a lower bound. Meaning that it's true that you can't burn more calories than you eat, but you can certainly eat many more calories than you store as fat.
(Hell, CICO isn't even valid for something as "simple" as an electric vehicle. My EV's end-to-end efficiency is quite a bit different depending on whether I'm charging from 120V or 240V, the outside temperature at charging time, the outside temperature at driving time, and a handful other factors like state-of-charge. The human body is even more complicated.)
He also brought us IC-Light! I wonder why he's still contributing to open source... Surely all the big companies have made him huge offers. He's so talented
I think he is working on his Ph.D. at Stanford. I assume whatever offers he has haven't been attractive enough to abandon that. Whether he'll still be doing open work afterwards or get sucked into the bowels of some proprietary corporate behemoth remains to be seen, but I suspect he won't have trouble monetizing his skills either way.
Wan 2.1 (and Hunyuan and LTXV, in descending order of overall video quality, though each has unique strengths) works well, if slowly (except LTXV), for short videos (single-digit seconds at their usual frame rates: 16 fps for Wan, 24 for LTXV, I forget for Hunyuan) on consumer hardware. But this blows them entirely out of the water on the length it can handle, so if it does so with coherence and quality across general prompts (especially if it is competitive with Wan and Hunyuan on trainability for concepts it may not handle normally), it is potentially a radical game changer.
For completeness, I should note I'm talking about the 14B i2v and t2v Wan 2.1 models; there are others in the family, notably a set of 1.3B models that are presumably much faster, but I haven't worked with them as much.
Wan 2.1 is solid but you start to get pretty bad continuity / drift issues when genning more than 81 frames (approx 5 seconds of video) whereas FramePack lets you generate 1+ minute.
I don’t totally understand the point of this. Why would you want to use a Canvas renderer for this use case? If you want to render a massive table, apps can render just a visible subset of it with regular HTML elements, like EveryUUID [1] does.
uWrap exists to efficiently predict varying row heights for list and grid virtualization[1], a technique for UI performance optimization when rendering large, scrollable datasets.
EveryUUID's virtual grid can assume every cell is the same height, but it's much more difficult if you assume cells have wrapped text. This is further complicated if you allow grid resizing.
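The core idea behind that kind of height prediction can be sketched language-agnostically (uWrap itself is JavaScript; this Python sketch and its constants are invented for illustration): estimate how many lines each cell's text will wrap to from an average character width, instead of measuring every row in the DOM.

```python
# Sketch of uWrap-style row-height prediction for virtualization:
# estimate wrapped line counts from text length and column width,
# rather than laying out every row. All pixel values are made up.

import math

AVG_CHAR_PX = 8      # assumed average glyph width
LINE_HEIGHT_PX = 20  # assumed line height

def predict_row_height(cells, col_widths_px):
    """Estimated pixel height of a row = its tallest wrapped cell."""
    lines_per_cell = (
        max(1, math.ceil(len(text) * AVG_CHAR_PX / width))
        for text, width in zip(cells, col_widths_px)
    )
    return max(lines_per_cell) * LINE_HEIGHT_PX

row = ["short", "a much longer description that will wrap several times"]
print(predict_row_height(row, [120, 160]))  # 60, i.e. a 3-line cell
```

With per-row height estimates you can compute scroll offsets and decide which rows to materialize without touching the DOM, which is exactly what fixed-height virtualizers get for free and wrapped text takes away.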
> a static files pipeline for Django with whitenoise, how is that not included by default?
It is. They have a file server in debug mode and recommend something like nginx for serving files in production (and provide a collectstatic command to make that easy).
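For anyone unfamiliar with the flow, a hedged sketch of the usual setup (paths and server config are examples, not the only way to do it):

```python
# settings.py (fragment)
STATIC_URL = "/static/"
STATIC_ROOT = BASE_DIR / "staticfiles"  # where collectstatic copies files

# At deploy time:
#   python manage.py collectstatic
#
# Then point nginx (or similar) at STATIC_ROOT, e.g.:
#   location /static/ { alias /srv/app/staticfiles/; }
```

In debug mode Django serves these itself; in production the web server handles `/static/` and the WSGI app never sees those requests.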
People shouldn’t be using a WSGI server to serve static media. Whitenoise shouldn’t exist.
I came back to this thread after realizing whitenoise would solve my current problem...
I'm working on a small internal tool using Django. When I turned debug off, my files stopped serving. And for this small deployment, I really don't want to have to require a separate nginx server. I get it now.
I have been guilty of this. I will sometimes use MST when I should use MDT due to muscle memory. And if I say MT, it could be ambiguous when you consider Arizona (which doesn’t observe daylight saving time).
I will not write “X city local time” though, I will take the extra time to make sure my timezone is correct.
Often "X city local time" is what you want, though. If I schedule a recurring meeting at 1 PM, most people will expect it to recur at 1 PM local time. A 1 PM EDT meeting would become noon EST when the clocks switch, and no one wants a lunchtime meeting.
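The distinction is easy to demonstrate with Python's `zoneinfo` (the dates here are arbitrary examples): a time pinned to a city keeps its wall-clock hour across the DST switch, while a time pinned to a fixed offset like EDT drifts.

```python
# "1 PM New York local time" vs "1 PM EDT" across the DST switch.

from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

ny = ZoneInfo("America/New_York")

summer = datetime(2024, 7, 10, 13, 0, tzinfo=ny)   # EDT in effect
winter = datetime(2024, 12, 11, 13, 0, tzinfo=ny)  # EST in effect

print(summer.utcoffset())  # -1 day, 20:00:00  (UTC-4, EDT)
print(winter.utcoffset())  # -1 day, 19:00:00  (UTC-5, EST)

# A meeting pinned to a fixed "1 PM EDT" (UTC-4) lands at noon local
# time once EST starts:
edt = timezone(timedelta(hours=-4))
pinned = datetime(2024, 12, 11, 13, 0, tzinfo=edt)
print(pinned.astimezone(ny).hour)  # 12 — your lunchtime meeting
```

This is why calendar systems store recurring events against an IANA zone name like `America/New_York` rather than an abbreviation or a fixed offset.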