I had forgotten about the film-grain extraction, which is a clever approach to a huge problem for compression.

But... did I miss it, or was there no mention of any tool to specify grain parameters up front? If you're shooting "clean" digital footage and you decide in post that you want to add grain, how do you convey the grain parameters to the encoder?

It would degrade your work and defeat some of the purpose of this clever scheme if you had to add fake grain to your original footage, feed the grainy footage to the encoder to have it analyzed for its characteristics and stripped out (inevitably degrading real image details at least a bit), and then have the grain re-added on delivery.

So you need a way to specify grain characteristics to the encoder directly, so clean footage can be delivered without degradation and grain applied to it upon rendering at the client.
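
Roughly speaking, AV1's grain synthesis is already parametric (a per-frame seed, scaling points, and autoregressive coefficients), so "specifying grain up front" would amount to the mastering tool emitting something like the following. This is a loose sketch; the field names are made up for illustration and are not the actual bitstream syntax:

    # Simplified, hypothetical sketch of the kind of parameters AV1's grain
    # synthesis signals per frame; names are illustrative, not real syntax.
    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class GrainParams:
        seed: int                                # per-frame PRNG seed
        scaling_points_y: List[Tuple[int, int]]  # (luma value, grain strength)
        ar_coeffs_y: List[float]                 # autoregressive filter coefficients
        chroma_scaling_from_luma: bool = True

    # e.g. the heavier "16mm-style" grain a director might want on a flashback:
    flashback_grain = GrainParams(
        seed=1234,
        scaling_points_y=[(0, 20), (128, 40), (255, 25)],
        ar_coeffs_y=[0.7, -0.2, 0.05],
    )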


You just add it to your original footage, and accept whatever quality degradation the grain inherently causes.

Any movie or TV show is ultimately going to be streamed in lots of different formats. And when grain is added, it's often on a per-shot basis, not uniformly. E.g. flashback scenes will have more grain. Or darker scenes will have more grain added to emulate film.

Trying to tie it to the particular codec would be a crazy headache. For a solo project it could be doable but I can't ever imagine a streamer building a source material pipeline that would handle that.


Mmmm, no, because if the delivery conduit uses AV1, you can optimize for it and realize better quality by avoiding the whole degrading round of grain analysis and stripping.

"I can't ever imagine a streamer building a source material pipeline that would handle that."

That's exactly what the article describes, though. It's already built, and Netflix is championing this delivery mechanism. Netflix is also famous for dictating technical requirements for source material. Why would they not want the director to be able to provide a delivery-ready master that skips the whole grain-analysis/grain-removal step and provides the best possible image quality?

Presumably the grain extraction/re-adding mechanism described here handles variable grain throughout the program. I don't know why you'd assume that it doesn't. If it didn't, you'd wind up with a single grain level for the entire movie, an entirely unacceptable result for the very reason you mention.

This scheme loses a major opportunity for new productions unless the director can provide a clean master and an accompanying "grain track." Call it a GDL: grain decision list.

This would also be future-proof; if a new codec is devised that also supports this grain layer, the parameters could be translated from the previous master into the new codec. I wish Netflix could go back and remove the hideous soft-focus filtration from The West Wing, but nope; that's baked into the footage forever.
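
To make the GDL idea concrete, here's a hypothetical sketch in Python; every name and value is invented for illustration:

    # Hypothetical "grain decision list": per-shot grain settings keyed by
    # frame range, delivered alongside a clean master. Everything is made up.
    grain_decision_list = [
        {"frames": (0, 1439),    "preset": "clean"},                           # titles
        {"frames": (1440, 2879), "preset": "kodak_5219_ish", "strength": 1.5}, # flashback
        {"frames": (2880, 4319), "preset": "fine_digital",   "strength": 0.6},
    ]

    def params_for_frame(gdl, frame_idx):
        """Return the grain settings for the shot that covers this frame."""
        for entry in gdl:
            lo, hi = entry["frames"]
            if lo <= frame_idx <= hi:
                return entry
        return {"preset": "clean"}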


I believe you are speculating on digital mastering and not codec conversion.

From the creator's PoV, their intention and quality is defined in post-production and mastering, color grading and other stuff I am not expert on. But I know a bit more about music mastering, and you might be thinking of a workflow similar to Apple's, which allows creators to master for its codec with the "Mastered for iTunes" flow, where creators opt in to an extra step to increase the quality of the encoding and can hear in their studio the final quality after Apple encodes and DRMs the content on their servers.

In video I would assume that is much more complicated, since the video is encoded at many quality levels to allow for slower connections and buffering without interruptions. So I assume the best strategy is the one you mentioned yourself, where AV1 obviously detects, on a per-scene or keyframe-interval basis, the grain level/type/characteristics and encodes so as to be accurate to the source material in that scene.

In other words: The artist/director preference for grain is already per scene and expressed in the high bitrate/low-compression format they provide to Netflix and competitors. I find it unlikely that any encoder flags would specifically benefit the encoding workflow in the way you suggested it might.


"I believe you are speculating on digital mastering and not codec conversion."

That's good, since that's what I said.

"The artist/director preference for grain is already per scene and expressed in the high bitrate/low-compression format they provide to Netflix and competitors. I find it unlikely that any encoder flags would specifically benefit the encoding workflow in the way you suggested it might."

I'm not sure you absorbed the process described in the article. Netflix is analyzing the "preference for grain" as expressed by the grain detected in the footage, and then they're preparing a "grain track," as a stream of metadata that controls a grain "generator" upon delivery to the viewer. So I don't know why you think this pipeline wouldn't benefit from having the creator provide perfectly accurate grain metadata to the delivery network along with already-clean footage up front; this would eliminate the steps of analyzing the footage and (potentially lossily) removing fake grain... only to re-add an approximation of it later.

All I'm proposing is a mastering tool that lets the DIRECTOR (not an automated process) do the "grain analysis" deliberately and provide the result to the distributor.
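
To be clear about what the "generator" end of this does, here's a toy sketch; it's not the actual AV1 algorithm (which uses an autoregressive filter and luma-dependent scaling), but the control flow is the same: parameters in, grain out.

    import numpy as np

    def apply_grain(decoded_frame, seed, strength):
        """Toy grain generator: seeded pseudo-random noise added after decode.
        The real AV1 synthesizer builds grain with an autoregressive filter
        and scales it by luma level, but the principle is identical: the
        bitstream carries only parameters, the player makes the grain."""
        rng = np.random.default_rng(seed)
        grain = rng.normal(0.0, strength, size=decoded_frame.shape)
        return np.clip(decoded_frame.astype(np.float32) + grain, 0, 255).astype(np.uint8)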


You're misunderstanding.

> if the delivery conduit uses AV1, you can optimize for it

You could, in theory, as I confirmed.

> It's already built, and Netflix is championing this delivery mechanism.

No it's not. AV1 encoding is already built. Not a pipeline where source files come without noise but with noise metadata.

> and provides the best possible image quality?

The difference in quality is not particularly meaningful. Advanced noise-reduction algorithms already average out pixel values across many frames to recover a noise-free version that is quite accurate (including accounting for motion), and when the motion/change is so overwhelming that this doesn't work, it's too fast for the eye to be perceiving that level of detail anyways.
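
To make the averaging idea concrete, here it is in its crudest form, assuming an aligned/static shot (real denoisers layer motion compensation on top of this):

    import numpy as np

    def temporal_denoise(frames):
        """Median across a stack of aligned frames: zero-mean grain largely
        cancels out while static detail survives. Motion-compensated
        denoisers extend the same idea to shots that aren't static."""
        stack = np.stack([f.astype(np.float32) for f in frames])
        return np.median(stack, axis=0).astype(np.uint8)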

> This scheme loses a major opportunity for new productions unless the director can provide a clean master and an accompanying "grain track."

Right, that's what you're proposing. But it doesn't exist. And it's probably never going to exist, for good reason.

Production houses generally provide digital masters in IMF format (which is basically JPEG2000), or sometimes ProRes. At a technical level, a grain track could be invented. But it basically flies in the face of the idea that the pixel data itself is the final "master". In the same way, color grading and vector graphics aren't provided as metadata either, even though they could be in theory.

Once you get away from the idea that the source pixels are the ultimate source of truth and put additional postprocessing into metadata, it opens up a whole can of worms where different streamers interpret the metadata differently. Some streamers might choose never to add noise, so the shows would look different and no longer reflect the creator's intent.

So it's almost less of a technical question and more of a philosophical question about what represents the finished product. And the industry has long decided that the finished product is the pixels themselves, not layers and effects that still need to be composited.

> I wish Netflix could go back and remove the hideous soft-focus filtration from The West Wing, but nope; that's baked into the footage forever.

In case you're not aware, it's not a postproduction filter -- the soft focus was done with diffusion filters on the cameras themselves, as well as choice of film stock. And that was the creative intent at the time. Trying to "remove" it would be like trying to pretend it wasn't the late-90's network drama that it was.


Nothing in there indicates "misunderstanding." You're simply declaring, without evidence, that the difference in quality "is not particularly meaningful." Whether it's meaningful or not to you is irrelevant; the point is that it's unnecessary.

You are ignoring the fact that the scheme described in the article does not retain the pixel data any more than what I'm proposing does; in fact, it probably retains less, even if only slightly. The analysis phase examines grain, comes up with a set of parameters to simulate it, and then removes it. When it's re-added, it's only a generated simulation. The integrity of the "pixel data" you're citing is lost. So you might as well just allow content creators to skip the pointless adding/analyzing/removing of grain and provide the "grain" directly.

Furthermore, you note that the creator may provide the footage as a JPEG2000 (DCP) or ProRes master; both of those use lossy compression that will waste quality on fake grain that's going to be stripped anyway.

Would they deliver this same "clean" master along with grain metadata to services not using AV1 or similar? Nope. In that case they'd bake the grain in and be on their way.

The article describes a stream of grain metadata to accompany each frame or shot, to be used to generate grain on the fly. It was acquired through analysis of the footage. It is totally reasonable to suggest that this analysis step can be eliminated and the metadata provided by the creator expressly.

And yes I'm well aware that West Wing was shot with optical filters; that's the point of my comment. The dated look is baked in. If the creators or owner wanted to rein in or eliminate it to make the show more relatable to modern audiences, they couldn't. Whether they should is a matter of opinion. But if you look at the restoration and updating of the Star Trek original series, you see that it's possible to reduce the visual cheesiness and yet not go so far as to ruin the flavor of the show.


Yes, it's technically possible. But what you are suggesting is basically a dynamic filter. The problem is that codecs are designed for end delivery and have very specific practical constraints.

For example, we could GREATLY improve compression ratios if we could reference key frames anywhere in the file. But devices only have so much memory bandwidth and users need to be able to seek while streaming on a 4g connection on a commuter train. I would really like to see memes make use of SVG filters and the like, but basically everyone flattens them into a bitmap and does OCR to extract metadata.

It's also really depressing how little effort is put into encoding, even by the hyper-scalers. Resolution (SD, HD, 4K and 8K) is basically the ONLY knob used for bitrate and quality management. I would much prefer 10-bit color over an 8K stream, yet every talking-head documentary with colored gradient backgrounds has banding.
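
The banding part, at least, is mostly a pixel-format flag away at encode time. A rough example, assuming an ffmpeg build with SVT-AV1 (option support varies by ffmpeg version) and placeholder file names:

    import subprocess

    # Encode 10-bit AV1 to reduce banding in smooth gradients.
    # Assumes an ffmpeg build with libsvtav1; paths are placeholders.
    subprocess.run([
        "ffmpeg", "-i", "talking_head_master.mov",
        "-c:v", "libsvtav1", "-crf", "30",
        "-pix_fmt", "yuv420p10le",   # 10-bit output instead of 8-bit
        "out_10bit.mkv",
    ], check=True)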

Finally, there is the horror that is decoders. There are reference files that use formal verification to exercise every part of a codec's spec. But Hollywood studios have dedicated movie theaters with all of the major projectors, and they pay people to prescreen movies just to try and catch encoding/decoding glitches. And even that fails sometimes.

So sure, anything is possible. Flash was very popular in the 56k days because it rendered everything on the end device. But that entails other tradeoffs like inconsistent rendering and variable performance requirements. Codecs today do something very similar: describe bitmap data using increasingly sophisticated mathematical representations. But they are more consistent and simplify the entire stack by (for example) eliminating a VM. Just run PDF torture tests through your printer if you want an idea of how little end devices care about rendering intent.


I'm not disagreeing with you as to what could be technically built.

I don't need to provide evidence that the resulting difference in quality is negligible, because you can play with ffmpeg and verify it yourself if you want. I'm just telling you from experience.
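
If you want a starting point for that experiment, something like this works (placeholder file names; assumes ffmpeg with the hqdn3d and ssim filters):

    import subprocess

    # Denoise a grainy source with a stock spatio-temporal denoiser...
    subprocess.run(["ffmpeg", "-i", "grainy.mov",
                    "-vf", "hqdn3d=4:3:6:4", "denoised.mov"], check=True)

    # ...then measure how far it drifted from the source (swap in a
    # known-clean reference as the second input if you have one).
    subprocess.run(["ffmpeg", "-i", "denoised.mov", "-i", "grainy.mov",
                    "-lavfi", "[0:v][1:v]ssim", "-f", "null", "-"], check=True)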

I understand your logic that it could be built, and how it would skip steps and by definition have no loss of quality in that part. But process- and philosophy-wise it just doesn't fit, that's all I'm explaining.

And the fact that JPEG2000/ProRes are lossy is irrelevant. They're encoded at such high quality settings for masters that they become virtually lossless for practical purposes. That's why the source noise is so high-fidelity in the first place.


Actual film grain (i.e., photochemical) is arguably a valid source of information. You can frame it as noise, but it does provide additional information content that our visual system can work with.

Removing real film grain from content and then recreating it parametrically on the other side is not the same thing as directly encoding it. You are killing a lot of information. It is really hard to quantify exactly how we perceive this sort of information so it's easy to evade the consequences of screwing with it. Selling the Netflix board on an extra X megabits/s per streamer to keep genuine film grain that only 1% of the customers will notice is a non-starter.


Exactly. In the case of stuff shot on film, there's little to be done except increase bitrate if you want maximal fidelity.

In the case of fake grain that's added to modern footage, I'm calling out the absurdity of adding it, analyzing it, removing it, and putting yet another approximation of it back in.


> I'm calling out the absurdity of adding it, analyzing it, removing it, and putting yet another approximation of it back in

Why is it absurd? The entire encoding process is an approximation of the original image. Lossy compression inevitably throws away information to make the file size smaller. And the creation of the original video is entirely separated from the distribution of it. It'll be stored losslessly for one thing.

The only question that matters is whether the image looks good enough after encoding. If it doesn't, no one will watch it. If it does, you've got a viable video distribution service that minimizes its bandwidth costs.


"Lossy compression inevitably throws away information to make the file size smaller."

So? That fact only emphasizes the absurdity and loss potential here:

1. Acquire "clean" digital footage.

2. Add fake grain to said footage.

3. Compress the grainy footage with lossy compression, wasting a bunch of the data on fake detail that you just added.

4. Analyze the footage to determine the character of the fake grain, and calculate parameters to approximate it later with other fake grain.

5. Strip out the fake grain, with potential for loss of some original image details.

6. Re-add fake grain with the calculated parameters.

If you don't see the absurdity there, I don't know what to tell you.
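
Here's the same loop as a toy numpy sketch on a flat gray frame, just to show where the redundant work sits; nothing in it is a real encoder API:

    import numpy as np

    rng = np.random.default_rng(0)
    clean = np.full((270, 480), 128.0)                # 1. clean digital footage (toy flat frame)
    grainy = clean + rng.normal(0, 6.0, clean.shape)  # 2. fake grain added in the grade

    # 3-5. the encoder has to undo step 2: denoise, then estimate what was added
    denoised = np.full_like(grainy, grainy.mean())    # toy "denoiser" (works only because the frame is flat)
    est_sigma = float(np.std(grainy - denoised))      # recovers roughly 6.0 by analysis

    # 6. the player re-adds an approximation of the grain that was just stripped
    delivered = denoised + rng.normal(0, est_sigma, denoised.shape)

    # The whole detour exists only because est_sigma wasn't handed over as metadata.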


No, the source material is stored losslessly. Many different effects will be applied to the source material to make the moving images look the way the director wants them to look.

The creation of the original video is separate from the distribution of the video. In distribution the video will be encoded to many formats and many bitrates to support playback on as many devices at as many network speeds as possible.

The distributed video will never exactly match the original. There simply isn't the bandwidth. The goal of video encoding is always just to make it look good enough.
