Yeah, that latency makes sense; "listening" covers turn detection and STT, and "thinking" covers the LLM + TTS _and then_ our model, so the pipeline latency stacks up pretty quickly. The video model itself starts streaming out frames <500ms after TTS generation begins, but we're still working on shaving latency off the parts of the pipeline we use off the shelf.
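To make the stacking concrete, here's the back-of-envelope version. Every number below except the ~500ms video figure is a made-up placeholder, not a measurement; the point is just that time-to-first-frame is roughly the sum of each stage's time-to-first-output when the stages run strictly one after another:

    # Placeholder per-stage time-to-first-output, in ms (only the video
    # number is real, measured from the start of TTS audio).
    ttfo_ms = {
        "turn_detection": 250,     # placeholder
        "stt_finalize": 300,       # placeholder
        "llm_first_token": 350,    # placeholder
        "tts_first_audio": 200,    # placeholder
        "video_first_frame": 500,  # our model
    }
    print(sum(ttfo_ms.values()), "ms to first frame if nothing overlaps")  # 1600 ms
    # Streaming each stage into the next (partial STT -> LLM, LLM tokens -> TTS,
    # TTS audio -> video) is what pulls the end-to-end number back down.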
We have a high-level blog post about the video model's architecture here: https://www.keyframelabs.com/blog/persona-1. The WebRTC "agent" stack is Livekit plus a few backend components hosted on Modal.
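Structurally the agent worker is the standard LiveKit Agents pipeline. A minimal sketch of that shape (the specific STT/LLM/TTS plugins below are stand-ins rather than what we actually run, and the hook from TTS audio into our video model is elided):

    from livekit import agents
    from livekit.agents import Agent, AgentSession
    from livekit.plugins import deepgram, openai, silero

    async def entrypoint(ctx: agents.JobContext):
        # STT -> LLM -> TTS pipeline; turn detection is driven by the VAD here.
        session = AgentSession(
            vad=silero.VAD.load(),
            stt=deepgram.STT(),   # stand-in STT plugin
            llm=openai.LLM(),     # stand-in LLM plugin
            tts=openai.TTS(),     # stand-in TTS plugin
        )
        await session.start(
            room=ctx.room,
            agent=Agent(instructions="You are a conversational language partner."),
        )
        await ctx.connect()
        # In our setup the TTS audio also feeds the video model (served from
        # Modal), which publishes frames back into the room as a video track.

    if __name__ == "__main__":
        agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))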
We've been tinkering with realtime talking-head models (avatar models, etc.) for a while now, and finally have something that works (well enough)! It runs at ~2x realtime on a 4090, and significantly faster than that on enterprise-grade GPUs.
The main use case we designed for was language learning, particularly having a conversational partner -- we've generally found that adding a face to the voice helps trigger the same fight-or-flight response you get talking to a real person, and getting past that response is, in our experience, the hardest part of speaking a new language with confidence.
But having built out the system around the model to support that use case (tool use on a canvas for speaking prompts and images, memory to keep conversations from going stale, etc.), we think there's potential for other use cases too.