tmshapland's comments | Hacker News

Here's the link to the LiveKit LemonSlice plugin. It's very easy to get started. https://docs.livekit.io/agents/models/avatar/plugins/lemonsl...
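For reference, a rough sketch of attaching the avatar to an existing AgentSession, assuming the plugin follows the same AvatarSession pattern as LiveKit's other avatar plugins. The module and parameter names here are assumptions; the linked docs have the real ones.

    from livekit import agents
    from livekit.plugins import lemonslice  # assumed import path


    async def attach_avatar(session: agents.AgentSession, ctx: agents.JobContext):
        # Assumed constructor parameter; see the plugin docs for the real options.
        avatar = lemonslice.AvatarSession(avatar_id="YOUR_AVATAR_ID")
        # The avatar worker joins the room and lip-syncs the agent's TTS audio.
        await avatar.start(session, room=ctx.room)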

lol. Would you share the prompt you use to translate them? It really feels like a snarky HN community member rewrote each one.


Beautiful. Reminds me of starling murmurations. https://www.youtube.com/watch?v=V4f_1_r80RY


Fascinating! How did you decouple the speaker-specific vocal characteristics (timbre, pitch range) from the accent-defining phonetic and prosodic features in the latent space?


We didn't explicitly decouple them. Because we fine-tuned this model for accent classification, the later transformer layers appear to ignore non-accent vocal characteristics. I verified this for gender, for example.
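A quick way to check that kind of invariance is a linear probe on the frozen layer embeddings. A minimal sketch (hypothetical arrays and helper, not the model's actual evaluation code): if a probe for gender scores near chance on the late layers, those layers have largely discarded that speaker characteristic.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score


    def probe_attribute(embeddings: np.ndarray, labels: np.ndarray) -> float:
        """Mean cross-validated accuracy of a linear probe for `labels`
        (e.g. speaker gender) trained on frozen layer embeddings."""
        clf = LogisticRegression(max_iter=1000)
        return cross_val_score(clf, embeddings, labels, cv=5).mean()


    # Usage (hypothetical arrays): near-chance accuracy on late-layer
    # embeddings suggests the accent classifier has learned to ignore gender.
    # acc = probe_attribute(layer24_embeddings, gender_labels)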


These avatars from Hedra inspire the artist in me, the self that craves something weird and intuitive and unfathomable, something that appeals to the emotional self. The internet's most popular sites are polluted with rational self-promotion, where everything has its aim (likes, impressions, sales). Must everything be up and to the right?


Such a fascinating read. I didn't realize how much massaging needed to be done to get the models to perform well. I just sort of assumed they worked out of the box.


Personally, I think bigger companies should be more proactive and work with the devs of the popular inference engines to get their special snowflake LLMs working before release. I guess it is all very much experimental at the end of the day. Those devs are doing God's work so the rest of us can run these models on budget-friendly hardware.


This is a good take, actually. GPT-OSS is not much of a snowflake (judging by the model's architecture card, at least), but TRT-LLM treats every model as if it were one. There is too much hardcoding, which makes it very difficult to use out of the box for the hottest SotA release.


> GPT-OSS is not much of a snowflake

Yeah, judging by the architecture it doesn't seem like a snowflake, but they also decided to invent a new prompting/conversation format (https://github.com/openai/harmony), which definitely makes it a bit of a snowflake today. You can't just reuse what worked a couple of days ago; everyone needs to add proper support for it.
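For anyone who hasn't looked at it yet: harmony is a token-level chat format with explicit channels. A simplified sketch of roughly what a rendered prompt looks like (hand-rolled for illustration only; the openai_harmony package in the linked repo does the real rendering and parsing, and defines the exact special tokens and channel rules):

    # Simplified sketch of the harmony chat format gpt-oss expects.
    def render_harmony(system: str, user: str) -> str:
        return (
            f"<|start|>system<|message|>{system}<|end|>"
            f"<|start|>user<|message|>{user}<|end|>"
            # The model then generates assistant turns tagged with a channel,
            # e.g. "analysis" for reasoning and "final" for the user-facing reply.
            f"<|start|>assistant"
        )


    prompt = render_harmony("You are a helpful assistant.", "Hi!")
    # Expected completion shape (roughly):
    #   <|channel|>analysis<|message|>...reasoning...<|end|>
    #   <|start|>assistant<|channel|>final<|message|>Hello!<|return|>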


This is literally what they did for GPT-OSS; it seems there was coordination with OpenAI to support it on day one.


SMEs are starting to want local LLMs, and it's a nightmare to figure out what hardware will work for which models. I'm asking devs in my hometown to literally visit their installs to figure out combos that work.


Are you installing them onsite?


Some are asking for that, yeah, but I haven't run an install yet; I'm documenting the process. It's a last resort: hosting on a European cloud is more efficient, but some companies don't even want to hear about cloud hosting.
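On the "what hardware fits which model" question, a rough back-of-the-envelope VRAM estimate gets you most of the way before testing combos on site. A sketch; all the constants are approximations and the real numbers depend on the runtime, quantization scheme, context length, and batch size:

    def estimate_vram_gb(
        params_b: float,           # parameters, in billions
        bytes_per_weight: float,   # ~2.0 for fp16/bf16, ~0.55 for a 4-bit quant
        ctx_tokens: int = 8192,
        kv_bytes_per_token: float = 0.5e6,  # very rough per-token KV cache cost, model-dependent
        overhead: float = 1.2,     # runtime buffers, fragmentation
    ) -> float:
        """Rough VRAM needed to serve a local LLM, in GB."""
        weights = params_b * 1e9 * bytes_per_weight
        kv_cache = ctx_tokens * kv_bytes_per_token
        return (weights + kv_cache) * overhead / 1e9


    # e.g. a 20B model in 4-bit: estimate_vram_gb(20, 0.55) -> ~18 GB, so a
    # 24 GB card is a comfortable fit; the same model in fp16 needs ~53 GB.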


Sample code for trying out gpt-oss as the LLM in a voice agent stack.


Here's how it performs as the LLM in a voice agent stack. https://github.com/tmshapland/talk_to_gpt_oss
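For reference, a minimal sketch of what that kind of setup looks like (not the linked repo's actual code): a LiveKit Agents voice pipeline with gpt-oss served behind an OpenAI-compatible endpoint (vLLM, Ollama, or a hosted provider). The plugin choices, model name, and endpoint URL below are assumptions.

    from livekit import agents
    from livekit.agents import Agent, AgentSession
    from livekit.plugins import cartesia, deepgram, openai, silero


    async def entrypoint(ctx: agents.JobContext):
        await ctx.connect()
        session = AgentSession(
            vad=silero.VAD.load(),
            stt=deepgram.STT(),
            # Point the OpenAI-compatible LLM plugin at wherever gpt-oss is served.
            llm=openai.LLM(
                model="openai/gpt-oss-120b",          # assumed model name on the server
                base_url="http://localhost:8000/v1",  # assumed local endpoint
                api_key="not-needed-locally",
            ),
            tts=cartesia.TTS(),
        )
        await session.start(
            room=ctx.room,
            agent=Agent(instructions="You are a helpful voice assistant."),
        )


    if __name__ == "__main__":
        agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))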


Congrats on joining Mintlify, friend. Trieve is dead, long live Trieve!


Man, I've needed this! Brilliant!


thank you!!

