
I wish Whisper offered speaker diarization. That would be a game changer for the speech-to-text space.


whisperX has diarization.

https://github.com/m-bain/whisperX
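
Rough flow, for anyone curious: a sketch loosely following the whisperX README (exact function names may have drifted between versions, and you need a HuggingFace token for the gated pyannote diarization models it pulls in).

    import whisperx

    device = "cuda"
    hf_token = "hf_..."  # placeholder: HuggingFace token for the pyannote models

    # 1. Transcribe with the (faster-)whisper backend
    model = whisperx.load_model("large-v2", device, compute_type="float16")
    audio = whisperx.load_audio("meeting.wav")
    result = model.transcribe(audio, batch_size=16)

    # 2. Align the output to get word-level timestamps
    model_a, metadata = whisperx.load_align_model(language_code=result["language"], device=device)
    result = whisperx.align(result["segments"], model_a, metadata, audio, device)

    # 3. Diarize with pyannote and assign a speaker label to each segment
    diarize_model = whisperx.DiarizationPipeline(use_auth_token=hf_token, device=device)
    diarize_segments = diarize_model(audio)
    result = whisperx.assign_word_speakers(diarize_segments, result)

    for seg in result["segments"]:
        print(seg.get("speaker", "?"), seg["text"])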


Why do you need diarization? That's attributing speech to different speakers, right? What sort of use cases?


Transcribing interviews, meetings, etc.


Did some research and it seems there's no reliable diarization method right now. They all have error rates around 20%.
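
For context, the error rate usually reported for diarization is the diarization error rate (DER): the fraction of speech time that is missed, falsely detected, or attributed to the wrong speaker. A toy sketch of how it's computed, assuming pyannote.metrics is installed; the numbers here are made up just to illustrate what 20% would look like.

    from pyannote.core import Annotation, Segment
    from pyannote.metrics.diarization import DiarizationErrorRate

    # Ground-truth speaker turns (20 s of speech total)
    reference = Annotation()
    reference[Segment(0.0, 10.0)] = "alice"
    reference[Segment(10.0, 20.0)] = "bob"

    # System output: last 4 s attributed to the wrong speaker
    hypothesis = Annotation()
    hypothesis[Segment(0.0, 10.0)] = "spk1"
    hypothesis[Segment(10.0, 16.0)] = "spk2"
    hypothesis[Segment(16.0, 20.0)] = "spk1"

    metric = DiarizationErrorRate()
    print(f"{metric(reference, hypothesis):.0%}")  # 4 s confusion / 20 s speech = 20%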


We hacked that together for https://paxo.ai — can be done!



