
I've found that WhisperX with the medium model has been amazing at subtitling shows containing English dialects (British, Scottish, Australian, New Zealand-ish). It not only nails all the normal speech but also gets the names and completely made-up slang words. Interestingly, you can tell it was trained on source material with dialects because it uses each dialect's spelling: when someone American speaks it writes "color", and when someone British speaks it writes "colour".

I can't speak to how it performs outside of production-quality audio, but in the hundreds of hours of subtitles that I've generated, I don't think I've seen a single error.
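
For anyone wanting to try the subtitling step, here is a minimal sketch of turning transcription segments into SubRip (.srt) text. It assumes each segment is a dict with `start` and `end` in seconds plus a `text` string, which is the shape Whisper-style tools commonly return; the function names `format_timestamp` and `segments_to_srt` are made up for illustration, and the sketch does not call WhisperX itself.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    # Work in whole milliseconds to avoid float rounding surprises.
    total_ms = max(0, round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments.

    Assumption: each segment is a dict with 'start', 'end' (seconds)
    and 'text' keys. This is a hypothetical helper, not a WhisperX API.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        text = seg["text"].strip()
        # Each cue: index line, time range line, text line, blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 1.25, "text": " What colour is it?"},
        {"start": 1.5, "end": 3.0, "text": " No idea, mate."},
    ]
    print(segments_to_srt(demo))
```

Writing the returned string to a file with a `.srt` extension should be enough for most players to pick it up alongside the video.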



IIUC, it's trained mainly on LibriVox audio, along with a few other sources. I'm not sure how it handles spelling, since the spelling would depend on the source text being read, unless that text has been processed or edited to match the reader's dialect.



