I have had very good results using Spectropic [1], a hosted Whisper Diarization ...

thomasmol · on Aug 9, 2024

Thanks for the shout-out and kind words!

Thomas here, maker of Spectropic and Audiogest. I am indeed focused on building a simple and reliable Whisper + diarization API. Also working on providing fine-tuned versions of Whisper of non-English languages through the API.

Feel free to reach out to me if anyone is interested in this!

dchuk · on Aug 9, 2024

Great looking API. Are you able to, or do you have plans, for there to be automatic speaker identification based on labeled samples of their voices? It would be great to basically have a library of known speakers that are auto matched when transcribing

thomasmol · on Aug 9, 2024

Thanks! That is something I might offer in the future and is definitely possible with a library like pyannote. Would be really cool to add for sure.

I am also experimenting with post-processing transcripts with LLMs to infer speaker names from a transcript. It works pretty decent already but it's still a bit expensive. I have this feature available under the 'enhanced' model if you want to check it out: https://docs.spectropic.ai/models/transcribe/enhanced

ukuina · on Aug 9, 2024

Hi! Any plans to support streaming transcription with diarization?

thomasmol · on Aug 10, 2024

Streaming is definitely on the to-do list! Its quite complex to stream both transcription + diarization, but we will get there eventually