Hacker News
martzoukos
4 months ago
on:
FFmpeg 8.0 adds Whisper support
I guess that there is no streaming option for sending generated tokens to, say, an LLM service to process the text in real-time.
nomad_horse
4 months ago
Whisper has an encoder-decoder architecture, so it's hard to run streaming efficiently, though whisper-streaming is a thing. https://kyutai.org/next/stt is natively streaming STT.
woodson
4 months ago
There are many streaming ASR models based on CTC or RNNT. Look, for example, at sherpa (https://github.com/k2-fsa/sherpa-onnx), which can run streaming ASR, VAD, diarization, and many more.