> Gaussian mixture models In what fields did neural networks replace Gaussian mi...

brandonb · 2025-12-15T16:50:01 1765817401

The acoustic model of a speech recognizer used to be a GMM, which scored each pre-processed acoustic feature vector (generally MFCCs, Mel-Frequency Cepstral Coefficients) against the HMM states, i.e. how likely that frame was under each state.
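To make the GMM step concrete, here is a rough sketch assuming one diagonal-covariance GMM per HMM state and 2-dimensional toy features instead of real MFCCs. The function names, state names, and numbers are all made up for illustration, not taken from any particular recognizer.

```python
import numpy as np

def diag_gmm_loglik(frame, weights, means, variances):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM.

    weights: (K,), means: (K, D), variances: (K, D) for K mixture components.
    """
    frame = np.asarray(frame, dtype=float)
    diff = frame - means  # (K, D), broadcast over components
    # Per-component log(weight * Gaussian density), diagonal covariance.
    log_comp = (
        np.log(weights)
        - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
        - 0.5 * np.sum(diff ** 2 / variances, axis=1)
    )
    # Log-sum-exp over components, shifted by the max for numerical stability.
    m = np.max(log_comp)
    return float(m + np.log(np.sum(np.exp(log_comp - m))))

def score_states(frame, state_gmms):
    """Score one frame against every HMM state's GMM (hypothetical helper)."""
    return {
        state: diag_gmm_loglik(frame, *params)
        for state, params in state_gmms.items()
    }

# Toy setup: two HMM states, each with a 2-component GMM (invented values).
state_gmms = {
    "state_a": (np.array([0.6, 0.4]),
                np.array([[1.0, 1.0], [1.5, 0.5]]),
                np.ones((2, 2))),
    "state_b": (np.array([0.5, 0.5]),
                np.array([[-1.0, -1.0], [-2.0, 0.0]]),
                np.ones((2, 2))),
}
frame = [0.9, 1.1]  # stand-in for one MFCC vector
scores = score_states(frame, state_gmms)
best = max(scores, key=scores.get)
```

In a real GMM-HMM system these per-state scores would feed the HMM decoder (e.g. Viterbi) rather than being argmaxed frame by frame.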

Now those stages are neural nets, so acoustic pre-processing, GMM, and HMM are all subsumed by a single neural network and trained end-to-end.

One early piece of work here was Deep Speech 2 (2015): https://arxiv.org/pdf/1512.02595
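For a feel of how an end-to-end model gets by without an explicit HMM alignment: Deep Speech 2 is trained with CTC, where the network emits one label (or a blank) per frame and the output is collapsed afterwards. Below is a minimal sketch of the greedy collapse step only; the function name and the `_` blank symbol are my own choices, and real systems usually use beam search with a language model instead.

```python
from itertools import groupby

def ctc_greedy_decode(frame_labels, blank="_"):
    """Collapse per-frame labels CTC-style: merge repeats, then drop blanks."""
    # Merge runs of identical consecutive labels into a single label.
    collapsed = [label for label, _ in groupby(frame_labels)]
    # Blanks only serve to separate genuine repeats, so remove them last.
    return "".join(label for label in collapsed if label != blank)

# One label per acoustic frame; the blank between the two "ll" runs
# is what lets the double "l" survive the collapse.
decoded = ctc_greedy_decode("hh_e_ll_ll_oo")
```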

ForceBru · 2025-12-15T18:31:26 1765823486

Interesting, thanks!