The acoustic model of a speech recognizer used to be a GMM, which modeled the likelihood of a pre-processed acoustic feature vector (generally MFCCs, i.e. Mel-Frequency Cepstral Coefficients) given each HMM state.
Now those components are neural networks: in end-to-end systems, the acoustic pre-processing, the GMM, and the HMM are all subsumed by a single network and trained end-to-end.
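
To make the older setup concrete, here is a minimal sketch of how a GMM acoustic model could score one feature frame against each HMM state. It assumes diagonal covariances and uses made-up toy parameters in 2 dimensions (real MFCC vectors are larger); the names `gmm_log_likelihood`, `score_states`, `toy_states`, and `frame` are hypothetical and not taken from any particular toolkit.

```python
import numpy as np

def gmm_log_likelihood(x, weights, means, variances):
    """Log p(x | state) under a diagonal-covariance Gaussian mixture.

    weights:   shape (K,), mixture weights summing to 1
    means:     shape (K, D)
    variances: shape (K, D), per-dimension variances
    """
    x = np.asarray(x, dtype=float)
    # Log density of each diagonal Gaussian component at x.
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_exp = -0.5 * np.sum((x - means) ** 2 / variances, axis=1)
    log_components = np.log(weights) + log_norm + log_exp
    # Log-sum-exp over components for numerical stability.
    m = np.max(log_components)
    return m + np.log(np.sum(np.exp(log_components - m)))

def score_states(x, state_gmms):
    """Return one log-likelihood per HMM state for feature vector x."""
    return np.array([gmm_log_likelihood(x, *gmm) for gmm in state_gmms])

# Toy parameters (assumed for illustration): two HMM states,
# each with a 2-component mixture over 2-dimensional features.
toy_states = [
    (np.array([0.5, 0.5]), np.array([[0.0, 0.0], [1.0, 1.0]]), np.ones((2, 2))),
    (np.array([0.3, 0.7]), np.array([[5.0, 5.0], [6.0, 4.0]]), np.ones((2, 2))),
]

frame = np.array([4.8, 5.1])  # stand-in for one MFCC frame
scores = score_states(frame, toy_states)
best_state = int(np.argmax(scores))  # state 1 here: the frame lies near its means
```

In a full recognizer these per-state scores would be fed to the HMM decoder as emission log-likelihoods; a neural acoustic model takes over the job of producing per-state scores for each frame.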
In what fields did neural networks replace Gaussian mixtures?