You may want to look at this. Neural network models with enough capacity to memorize random labels are still capable of generalizing well when fed actual data
Zhang et al (2021) 'Understanding deep learning (still) requires rethinking generalization'
Zhang et al (2021) 'Understanding deep learning (still) requires rethinking generalization'
https://dl.acm.org/doi/10.1145/3446776