Finding sources for input data is something I struggle with when building deep learning models. Out of curiosity, how did you go about programmatically accessing the music files for all 120M+ songs, in order to create your embedding vector? I can't imagine iTunes has an API which would let a person do that.
I'd also like to know. I can't even listen to the full songs, and I assume I'd have to pay to do so. I can't imagine buying 120 million songs, so it has to be some collab with iTunes.
Thinking about both processing time and the difficulty of sustaining 120M downloads' worth of programmatic access, I wouldn't be surprised if this is actually trained on the track previews.