I listened to a talk by Jim Keller of Tenstorrent about their different approach to building AI cores: five RISC-V cores, with one core for loading data, one for uploading data, and the rest dedicated to performing matrix operations.
He did mention Google's TPU, saying that it was like programming a VLIW and that they had about 500 people dedicated to their compiler.