Also images and text have tons of recurring patterns that can be exploited to train big models with lots of data. There is an internet worth of each modality that at least generally can all contribute helping a model build up a better overall understanding.
There is no analog for tabular data, it's all different.
There is no analog for tabular data, it's all different.