A lot. OpenAI uses Triton for their critical kernels. Meta's torch.compile uses it too. I know Anthropic is not using Triton, but I think their homegrown compiler is also Python-based. xAI is using CUTLASS, which is C++, but I wouldn't be surprised if they start using its Python API going forward.
How many of the world's total FLOPs are going through those?