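Not Chinchilla-optimal but inference-optimal. Chinchilla optimality considers only the training budget: for a fixed amount of training compute, it picks the model size and token count that give the lowest loss. That is of interest to researchers who mainly produce demos. Inference optimality also counts the cost of serving the model, which matters in real deployments to millions of users. A smaller model trained on more tokens costs more to train for the same quality but is cheaper on every request, so it is worth paying more for training to reduce inference costs. They probably went even further than Chinchilla in tokens per parameter.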
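
To make the trade-off concrete, here is a rough sketch using the common rules of thumb of about 6 FLOPs per parameter per training token and about 2 FLOPs per parameter per generated token. The two configurations are hypothetical, and the sketch simply assumes they reach comparable quality; it does not verify that. It only shows how many served tokens it takes before the extra training compute pays for itself.

```python
def training_flops(n_params, train_tokens):
    # Rule of thumb: ~6 FLOPs per parameter per training token.
    return 6.0 * n_params * train_tokens


def inference_flops(n_params, served_tokens):
    # Rule of thumb: ~2 FLOPs per parameter per token served.
    return 2.0 * n_params * served_tokens


def total_flops(n_params, train_tokens, served_tokens):
    # Lifetime compute: training once plus serving all tokens.
    return training_flops(n_params, train_tokens) + inference_flops(n_params, served_tokens)


def break_even_tokens(big, small):
    """Served tokens at which the smaller, longer-trained model becomes cheaper overall.

    big and small are (n_params, train_tokens) pairs, with small having fewer parameters.
    """
    extra_training = training_flops(*small) - training_flops(*big)
    saved_per_token = 2.0 * (big[0] - small[0])
    return extra_training / saved_per_token


# Hypothetical configurations, ASSUMED to reach similar quality.
big = (70e9, 1.4e12)   # Chinchilla-style: about 20 tokens per parameter
small = (35e9, 5.0e12)  # half the parameters, trained on far more tokens

tokens = break_even_tokens(big, small)
print(f"break-even after about {tokens:.2e} served tokens")
```

With these made-up numbers the smaller model costs more in total until a few trillion tokens have been served and is cheaper from then on. A research demo never gets there; a service with millions of users may pass that point quickly.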