How Do Large Language Monkeys Get Their Power (Laws)?

RSchaeffer · 2025-06-13T18:04:35

Best-of-N sampling was shown to exhibit power-law (polynomial) scaling (left), but the math suggests one should expect exponential scaling (center). We show how to resolve this "paradox", then use our insights to design more sample-efficient methods for predicting inference-scaling capabilities!
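To make the tension concrete, here is a minimal sketch under assumptions that are not taken from the post. It assumes each attempt on a problem succeeds independently with probability p, so the chance that all k attempts fail is (1 - p)^k, which decays exponentially in k. It then assumes, purely for illustration, that per-problem success probabilities across a benchmark follow a Beta(a, b) distribution. Under that assumption the average failure rate has a closed form that decays roughly like k^(-a), a power law. The function names and the values p = 0.1, a = 0.3, b = 2.0 are hypothetical choices, and this is one way the two behaviours can coexist rather than a statement of the authors' method.

```python
import math


def log_failure_single(p: float, k: int) -> float:
    """Log-probability that all k independent attempts fail on one problem
    with per-attempt success probability p: log((1 - p)^k) = k * log(1 - p)."""
    return k * math.log1p(-p)


def log_mean_failure_beta(a: float, b: float, k: int) -> float:
    """Log of E[(1 - p)^k] when p ~ Beta(a, b) (an illustrative assumption).

    E[(1 - p)^k] = B(a, b + k) / B(a, b)
                 = Gamma(a + b) * Gamma(b + k) / (Gamma(b) * Gamma(a + b + k)).
    """
    return (math.lgamma(a + b) + math.lgamma(b + k)
            - math.lgamma(b) - math.lgamma(a + b + k))


def loglog_slope(log_f, k1: int, k2: int) -> float:
    """Slope of log f(k) against log k between k1 and k2.
    A power law f(k) ~ k^(-c) has a constant slope of about -c."""
    return (log_f(k2) - log_f(k1)) / (math.log(k2) - math.log(k1))


# Hypothetical illustrative parameters, not values from the post.
P_SINGLE = 0.1
A, B = 0.3, 2.0

for k1, k2 in [(10, 100), (100, 1000), (1000, 10000)]:
    single = loglog_slope(lambda k: log_failure_single(P_SINGLE, k), k1, k2)
    mixed = loglog_slope(lambda k: log_mean_failure_beta(A, B, k), k1, k2)
    print(f"k in [{k1}, {k2}]: single-problem slope {single:.2f}, "
          f"Beta-averaged slope {mixed:.3f}")
```

On a log-log plot the single-problem slope keeps getting steeper as k grows, which is the signature of exponential decay, while the Beta-averaged slope settles near -a. In this sketch the power law comes entirely from the mass of problems with very small p; a distribution without that tail would not produce it.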