Why does this model think in Chinese while o1 thinks in English? Is this because chain-of-thought is achieved by training these models on examples of what "thinking" looks like, constructed by their respective model developers, rather than chain-of-thought being a more generic feature of the models themselves?