2 Comments
Rainbow Roxy

Wow, thinking variant. More internal reasoning… how so?

Rico

Hi Rainbow Roxy, they’re doing a couple of things. The first pass is a model router that analyses the prompt and sends it to the appropriate model (fast or thinking). Then they somehow determine how much compute (thinking tokens) to give that prompt.

That’s OpenAI’s moat, the infrastructure around the model.
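The two-stage routing described above could be sketched roughly like this. Everything here is an assumption for illustration: the difficulty signals, the model names, and the token budgets are made up, not OpenAI's actual system.

```python
# Hypothetical sketch of a two-stage router: a cheap classifier picks a
# model tier, then a heuristic sets the thinking-token budget.
# All names and thresholds are illustrative assumptions.

def estimate_difficulty(prompt: str) -> float:
    """Toy stand-in for a learned router: longer, question-dense
    prompts score as harder. Returns a value in [0, 1]."""
    signals = [
        len(prompt) > 200,
        "prove" in prompt.lower() or "derive" in prompt.lower(),
        prompt.count("?") > 1,
    ]
    return sum(signals) / len(signals)

def route(prompt: str) -> dict:
    difficulty = estimate_difficulty(prompt)
    if difficulty < 0.34:
        # Easy prompt: fast model, no extended reasoning.
        return {"model": "fast", "thinking_tokens": 0}
    # Harder prompt: thinking model, budget scaled by difficulty.
    return {"model": "thinking", "thinking_tokens": int(8192 * difficulty)}

print(route("What's the capital of France?"))
print(route("Prove that the sum of two even numbers is even. Why? How?"))
```

In a real system the classifier would itself be a small learned model rather than a handful of string heuristics, but the shape is the same: a cheap first pass deciding how much expensive compute the prompt deserves.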