They aren't. In the API, too, they're only a thought summary, which is not nearly as useful as showing the actual thoughts. Anthropic still provides raw thoughts to this day, which shows that hiding them isn't necessary to keep a moat. Google and OpenAI don't.
To be fair, Anthropic's reasoning models don't have long thinking to begin with, and I find their reasoning pretty useless compared to what Gemini used to do.
"Long thinking" seems to be a marketing term without a clear definition, applicable only to the opaque chat frontends. If you give Anthropic models a hard problem and set the thinking budget high via the API, they do plenty of reasoning, and the CoT helps a lot with debugging. With Gemini and OpenAI you can't debug, as the summaries tell you effectively nothing about why the model is giving a wrong answer or going off the rails when it does som