I'm not sure about this. I used Gemini and Claude for about 12 hours a day for a month and a half straight, in an unhealthy programmer bender, and Claude was FAR superior. It wasn't really that close. It's going to be interesting to test Gemini 3, though.
Gemini 2.5 is prone to apology loops and often confuses its own thinking with user input, replying to itself. ChatGPT 5 likes to refuse tasks with "sorry I can't help with that", at least in VS Code's GitHub Copilot Agent mode. Claude hasn't screwed up like that for me.