April 8, 2026

Developers are raising concerns that Anthropic’s Claude Code is becoming less reliable for complex engineering tasks, based on internal usage data and widespread user reports. One team analysed 6,852 sessions and found a sharp increase in failure patterns alongside reduced reasoning behaviour.
The issue was formally raised on GitHub by Stella Laurenzo, director of the AI group at AMD, who said her team no longer trusts the tool for high-complexity work after observing consistent performance degradation since February.
Laurenzo’s team examined nearly 235,000 tool calls and identified a clear shift in behaviour. Errors linked to incomplete reasoning, such as stopping tasks early, avoiding responsibility, or taking shortcuts, rose from zero to an average of 10 incidents per day by late March.
At the same time, the model’s process changed: it read code less frequently before making edits, with the average number of reads before an edit dropping from 6.6 to just two.
Another change stood out. Instead of making targeted edits, Claude Code increasingly rewrote entire files, suggesting a more superficial approach to problem-solving. According to the report, these patterns point to reduced reasoning depth rather than isolated bugs.
The timing aligns with a product update in early March. Version 2.1.69 introduced “thinking content redaction,” a feature that removes visible reasoning steps from responses by default. While designed to simplify outputs, the change also limits visibility into how the model processes tasks. Laurenzo argues the data shows a measurable drop in reasoning quality coinciding with this update.
User feedback outside AMD reflects similar concerns. Developers on GitHub and Reddit have reported declining trust in the tool, particularly for complex workflows. The issue appears separate from an earlier February update that reduced visibility into file-reading steps, but together they have contributed to broader frustration.
Anthropic is also facing scrutiny on other fronts. Some users have reported unexpected spikes in token usage that pushed them past usage limits, while a recent incident exposed Claude Code’s full source code through a packaging error. The company has not publicly addressed the latest performance concerns.
Laurenzo called for greater transparency, including visibility into how many “thinking tokens” are used per request and the introduction of higher-tier options for users running complex workloads.
She also said her team has already switched to another provider offering more consistent results. In her words: “We have switched to another provider which is doing superior quality work, but Claude has been good to us, and we are leaving this in the hopes that Anthropic can fix their product.”
