February 27, 2026

Anthropic has revised its Responsible Scaling Policy, removing a binding commitment to halt development if its AI models outpace its ability to guarantee safety. The change replaces a hard stop requirement with a more flexible “Frontier Safety Roadmap,” shifting the decision to continue development to the company’s discretion.
When Anthropic launched Claude in March 2023, it positioned the assistant as “a next-generation AI assistant based on Anthropic’s research into training helpful, honest, and harmless AI systems.” That safety-first stance was formalized in September 2023 through its Responsible Scaling Policy (RSP), which stated that if model capabilities exceeded the company’s safety controls, Anthropic would pause training and deployment until safeguards caught up.
On Tuesday, the company published a rewritten version of the RSP that removes that unconditional pause commitment. Instead, the updated policy outlines risk mitigation steps and emphasizes public transparency about model risks and safeguards. The explicit trigger to halt development if safety lags has been eliminated.
Anthropic argues that overall AI risk depends on the actions of multiple developers, not any single company. If one lab pauses while competitors continue advancing models without strong mitigation measures, it says, the result could be a less safe environment in which the weakest safeguards set the competitive pace.
The revised policy does retain the option to delay development under certain conditions. Anthropic says it would consider slowing deployment if it has a “significant lead” over competitors or if there is strong evidence that all companies developing highly capable models are implementing robust safety measures. If competitors are progressing with weaker protections, however, the company states it “will not necessarily delay AI development and deployment in this scenario.”
The update coincided with a meeting between U.S. Defense Secretary Pete Hegseth and Anthropic CEO Dario Amodei, during which the company was reportedly told to roll back safeguards for military use or risk losing a $200 million Pentagon contract. With the policy revision, no major AI lab now maintains a binding public commitment to halt development if safety measures fall behind model capability.
