Anthropic drops binding AI safety pledge in revised policy

February 27, 2026

Anthropic has revised its Responsible Scaling Policy, removing a binding commitment to halt development if its AI models outpace its ability to guarantee safety. The change replaces a hard stop requirement with a more flexible “Frontier Safety Roadmap,” shifting the decision to continue development to the company’s discretion.

When Anthropic launched Claude in March 2023, it positioned the assistant as “a next-generation AI assistant based on Anthropic’s research into training helpful, honest, and harmless AI systems.” That safety-first stance was formalized in September 2023 through its Responsible Scaling Policy (RSP), which stated that if model capabilities exceeded the company’s safety controls, Anthropic would pause training and deployment until safeguards caught up.

On Tuesday, the company published a rewritten version of the RSP that removes that unconditional pause commitment. Instead, the updated policy outlines risk mitigation steps and emphasizes public transparency about model risks and safeguards. The explicit trigger to halt development if safety lags has been eliminated.

Anthropic argues that overall AI risk depends on the actions of multiple developers. If one company pauses while others continue advancing models without strong mitigation measures, it says, the result could be a less safe environment where weaker safeguards define the competitive pace.

The revised policy does retain the option to delay development under certain conditions. Anthropic says it would consider slowing deployment if it has a “significant lead” over competitors or if there is strong evidence that all companies developing highly capable models are implementing robust safety measures. If competitors are progressing with weaker protections, however, the company states it “will not necessarily delay AI development and deployment in this scenario.”

The update coincided with a meeting between U.S. Defense Secretary Pete Hegseth and Anthropic CEO Dario Amodei, during which the company was reportedly told to roll back safeguards for military use or risk losing a $200-million Pentagon contract. With the policy revision, no major AI lab now maintains a binding public commitment to halt development if safety measures fall behind model capability.

Staff Writer

Jim Love