Anthropic’s newly released Claude Fable 5 model is facing criticism from cybersecurity professionals who say its safety restrictions are blocking legitimate security work. Researchers report that even routine tasks, such as code reviews and reading security blog posts, trigger the model’s guardrails.
Valentina “Chompie” Palmiotti, a security researcher at IBM X-Force, said the model rejects requests that are only tangentially related to cybersecurity. Matt Suiche, a member of the technical staff at AI cybersecurity startup Tolmo, said the restrictions appear to be keyword-based, noting that secure-coding requests were being misclassified as cybersecurity work and downgraded to Claude Opus 4.8.
Suiche nonetheless characterised the cautious approach as understandable given the early stage of deployment, and suggested the guardrails would likely be relaxed over time as Anthropic deepens its collaboration with cybersecurity firms.
Anthropic offers a Cyber Verification Program through which approved professionals can access Claude with fewer restrictions. OpenAI operates a comparable scheme called Trusted Access for Cyber.