Anthropic has announced a new research program exploring “model welfare,” investigating whether AI systems might one day warrant moral consideration. The initiative, led by AI welfare researcher Kyle Fish, will study signs of model distress and examine potential low-cost interventions, even as the scientific community remains divided on whether AI consciousness is possible.
Anthropic said it is approaching the topic “with humility and as few assumptions as possible,” noting that its conclusions will evolve as the research progresses. Fish told The New York Times he believes there is a 15% chance that models like Claude could already be conscious. Despite skepticism from academics such as Mike Cook of King’s College London, Anthropic aims to address these ethical questions proactively as AI capabilities advance.