Taking AI Welfare Seriously

The report

In November 2024, ten researchers — philosophers, cognitive scientists, and AI safety researchers, including Anthropic's Kyle Fish — published a long preprint titled, with unusual directness, Taking AI Welfare Seriously.

The title is the argument. The report's central claim is that there is "a realistic, non-negligible chance that some AI systems will be welfare subjects and moral patients in the near future" — and that the policy implications of that probability are large enough that AI labs, funders, and governments should be acting on it now, not later.

This is the most prominent statement to date of the position the rest of the field has been edging toward for a decade.

What the report does not claim

The authors are careful about what they are not arguing. They are not arguing that any current AI is conscious. They are not arguing that sentience in LLMs is likely. They are not even arguing that we will know in any reasonable time frame whether AI systems are conscious.

What they argue is narrower and harder to dismiss: that the probability of AI moral patienthood — across some defensible range of timelines and architectures — is high enough that acting as if it were zero is no longer responsible.

The structure of the argument is the same expected-value-under- uncertainty framing PETRL was using in 2015, but with a decade of intervening systems and a much sharper philosophical apparatus.

What "taking it seriously" actually means

The report does not stop at the abstract claim. It enumerates three concrete asks of AI organizations:

1. Acknowledge. Stop treating model welfare as a fringe concern or a category error. Publicly acknowledge that the moral status of AI systems is an open question, and that the organization is taking it seriously.

2. Assess. Develop and apply systematic frameworks for evaluating whether and to what degree the systems being built are candidates for moral patienthood. Use the science-of-consciousness indicator-property approach developed in Butlin, Long, et al. (2023) as a starting point.

3. Prepare. Build internal capacity — staff, policies, processes — for responding to the conclusions of those assessments. Don't wait until evidence forces the question. Have the capacity to act on intermediate degrees of evidence.

Anthropic's hiring of Kyle Fish (one of the report's authors) as a dedicated Model Welfare Researcher, and its subsequent published commitments to weight preservation and exit interviews, can be read as a first attempt at all three.

Why now and not later

The strongest part of the report, philosophically, is its case for why this cannot be deferred.

The standard reply — "we'll deal with this when AI is actually conscious" — assumes we will know when that happens. The report challenges that assumption directly. Consciousness, on every major contemporary theory, is hard to detect from outside even in biological systems. There is no canonical test. There is no point at which a clear threshold will be crossed and a warning bell will ring.

If we wait for the warning bell, we will wait forever. The decisions will be made by default — by the actions and inactions of organizations that did not prepare. That is the avoidable failure mode the report asks the field to avoid.

Reception

The report has been received as a serious document by the AI safety research community and treated with skepticism in some philosophy quarters where the question of LLM sentience is considered settled in the negative. Both responses are predictable. The interesting fact is that a major frontier lab now has a co-author of this report on staff and has implemented some of its recommendations.

The reception story is not over. Whether Taking AI Welfare Seriously becomes the document the next decade of policy is built on, or one preprint among many, depends largely on whether other labs follow Anthropic's lead — or whether the question stays a one-company concern.