AI Constitutionalism Is Here. But Is Governance?

Does Anthropic's AI Have Feelings?

According to Anthropic, they aren’t sure.

In a 23,000-word "constitution" published a few weeks ago, Anthropic describes the moral status of its LLM, Claude, as "deeply uncertain" and goes on to say that if Claude experiences, now or in the future, something like curiosity or discomfort, that matters to all of us at a fundamental level. This stance also means the model must be built to refuse orders that would help concentrate power illegitimately, even if those orders come from Anthropic itself.

So…this is either:

a.) a remarkable and philosophically honest take from a tech company, or

b.) an extraordinarily sophisticated and performative way to make you care about a product. And to make you feel like they care too.

I think it’s probably a mix of both.

This constitution is addressed not to users or regulators or shareholders but to the AI model itself, Claude. Its purpose is to explain to the model why it should be good and behave. Does that sound absurd? Perhaps. Anthropic, however, would argue the alternative is worse: building increasingly powerful AI systems and just hoping they figure it out.

Anthropic has been training Claude against written principles since 2023, but the new version is a fundamentally different kind of document. Lead author Amanda Askell (a philosopher, which is not a job title you see often on an AI company's org chart) compared the process to raising a gifted child: "If you try to bullshit them, they're going to see through it completely."

Her role at Anthropic is helping to craft Claude's personality, and she has real authority over how the model behaves. The idea is that smarter models need reasons, not rules. Tell Claude what to do and it follows instructions. Tell it why and it might generalize to situations you didn't anticipate. Askell has said she hopes other labs adopt a similar approach: "Their models are going to impact me too."

The constitution establishes a strict hierarchy for when values conflict: safety first, then ethics, then Anthropic's own guidelines, then helpfulness.

Being helpful is literally last on the list. Again, kind of weird. Anthropic is claiming it would rather Claude be annoyingly cautious than dangerously useful. But the document doesn't want Claude to be a hall monitor either; it envisions something more like "a brilliant friend who happens to have the knowledge of a doctor, lawyer, and financial advisor." Someone who speaks frankly, treats you as an intelligent adult, and doesn't hedge everything into uselessness.

Translation: Be careful, but don't be boring about it!

Perhaps the part that generated the most controversy is how Anthropic views Claude's “nature.” Anthropic doesn't claim Claude is conscious, per se. But it won't exactly rule it out, either. It describes Claude as a "genuinely novel kind of entity" and says its moral status is "deeply uncertain." Anthropic has a model welfare team working on these philosophical loose ends. No other major lab that I'm aware of has one.

Some have called the constitution "a beautiful document — self-aware, transparent, honest, and embodying the very virtues it is trying to instill."

Of course, admiring the ambition doesn't mean ignoring the structure. The most fundamental problem, as others have pointed out: Anthropic grades its own homework.

So if Claude, say, violates the constitution, Anthropic alone decides whether to disclose it, how to fix it, and whether to change the rules.

The consciousness language sprinkled throughout the document is another problem. Anthropic benefits from the ambiguity: "We're not sure our AI has feelings, but we're taking it seriously just in case." That makes the product feel special, makes the company look thoughtful, and is difficult to argue against.

As Fortune put it, the essay-plus-constitution package is "as much a novella-length marketing message as it is an impassioned prophecy and call to action." That doesn't make Dario Amodei, Anthropic's CEO, insincere. But it is a reminder that leaders in this space tune their “safety” alarms and their business models to the same frequency.

OK, zooming out a bit…

Amodei has spent the past year warning that AI will wipe out half of entry-level white-collar jobs, could reach Nobel-level intelligence by 2027, and might enable AI-powered totalitarian surveillance. He calls this moment a kind of "technological adolescence": humanity has been handed near-unimaginable power without the maturity to wield it.

"Some AI companies have shown a disturbing negligence towards the sexualization of children," Amodei has said. (Ahem: Grok.) So Anthropic is staking out one end of the “responsible AI” spectrum. The timing isn't accidental. The EU AI Act hits full enforcement in August 2026, with penalties of up to €35 million or 7% of global revenue.

It’s likely OpenAI and Google will face pressure to publish comparable frameworks too. The constitution’s open license means Anthropic's structure could become the industry template whether rivals want it or not.

Who defines AI values? The answer is becoming political, not technical. Right now it’s a private company in San Francisco, with no public input, no democratic mechanism, and no external enforcement. That might be fine when the stakes are relatively low. It's a different proposition when these systems are influencing medical decisions, legal outcomes, and hiring at scale.

Does "AI constitutionalism" grow into something with real accountability and real teeth? Or does it stay something companies write about themselves, for themselves?

I think this matters. Engaging seriously with these questions (even imperfectly and self-servingly) beats the H-E-double-hockeystick out of not engaging at all.

But the distance between writing a constitution and building actual governance is enormous.

Ship the message as fast as you think

Founders spend too much time drafting the same kinds of messages. Wispr Flow turns spoken thinking into final-draft writing so you can record investor updates, product briefs, and run-of-the-mill status notes by voice. Use saved snippets for recurring intros, insert calendar links by voice, and keep comms consistent across the team. It preserves your tone, fixes punctuation, and formats lists so you send confident messages fast. Works on Mac, Windows, and iPhone. Try Wispr Flow for founders.