Anthropic’s new Claude ‘constitution’: be helpful and honest, and don’t destroy humanity

Anthropic is overhauling Claude’s so-called « soul doc. »

The new missive is a 57-page document titled « Claude’s Constitution, » which details « Anthropic’s intentions for the model’s values and behavior, » aimed not at outside readers but the model itself. The document is designed to spell out Claude’s « ethical character » and « core identity, » including how it should balance conflicting values and high-stakes situations.

Where the previous constitution, published in May 2023, was largely a list of guidelines, Anthropic now says it’s important for AI models to « understand why we want them to behave in certain ways rather than just specifying w …

Read the full story at The Verge.

Leave a Comment Cancel Reply