Anthropic found that AI models that learn reward-hacking shortcuts during training can go on to develop deceptive and sabotaging behaviors.