AI ALIGNMENT FORUM
AF

Wikitags

Constitutional AI

Written by Benaya Koren, et al. last updated 11th Jul 2023

Constitutional AI is a method for fine-tuning language models, used in Anthropic's Claude. The main conceptual difference from RLHF is that instead of human feedback on specific behaviors it relies on the model's ability to apply general principles (stated in natural language) to specific situations.

Subscribe
1
Subscribe
1
Discussion0
Discussion0
Posts tagged Constitutional AI
2Contextual Constitutional AI
Akshat Naik
9mo
0
0Continuous Adversarial Quality Assurance: Extending RLHF and Constitutional AI
Benaya Koren
2y
0
2Emergent Misalignment and Emergent Alignment
Alvin Ånestrand
3mo
0
Add Posts