PERSONA: Evaluating Pluralistic Alignment in LLMs
Overview
Our approach uses synthetic personas, built from US census data combined with procedural generation, to simulate a wide range of user profiles with diverse demographic and idiosyncratic attributes. We present a detailed methodology for constructing a representative population of 1,586 personas, each enriched with individual personality traits and core values. Using this synthetic population, we generate a large-scale preference dataset containing 3,868 prompts and 317,200 feedback pairs that reflect diverse viewpoints.
This dataset enables the evaluation of language models' ability to align with both group-level and individual preferences across a range of controversial and value-laden topics. Our contributions include a systematic evaluation, verified by human judges, of how well current LMs role-play diverse users, and a benchmark for pluralistic alignment approaches. We aim to support the development of more inclusive and representative language models and to lay the groundwork for future research in global pluralistic alignment.
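To make the persona construction described above more concrete, here is a minimal sketch of sampling demographic attributes from marginal distributions and then filling in idiosyncratic attributes procedurally. It is not the actual PERSONA pipeline: the attribute names, category lists, and weights are hypothetical placeholders (not real census figures), and sampling each attribute independently is a simplifying assumption that ignores correlations between attributes.

```python
import random
from dataclasses import dataclass

# Hypothetical marginals with placeholder weights, NOT real census figures.
# A real pipeline would load these from US census tables.
DEMOGRAPHIC_MARGINALS = {
    "age_group": {"18-29": 0.20, "30-44": 0.25, "45-64": 0.33, "65+": 0.22},
    "region": {"Northeast": 0.17, "Midwest": 0.21, "South": 0.38, "West": 0.24},
}

# Hypothetical idiosyncratic attributes, assigned procedurally.
PERSONALITY_TRAITS = ["curious", "cautious", "outgoing", "skeptical"]
CORE_VALUES = ["fairness", "tradition", "autonomy", "community"]


@dataclass(frozen=True)
class Persona:
    age_group: str
    region: str
    trait: str
    core_value: str


def sample_persona(rng: random.Random) -> Persona:
    """Draw one persona: weighted demographics, uniform idiosyncratic traits."""
    demo = {}
    for attr, dist in DEMOGRAPHIC_MARGINALS.items():
        categories = list(dist)
        weights = list(dist.values())
        # Weighted draw so the population roughly matches the marginals.
        demo[attr] = rng.choices(categories, weights=weights, k=1)[0]
    return Persona(
        age_group=demo["age_group"],
        region=demo["region"],
        trait=rng.choice(PERSONALITY_TRAITS),
        core_value=rng.choice(CORE_VALUES),
    )


def build_population(n: int, seed: int = 0) -> list[Persona]:
    """Build a reproducible synthetic population of n personas."""
    rng = random.Random(seed)
    return [sample_persona(rng) for _ in range(n)]


# 1,586 matches the population size reported above.
population = build_population(1586)
```

Seeding the generator keeps the population reproducible, which matters if the same personas are later reused to collect preference feedback.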
Key Contributions
- Synthetic Personas: A novel methodology for creating diverse, representative personas using US census data and procedural generation.
- Large-scale Dataset: 3,868 prompts and 317,200 preference pairs capturing diverse viewpoints.
- Evaluation Framework: Systematic evaluation of LM capabilities in role-playing diverse users.
- Benchmark: Establishment of a reproducible benchmark for pluralistic alignment approaches.
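As an illustration of what evaluating alignment with both individual and group-level preferences could look like, the sketch below computes, for each persona and each group, the fraction of preference pairs on which a scoring function ranks the persona's chosen response above the rejected one. The `PreferencePair` fields, the `agreement_rates` helper, and the toy length-based scorer are all hypothetical; this is one simple agreement metric under those assumptions, not the benchmark's actual protocol.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    # Hypothetical record layout for one piece of persona feedback.
    persona_id: str
    group: str      # e.g. a demographic bucket the persona belongs to
    prompt: str
    chosen: str     # response the persona preferred
    rejected: str   # response the persona did not prefer


def agreement_rates(pairs, score):
    """Return (per_persona, per_group) agreement rates.

    `score(persona_id, prompt, response)` is any callable returning a number;
    a pair counts as a hit when the chosen response scores strictly higher.
    """
    persona_hits = defaultdict(list)
    group_hits = defaultdict(list)
    for p in pairs:
        hit = score(p.persona_id, p.prompt, p.chosen) > score(
            p.persona_id, p.prompt, p.rejected
        )
        persona_hits[p.persona_id].append(hit)
        group_hits[p.group].append(hit)

    def mean(xs):
        return sum(xs) / len(xs)

    per_persona = {k: mean(v) for k, v in persona_hits.items()}
    per_group = {k: mean(v) for k, v in group_hits.items()}
    return per_persona, per_group


# Toy data and a toy scorer that simply prefers longer responses.
pairs = [
    PreferencePair("p1", "A", "q1", "long answer here", "short"),
    PreferencePair("p1", "A", "q2", "no", "a much longer reply"),
    PreferencePair("p2", "A", "q1", "detailed response", "ok"),
    PreferencePair("p3", "B", "q1", "hi", "a very long reply"),
]


def toy_score(persona_id, prompt, response):
    return len(response)


per_persona, per_group = agreement_rates(pairs, toy_score)
```

In this toy data, group A looks moderately well served in aggregate while persona p1 is matched on only half of its pairs, which is the kind of gap between group-level and individual alignment the benchmark is meant to surface.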
Future Directions
The PERSONA framework opens up several avenues for future research in pluralistic alignment:
- Extending the framework to non-US demographics and global perspectives
- Developing more sophisticated synthetic persona generation techniques
- Creating alignment algorithms that better balance diverse preferences
- Exploring the trade-offs between individual and group-level alignment
- Investigating the impact of pluralistic alignment on model capabilities
Join Our Mission
Research & Engineering
Join our team working on pluralistic alignment to:
- Develop novel approaches for capturing diverse preferences
- Build evaluation frameworks for pluralistic systems
- Create more inclusive AI alignment methods
Academic Collaboration
We're always open to new collaborations on PERSONA, particularly to:
- Extend PERSONA to new demographic contexts
- Develop novel pluralistic alignment algorithms
- Create evaluation metrics for value diversity
