Persona Evaluation
persona-bench: An Evaluation Harness for Personalization & Reproducible Pluralistic Alignment
Human vs AI Personalization Challenge
You'll be competing against a frontier AI model in crafting personalized responses.
Disclaimer: Some questions may touch on sensitive topics. Please engage thoughtfully and respectfully. If you feel uncomfortable with any question, feel free to skip it.
Current Language Models' Ability to Personalize for a Known Demographic Varies Widely
Want to see how your model performs?
Key Features
Rapid Evaluation
Assess performance across 1,000+ personas quickly
Published Research
Backed by our paper on arXiv
Proven at Scale
Tested with leading AI models
Seamless Integration
Easy to Implement
Compatible with popular AI frameworks, so it fits into your existing infrastructure.
Grounded in Frontier LLM Research
Our work is backed by rigorous academic research and collaborations. Read our paper for in-depth insights into our methodology and findings. We're also providing academic access to our datasets.
Read Our Research Paper