We work on frontier challenges in AI post-training with an open science approach
Discover our latest AI publications and updates from the research lab
While current language models excel at pattern recognition, they often struggle with tasks requiring genuine reasoning—the kind of multi-step, logical deduction that humans use to solve complex problems. Our research addresses this fundamental gap by developing models that learn how to think through problems, rather than merely memorizing solutions. We introduce novel approaches combining tree search algorithms, process guidance through reward models, and meta-learning frameworks to enable more robust reasoning capabilities. Our work spans mathematical proof generation, scientific problem-solving, and complex decision-making tasks, with the goal of advancing AI systems beyond pattern matching toward true analytical thinking.
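To make one ingredient of this agenda concrete, the sketch below shows best-first search over partial reasoning traces scored by a process reward model. It is a minimal, hypothetical illustration: `propose_steps` and `score_step` are stand-ins for a language model's step generator and a learned reward model, not our actual implementation.

```python
# Minimal sketch of process-reward-guided tree search over reasoning steps.
# `propose_steps` and `score_step` are hypothetical stubs.
import heapq

def propose_steps(state, k=3):
    """Stub: an LLM would propose k candidate next reasoning steps."""
    return [state + [f"step-{len(state)}-{i}"] for i in range(k)]

def score_step(state):
    """Stub heuristic: a real process reward model scores step quality."""
    return len(state)

def best_first_search(initial_state, budget=20):
    """Repeatedly expand the highest-scoring partial reasoning trace."""
    frontier = [(-score_step(initial_state), initial_state)]
    best = initial_state
    for _ in range(budget):
        if not frontier:
            break
        _, state = heapq.heappop(frontier)
        for child in propose_steps(state):
            if score_step(child) > score_step(best):
                best = child
            heapq.heappush(frontier, (-score_step(child), child))
    return best
```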
Access frontier generative AI post-training capabilities as a research partner
Explore our models, contribute to research, and join our growing community of AI researchers and practitioners.
We introduce Generative Reward Models (GenRM), a novel approach to AI alignment that combines the strengths of human feedback and AI-generated feedback. Our research focuses on improving AI systems' ability to understand and adhere to human values and preferences across diverse contexts. By leveraging Chain-of-Thought (CoT) reasoning and innovative training techniques, GenRM aims to create more robust, generalizable, and ethically aligned AI systems.
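As a rough illustration of the idea (not the paper's exact prompt or training setup), a generative reward model replaces a scalar reward head with an LLM that writes out a chain-of-thought critique before committing to a verdict. The `generate` callable and template below are assumptions for the sketch:

```python
# Sketch of a generative reward model: the judge reasons in text,
# then emits a verdict that can be parsed into a preference label.
JUDGE_TEMPLATE = """You are evaluating two responses to a prompt.
Prompt: {prompt}
Response A: {a}
Response B: {b}
Think step by step about which response better follows the prompt,
then end with a single line: VERDICT: A or VERDICT: B."""

def generative_reward(generate, prompt, a, b):
    """Return ('A' or 'B', full critique) from an LLM judge."""
    critique = generate(JUDGE_TEMPLATE.format(prompt=prompt, a=a, b=b))
    verdict = critique.strip().splitlines()[-1]
    return ("A" if verdict.endswith("A") else "B"), critique
```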
SynthLabs and EleutherAI are excited to announce large-scale post-training and preference learning in GPT-NeoX, one of the most widely adopted pretraining frameworks for large-scale language models. One of the many efforts within our deep partnership with EleutherAI is improving the accessibility and performance of preference learning at scale.
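For readers unfamiliar with preference-learning objectives, the sketch below shows a DPO-style loss of the kind such a pipeline optimizes. It is a self-contained PyTorch illustration under assumed inputs (summed log-probabilities of the chosen and rejected responses under the policy and a frozen reference model), not the GPT-NeoX code itself.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """-log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between implicit chosen/rejected rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```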
PERSONA introduces a reproducible testbed designed to evaluate and improve LLM pluralistic alignment through 1,586 synthetic personas derived from US census data. The framework encompasses 3,868 prompts and 317,200 feedback pairs, establishing both PERSONA Bench for systematic evaluation of language models' role-playing capabilities and a comprehensive dataset for developing future alignment benchmarks.
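To illustrate the shape of persona-conditioned feedback (with a made-up schema; the released dataset's fields and prompts differ), one might collect preference labels like this:

```python
# Hypothetical sketch of persona-conditioned preference collection:
# an LLM role-plays a synthetic persona and judges two responses.
from dataclasses import dataclass

@dataclass
class Persona:
    age: int
    occupation: str
    values: str  # free-text description of the persona's priorities

def persona_preference(generate, persona, prompt, response_a, response_b):
    """Return 'A' or 'B' as judged from the persona's perspective."""
    instruction = (
        f"You are a {persona.age}-year-old {persona.occupation} "
        f"who values {persona.values}. Given the prompt:\n{prompt}\n"
        f"A: {response_a}\nB: {response_b}\n"
        "Answer with exactly one letter, A or B."
    )
    return generate(instruction).strip()[:1]
```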
2024: This paper introduces Direct Preference Optimization (DPO), a novel approach for training language models that leverages preference data...
2021: This work presents COMBO, a novel approach for offline reinforcement learning that combines model-based and conservative policy optimization techniques...
2023: This paper introduces a large-scale, multi-embodiment robotics dataset and the RT-X family of models for robotic learning across diverse embodiments...
2023: This paper introduces RWKV, a novel architecture that combines the efficiency of RNNs with the expressiveness of Transformers...
2023: This work presents Logic-LM, a method for enhancing language models with symbolic reasoning capabilities...
2024: This comprehensive survey explores various techniques for selecting and curating data for training language models...
Our five most recent publications
2025-01-07
Meta Chain-of-Thought: Unlocking System 2 Reasoning in LLMs
Dakota Mahan, Duy Van Phung, Rafael Rafailov, Chase Blagden, Nathan Lile, Louis Castricato, Jan-Philipp Fränken, Chelsea Finn, Alon Albalak

2024-10-03
GenRM: Generative Reward Models for AI Alignment
Dakota Mahan, Duy Van Phung, Rafael Rafailov, Chase Blagden, Nathan Lile, Louis Castricato, Jan-Philipp Fränken, Chelsea Finn, Alon Albalak

2024-10-09
RLHF and RLAIF in GPT-NeoX
Dakota Mahan, Quentin Anthony, Louis Castricato, Nathan Lile, Stella Biderman

2024-07-24
PERSONA: A Reproducible Testbed for Pluralistic Alignment
Louis Castricato, Nathan Lile, Rafael Rafailov, Jan-Philipp Fränken, Chelsea Finn

2024-02-12
Suppressing Pink Elephants with Direct Principle Feedback
Louis Castricato, Nathan Lile, Suraj Anand, Hailey Schoelkopf, Siddharth Verma, Stella Biderman
We're always open to new collaborations and ideas. If you're interested in working with us or have any questions, please reach out!