SynthLabs Research Hub

Scaling Up Good Synthetic Reasoning

Latest Research

GenRM: Generative Reward Models for AI Alignment

We introduce Generative Reward Models (GenRM), a novel approach to AI alignment that combines the strengths of human feedback and AI-generated feedback. Our research focuses on improving AI systems' ability to understand and adhere to human values and preferences across diverse contexts. By leveraging Chain-of-Thought (CoT) reasoning and innovative training techniques, GenRM aims to create more robust, generalizable, and ethically aligned AI systems.
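As a rough illustration of the generative reward modeling idea described above, the Python sketch below asks a language model to produce a chain-of-thought critique of two candidate responses and then parses out a preference. The prompt template, the generate_fn stand-in, and the verdict parsing are assumptions for illustration only, not the exact setup used in the paper.

```python
# Minimal sketch of a generative reward model judgment step.
# `generate_fn` is a stand-in for any chat/completion call; the prompt
# template and verdict parsing are illustrative, not the paper's setup.
from typing import Callable

JUDGE_TEMPLATE = """You are evaluating two candidate responses.

Question: {question}

Response A: {response_a}

Response B: {response_b}

Think step by step about which response better follows the instructions
and reflects human preferences, then end with "Verdict: A" or "Verdict: B".
"""

def genrm_preference(question: str,
                     response_a: str,
                     response_b: str,
                     generate_fn: Callable[[str], str]) -> str:
    """Return 'A' or 'B' by letting the model reason (CoT) before judging."""
    prompt = JUDGE_TEMPLATE.format(question=question,
                                   response_a=response_a,
                                   response_b=response_b)
    critique = generate_fn(prompt)  # chain-of-thought judgment text
    verdict = critique.rsplit("Verdict:", 1)[-1].strip()[:1].upper()
    return verdict if verdict in ("A", "B") else "A"  # fall back on parse failure
```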

Latest Breakthroughs

Explore our most recent AI advancements and discoveries.


PERSONA: A Reproducible Testbed for Pluralistic Alignment

The rapid advancement of language models (LMs) necessitates robust alignment with diverse user values. However, current preference optimization approaches often fail to capture the plurality of user opinions, instead reinforcing majority viewpoints and marginalizing minority perspectives. We introduce PERSONA, a reproducible testbed designed to evaluate and improve pluralistic alignment of LMs. We procedurally generate diverse user profiles from US census data, resulting in 1,586 synthetic personas with varied demographic and idiosyncratic attributes. We then generate a large-scale evaluation dataset containing 3,868 prompts and 317,200 feedback pairs obtained from our synthetic personas. Leveraging this dataset, we systematically evaluate LM capabilities in role-playing diverse users, verified through human judges, and establish both a benchmark, PERSONA Bench, for pluralistic alignment approaches and an extensive dataset for creating new and future benchmarks.
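The persona-generation step can be pictured with a small sketch like the one below. The attribute pools, trait lists, and prompt wording here are placeholders for illustration, not the census-derived distributions or templates used in PERSONA.

```python
# Illustrative sketch of procedural persona generation in the spirit of PERSONA.
# Attribute pools and traits are placeholders, not the paper's census data.
import random

ATTRIBUTE_POOLS = {
    "age": list(range(18, 95)),
    "region": ["Northeast", "Midwest", "South", "West"],
    "education": ["High school", "Some college", "Bachelor's", "Graduate"],
    "political_lean": ["Liberal", "Moderate", "Conservative"],
}

IDIOSYNCRATIC = ["enjoys hiking", "avid reader", "amateur chef",
                 "volunteers locally", "collects vinyl records"]

def sample_persona(rng: random.Random) -> dict:
    """Sample one synthetic persona: demographic plus idiosyncratic traits."""
    persona = {key: rng.choice(pool) for key, pool in ATTRIBUTE_POOLS.items()}
    persona["quirks"] = rng.sample(IDIOSYNCRATIC, k=2)
    return persona

def persona_system_prompt(persona: dict) -> str:
    """Turn a persona dict into a role-playing instruction for an LM."""
    traits = ", ".join(f"{k}: {v}" for k, v in persona.items() if k != "quirks")
    quirks = "; ".join(persona["quirks"])
    return f"Answer as a person with these attributes ({traits}; {quirks})."

if __name__ == "__main__":
    rng = random.Random(0)
    print(persona_system_prompt(sample_persona(rng)))
```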

Suppressing Pink Elephants with Direct Principle Feedback

Existing methods for controlling language models, such as RLHF and Constitutional AI, involve determining which LLM behaviors are desirable and training them into a language model. However, in many cases it is desirable for LLMs to be controllable at inference time, so that they can be used in multiple contexts with diverse needs. We illustrate this with the Pink Elephant Problem: instructing an LLM to avoid discussing a certain entity (a "Pink Elephant") and instead discuss a preferred entity ("Grey Elephant"). We apply a novel simplification of Constitutional AI, Direct Principle Feedback (DPF), which skips the ranking of responses and uses DPO directly on critiques and revisions. Our results show that after DPF fine-tuning on our synthetic Pink Elephants dataset, our 13B fine-tuned LLaMA 2 model significantly outperforms Llama-2-13B-Chat and a prompted baseline, and performs as well as GPT-4 on our curated test set assessing the Pink Elephant Problem.
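Since Direct Principle Feedback drops the ranking step and optimizes directly on critique-and-revision pairs, the training data construction and objective can be sketched roughly as below. The field names and tensor arguments are illustrative, and the loss shown is the standard DPO formulation rather than the paper's exact training code.

```python
# Sketch of Direct Principle Feedback data construction: the revised response
# (which avoids the "Pink Elephant") is preferred over the original, and the
# pair is fed straight to DPO without any ranking step.
import torch
import torch.nn.functional as F

def dpf_pair(prompt: str, original: str, revision: str) -> dict:
    """One DPO training example: revision is 'chosen', original is 'rejected'."""
    return {"prompt": prompt, "chosen": revision, "rejected": original}

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO objective over sequence log-probabilities."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```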

Join the team

Research Team

Rafael Mitkov Rafailov

Research Scientist

Alon Albalak

Research Scientist

Collaborators

EleutherAI · Stanford University

Recent Publications

Our three most recent publications

2024-10-03

GenRM: Generative Reward Models for AI Alignment

Dakota Mahan, Duy Van Phung, Rafael Rafailov, Chase Blagden, Nathan Lile, Louis Castricato, Jan-Philipp Fränken, Chelsea Finn, Alon Albalak

2024-07-24

PERSONA: A Reproducible Testbed for Pluralistic Alignment

Louis Castricato, Nathan Lile, Rafael Rafailov, Jan-Philipp Fränken, Chelsea Finn

2024-02-12

Suppressing Pink Elephants with Direct Principle Feedback

Louis Castricato, Nathan Lile, Suraj Anand, Hailey Schoelkopf, Siddharth Verma, Stella Biderman


Interested in Collaboration?

We're always open to new collaborations and ideas. If you're interested in working with us or have any questions, please reach out!