Allen Ai2 (Medium) · Open Source

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

Many of the recent advances in large language models (LLMs) have been powered by human feedback, usually in the form of preference datasets. Think of a preference as a judgment between two (or more) model outputs given a user prompt. Say, for example, you’re deciding between two ad copies for a sourdough business you’