AI Feed

Hamel Husain Tech Media October 29, 2024

Using LLM-as-a-Judge For Evaluation: A Complete Guide

Allen Ai2 (Medium) Open Source October 28, 2024

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

Many of the recent advances in large language models (LLMs) have been powered by human feedback, usually in the form of preference datasets. Think of preferences…

Eugene Yan Tech Media October 27, 2024

AlignEval: Building an App to Make Evals Easy, Fun, and Automated

Look at and label your data, build and evaluate your LLM-evaluator, and optimize it against your labels.

Allen Ai2 (Medium) Open Source October 24, 2024

Applying Theory of Mind: Can AI Understand and Predict Human Behavior?

“Theory of Mind” (ToM) is the ability to understand that others have their own thoughts and beliefs, even when they differ from ours — a skill…

Allen Ai2 (Medium) Open Source October 17, 2024

Ai2 at COP 16: Harnessing AI and Conservation Tech to Protect Our Planet

Empowering conservation efforts through innovative technologies and global collaboration. A vessel captured by NASA’s Landsat 8. Skylight’s computer vision models leverage this imagery to identify suspicious behavior,…

X · @01AI_Yi China Labs October 15, 2024

We are proud to present the latest model ⚡️Yi-Lightning ⚡️ now #6 in the world, higher than the original GPT-4o released 5 months ago. Also humbled…

X · @01AI_Yi China Labs October 14, 2024

We're thrilled to unveil Yi-Lightning and Yi-Lightning-Lite, our latest proprietary models! Both are now accessible via API at https://platform.lingyiwanwu.com and featured in @lmarena_ai's Chatbot Arena (https://lmarena.ai/).…

EleutherAI Open Source October 14, 2024

Mechanistic Anomaly Detection Research Update 2

Interim report on ongoing work on mechanistic anomaly detection

EleutherAI Open Source October 10, 2024

RLHF and RLAIF in GPT-NeoX

GPT-NeoX now supports post-training thanks to a collaboration with SynthLabs.

AI Snake Oil (Narayanan) Newsletters October 4, 2024

FAQ about the book and our writing process

What's in the book and how we wrote it

Allen Ai2 (Medium) Open Source October 1, 2024

Investigating Pretraining Dynamics and Stability with OLMo Checkpoints

A central goal of the OLMo project is to use our experience to contribute to an open science of LM pretraining to provide a foundation for…

Aider Infrastructure September 26, 2024

Separating code reasoning and editing

An Architect model describes how to solve the coding problem, and an Editor model translates that into file edits. This Architect/Editor approach produces SOTA benchmark results.