We’re moving our blog!
We’re excited to announce that our blog is moving to its new home! From now on, all our new blog posts will be published directly on…
An overview of the minetester and preliminary work
Many of the recent advancements in large language models (LLMs) have been powered by human feedback, usually in the form of preference datasets. Think of preferences…
“Theory of Mind” (ToM) is the ability to understand that others have their own thoughts and beliefs, even when they differ from ours — a skill…
Empowering conservation efforts through innovative technologies and global collaboration. A vessel captured by NASA’s Landsat 8. Skylight’s computer vision models leverage this imagery to identify suspicious behavior,…
Interim report on ongoing work on mechanistic anomaly detection
GPT-NeoX now supports post-training thanks to a collaboration with SynthLabs.
A central goal of the OLMo project is to use our experience to contribute to an open science of LM pretraining to provide a foundation for…
Exploring the implementation details of muTransfer
By Technical Program Manager Christopher Fiorelli. At Ai2, we’re continually looking for ways to expand the impact of AI research. Through collaboration with DSRI, Ai2 was invited…
We’re introducing OLMoE, jointly developed with Contextual AI, which is the first mixture-of-experts model to join the OLMo family. OLMoE brings two important aspects to the…
Blog written by Yuling Gu. Looking for an interpretable explanation evaluation tool that can automatically characterize the explanation capabilities of modern LLMs? Meet Digital Socrates at ACL…
The last few years of AI development have shown the power and potential of generative AI. Naturally, these leaps in machine intelligence have opened existential questions…
Along with our rebrand, we’re excited to debut a new release note process. Because we’re making regular updates and new asset roll-outs in our open ecosystem…
Interim report on ongoing work on mechanistic anomaly detection
Building and evaluating an open-source pipeline for auto-interpretability
Writing up results from a recent project
Achieving even more surgical edits than LEACE without concept labels at inference time.
Writing up results from a project from Spring 2023
Trained T5 on the Pile
Setting the record straight regarding Yi-34B and Llama 2.
Announcing a new resource, the FM Dev Cheatsheet.
Achieving even more surgical edits than LEACE when we have concept labels at inference time.
Explaining a result by Sam Marks and Max Tegmark