AI Feed

Eugene Yan Tech Media September 22, 2024

Weights & Biases LLM-Evaluator Hackathon – Hackathon Judge

Being a human judge at the Weights & Biases LLM-as-a-Judge Hackathon

Anthropic Engineering Frontier Labs September 19, 2024

Introducing Contextual Retrieval

For an AI model to be useful in specific contexts, it often needs access to background knowledge.

EleutherAI Open Source September 19, 2024

The Practitioner's Guide to the Maximal Update Parameterization

Exploring the implementation details of muTransfer

Alibaba Qwen News September 18, 2024

Qwen2.5: A Party of Foundation Models!

In the past three months since Qwen2’s release, numerous developers have built new models on the Qwen2 language models,…

AI Snake Oil (Narayanan) Newsletters September 18, 2024

Can AI automate computational reproducibility?

A new benchmark to measure the impact of AI on improving science

X · @01AI_Yi China Labs September 13, 2024

Great to see how easy it is to build a search page with Yi-Coder and Cursor! Check out this useful tutorial!

Second State: Write a Search Webpage…

Aider Infrastructure September 12, 2024

o1-preview is SOTA on the aider leaderboard

Preliminary benchmark results for the new OpenAI o1 models.

AI Snake Oil (Narayanan) Newsletters September 10, 2024

Start reading the AI Snake Oil book online

The book was published September 2024

The Gradient Newsletters September 9, 2024

What's Missing From LLM Chatbots: A Sense of Purpose

LLM-based chatbots’ capabilities have been advancing every month. These improvements are mostly measured by benchmarks like MMLU, HumanEval, and MATH (e.g. sonnet 3.5, gpt-4o). However, as…

X · @01AI_Yi China Labs September 8, 2024

Huge thanks to @josephpollack for bringing this amazing demo! Now Yi-Coder's power is at your fingertips!

Joseph Pollack #Ï 🎗️: 🙋🏻‍♂️hey there folks, just released a coding…

Eugene Yan Tech Media September 8, 2024

Building the Same App Using Various Web Frameworks

FastAPI, FastHTML, Next.js, SvelteKit, and thoughts on how coding assistants influence builders' choices.

X · @01AI_Yi China Labs September 7, 2024

We hear awesome feedback on our Sep 4 Yi-Coder release and so glad the community finds it helpful! Here's more scoop🍦on our tech blog — "Meet Yi-C…

X · @01AI_Yi China Labs September 5, 2024

RT VentureBeat: Yi-Coder: The open-source AI that wants to be your coding buddy https://venturebeat.com/ai/yi-coder-the-open-source-ai-that-wants-to-be-your-coding-buddy/

Allen Ai2 (Medium) Open Source September 5, 2024

Ai2 at DEF CON 32

By Technical Program Manager Christopher Fiorelli. At Ai2, we’re continually looking for ways to expand the impact of AI research. Through collaboration with DSRI, Ai2 was invited…

Allen Ai2 (Medium) Open Source September 4, 2024

OLMoE: An open, small, and state-of-the-art mixture-of-experts model

We’re introducing OLMoE, jointly developed with Contextual AI, which is the first mixture-of-experts model to join the OLMo family. OLMoE brings two important aspects to the…

AI Snake Oil (Narayanan) Newsletters August 19, 2024

AI companies are pivoting from creating gods to building products. Good.

Turning models into products runs into five challenges

Eugene Yan Tech Media August 18, 2024

Evaluating the Effectiveness of LLM-Evaluators (aka LLM-as-Judge)

Use cases, techniques, alignment, finetuning, and critiques against LLM-evaluators.

Allen Ai2 (Medium) Open Source August 12, 2024

Digital Socrates: Evaluating LLMs through Explanation Critiques

Blog written by Yuling Gu. Looking for an interpretable explanation evaluation tool that can automatically characterize the explanation capabilities of modern LLMs? Meet Digital Socrates at ACL…

Allen Ai2 (Medium) Open Source August 8, 2024

Open research is the key to unlocking safer AI

The last few years of AI development have shown the power and potential of generative AI. Naturally, these leaps in machine intelligence have opened existential questions…

Allen Ai2 (Medium) Open Source August 5, 2024

Latest and greatest: Ai2’s release notes

Along with our rebrand, we’re excited to debut a new release note process. Because we’re making regular updates and new asset roll-outs in our open ecosystem…

EleutherAI Open Source August 5, 2024

Mechanistic Anomaly Detection Research Update

Interim report on ongoing work on mechanistic anomaly detection

X · @01AI_Yi China Labs August 5, 2024

🔥 Meet Yi-Large Turbo: the powerful, cost-effective upgrade to Yi-Large. Faster and more affordable at only $0.19 per 1M tokens for input and output. Ideal for…

The Gradient Newsletters August 3, 2024

We Need Positive Visions for AI Grounded in Wellbeing

Imagine yourself a decade ago, jumping directly into the present shock of conversing naturally with an encyclopedic AI that crafts images, writes code, and debates philosophy.…

EleutherAI Open Source July 30, 2024

Open Source Automated Interpretability for Sparse Autoencoder Features

Building and evaluating an open-source pipeline for auto-interpretability
