Ruff ci (#194)
Ruff ci (#194): apply ruff; rename; specify ruff version for CI; also check imports; check formatting
Every story across every category, newest first. Each card links to the original publisher; daily-brief posts open as editorial pages.
After the release of Qwen2.5, we heard the community’s demand for processing longer contexts. In recent months, we…
Seemingly minor technical decisions can have life-or-death effects
Today, we are excited to open source the “Powerful”, “Diverse”, and “Practical” Qwen2.5-Coder series, dedicated to continuously promoting…
Merge pull request #619 from 01-ai/Anonymitaet-patch-2 Update README.md
Using interpretations of SAE latents to simulate activations.
We’re excited to announce that our blog is moving to its new home! From now on, all our new blog posts will be published directly on…
Merge pull request #618 from 01-ai/Haijian06-patch-2 Update README.md
ML systems, production & scaling, execution & collaboration, building for users, conference etiquette.
🌍 Exciting news from our developer community! We're thrilled to share a blog on Refactor Earth, which explores an innovative approach to sustainable AI. By combining Yi-Large and…
An overview of the minetester and preliminary work
Thrilled to see such widespread adoption of Yi! Huge thanks to @huggingface, @ollama, and mradermacher for your incredible support! ollama run hf(.)co/mradermacher/Yi-1.5-34B-Chat-16K-GGUF #Yi34B #LLM #AI Julien Chaumond: The @ollama -…
Using LLM-as-a-Judge For Evaluation: A Complete Guide
Many of the recent advancements in large language models (LLMs) have been powered by human feedback, usually in the form of preference datasets. Think of preferences…
Look at and label your data, build and evaluate your LLM-evaluator, and optimize it against your labels.
“Theory of Mind” (ToM) is the ability to understand that others have their own thoughts and beliefs, even when they differ from ours — a skill…
Empowering conservation efforts through innovative technologies and global collaboration. A vessel captured by NASA’s Landsat 8. Skylight’s computer vision models leverage this imagery to identify suspicious behavior,…
We are proud to present the latest model ⚡️Yi-Lightning ⚡️ now #6 in the world, higher than the original GPT-4o released 5 months ago. Also humbled…
We're thrilled to unveil Yi-Lightning and Yi-Lightning-Lite, our latest proprietary models! Both are now accessible via API at https://platform.lingyiwanwu.com and featured in @lmarena_ai's Chatbot Arena (https://lmarena.ai/).…
Interim report on ongoing work on mechanistic anomaly detection
GPT-NeoX now supports post-training thanks to a collaboration with SynthLabs.
What's in the book and how we wrote it
A central goal of the OLMo project is to use our experience to contribute to an open science of LM pretraining to provide a foundation for…
An Architect model describes how to solve the coding problem, and an Editor model translates that into file edits. This Architect/Editor approach produces SOTA benchmark results.