Windsurf Queued Messages Release
Windsurf now has Queued Messages!
Cache your compiled models for faster boot and inference times
Introducing the Next Generation of Compound on GroqCloud
DeepWiki, Vibe and Replace, Dev Containers, and more!
Use our MCP to discover, compare, and run models from apps like Claude, Cursor, and VS Code.
Ollama partners with OpenAI to bring gpt-oss to Ollama and its community.
You'll be surprised what you can do with AI video now.
Wan 2.2 is our fastest, cheapest video model.
Ollama's new app is now available for macOS and Windows.
We compare the best image models for generating consistent characters from a single reference image.
We've partnered with Bria to bring a suite of commercial-grade image generation and editing models to Replicate. Built entirely on licensed data, Bria’s tools are designed…
A deep-dive into the Taylor Seer optimization technique
It's hard keeping up with every new video model. In this post we'll help you pick the best one for your needs.
We hosted a hackathon with BFL for FLUX.1 Kontext. Here are the winners.
Learn expert prompting techniques to create stunning videos with Google's Veo 3.
We're sharing our experiments and tips on Google's new Veo 3 model.
Secure Minions is a secure protocol built by Stanford's Hazy Research lab to allow encrypted local-remote communication.
LoRA Fine-Tune Support Now Live on GroqCloud
Ollama now has the ability to enable or disable thinking. This gives users the flexibility to choose the model’s thinking behavior for different applications and use…
From Speed to Scale: How Groq Is Optimized for MoE & Other Large Models
How to Build Your Own AI Research Agent with One Groq API Call
Benchmark results for Qwen3 models using the Aider polyglot coding benchmark.
The $6.32 benchmark cost reported for Gemini 2.5 Pro Preview 03-25 was incorrect.
Official Llama API Now Fastest via Groq Inference