Infrastructure news

Windsurf (Codeium) Infrastructure September 16, 2025

Windsurf Queued Messages Release

Windsurf now has Queued Messages!

Replicate Infrastructure September 8, 2025

Torch compile caching for inference speed

Cache your compiled models for faster boot and inference times

Groq Infrastructure September 4, 2025

Introducing the Next Generation of Compound on GroqCloud

Windsurf (Codeium) Infrastructure August 14, 2025

Windsurf Wave 12: Devin features in Windsurf

DeepWiki, Vibe and Replace, Dev Containers, and more!

Replicate Infrastructure August 10, 2025

Announcing Replicate's remote MCP server

Use our MCP to discover, compare, and run models from apps like Claude, Cursor, and VS Code.

Ollama Infrastructure August 5, 2025

OpenAI gpt-oss

Ollama partners with OpenAI to bring gpt-oss to Ollama and its community.

Replicate Infrastructure August 1, 2025

How to prompt Veo 3 with images

You'll be surprised what you can do with AI video now.

Replicate Infrastructure July 31, 2025

Open source video is back

Wan 2.2 is our fastest, cheapest video model.

Ollama Infrastructure July 30, 2025

Ollama's new app

Ollama's new app is now available for macOS and Windows.

Replicate Infrastructure July 21, 2025

Generate consistent characters

We compare the best image models for generating consistent characters from a single reference image.

Replicate Infrastructure July 17, 2025

Bria is now on Replicate

We've partnered with Bria to bring a suite of commercial-grade image generation and editing models to Replicate. Built entirely on licensed data, Bria’s tools are designed…

Replicate Infrastructure July 15, 2025

How we optimized FLUX.1 Kontext [dev]

A deep-dive into the Taylor Seer optimization technique

Replicate Infrastructure July 7, 2025

Compare AI video models

It's hard keeping up with every new video model. In this post we'll help you pick the best one for your needs.

Replicate Infrastructure July 1, 2025

The FLUX.1 Kontext hackathon

We hosted a hackathon with BFL for FLUX.1 Kontext. Here were the winners.

Replicate Infrastructure June 10, 2025

How to prompt Veo 3 for the best results

Learn expert prompting techniques to create stunning videos with Google's Veo 3.

Replicate Infrastructure June 5, 2025

Get the most from Google Veo 3

We're sharing our experiments and tips on Google's new Veo 3 model.

Ollama Infrastructure June 3, 2025

Secure Minions: private collaboration between Ollama and frontier models

Secure Minions is a secure protocol built by Stanford's Hazy Research lab to allow encrypted local-remote communication.

Groq Infrastructure June 3, 2025

LoRA Fine-Tune Support Now Live on GroqCloud

Ollama Infrastructure May 30, 2025

Thinking

Ollama now has the ability to enable or disable thinking. This gives users the flexibility to choose the model’s thinking behavior for different applications and use…

Groq Infrastructure May 27, 2025

From Speed to Scale: How Groq Is Optimized for MoE & Other Large Models

Groq Infrastructure May 16, 2025

How to Build Your Own AI Research Agent with One Groq API Call

Aider Infrastructure May 8, 2025

Qwen3 benchmark results

Benchmark results for Qwen3 models using the Aider polyglot coding benchmark.

Aider Infrastructure May 7, 2025

Gemini 2.5 Pro Preview 03-25 benchmark cost

The $6.32 benchmark cost reported for Gemini 2.5 Pro Preview 03-25 was incorrect.

Groq Infrastructure April 29, 2025

Official Llama API Now Fastest via Groq Inference
