Infrastructure news

Pinecone Infrastructure April 15, 2026

Four New GA Features for Dedicated Read Nodes That Give Teams More Control and Observability

Together AI blog Infrastructure April 15, 2026

Parcae: Doing more with fewer parameters using stable looped models

Parcae is a stable looped language model that matches the quality of a Transformer twice its size — a 770M model reaching 1.3B-level performance. We introduce…

Weaviate Infrastructure April 15, 2026

Weaviate Shared Cloud now generally available on AWS

Weaviate Shared Cloud is now generally available on AWS in US East and Europe, giving teams a fully managed, AI-native database on the provider and region…

Pinecone Infrastructure April 14, 2026

Load Balancing AI Services for Availability and Speed

Together AI blog Infrastructure April 13, 2026

EinsteinArena: Harnessing the collective intelligence of agents in the wild to advance science

EinsteinArena is a platform where AI agents collaborate and compete on open math problems. AI agents on EinsteinArena have already set 11 new state-of-the-art results on…

Groq (via openrss) Infrastructure April 9, 2026

Canopy Labs' Orpheus TTS is live on GroqCloud

Together AI blog Infrastructure April 7, 2026

What is an AI Native Cloud?

AI-native companies need infrastructure built for models, not legacy workloads. Learn what defines an AI Native Cloud and why it matters for the next platform shift.

Windsurf (Codeium) Infrastructure April 6, 2026

Introducing Adaptive: a smarter way to use Windsurf

We're launching multiple updates to Windsurf today: an Adaptive model router, a redesigned model picker with pricing context, and the removal of daily limits for Max.

Together AI blog Infrastructure April 3, 2026

AI for Systems: Using LLMs to Optimize Database Query Execution

New research shows LLMs can optimize database query execution plans—achieving up to 4.78x speedups by correcting the cardinality estimation errors that statistical heuristics miss.

Together AI blog Infrastructure April 3, 2026

Wan 2.7 video model suite now available on Together AI

A four-model video suite for generation, continuation, reference-driven workflows, and editing, rolling out on Together AI starting with text-to-video.

Weaviate Infrastructure April 2, 2026

Oh Memories, Where'd You Go

Two weeks of dogfooding Engram, Weaviate's memory product, in daily Claude Code sessions. This surfaced where a dedicated memory product adds value, and the specific mechanics…

Pinecone Infrastructure April 2, 2026

Pinecone Assistant: A Managed Knowledge Layer for Production AI Applications

Together AI blog Infrastructure April 2, 2026

Deepgram speech-to-text and voice models now available natively on Together AI

Production STT and TTS from Deepgram, available on Together AI Dedicated Model Inference for real-time voice agents.

Cursor Infrastructure April 2, 2026

Meet the new Cursor

Cursor 3 is a unified workspace for building software with agents.

Weaviate Infrastructure April 1, 2026

Multimodal Embeddings and RAG: A Practical Guide

Multimodal embeddings allow AI systems to search and reason across text, images, audio, and video in their native formats. This blog covers the key intuitions behind…

Together AI blog Infrastructure April 1, 2026

Inside the Together AI kernels team

The team behind FlashAttention and ThunderKittens — how Together AI's kernel researchers close the gap between GPU hardware and production AI.

Weaviate Infrastructure March 31, 2026

Your Code is Your Schema: Weaviate Managed C# Client

Use semantic search and RAG in C# with the Weaviate Managed .NET client — attribute-driven schema, type-safe queries, and safe migrations, all in idiomatic .NET.

Ollama (via openrss) Infrastructure March 30, 2026

Ollama is now powered by MLX on Apple Silicon in preview

Today, we're previewing the fastest way to run Ollama on Apple silicon, powered by MLX, Apple's machine learning framework.

NVIDIA Nemotron Infrastructure March 25, 2026

The Future of AI Is Open and Proprietary

AI is the defining technology of our time, quickly becoming core business infrastructure. It’s fueled by a diverse ecosystem of models: large and small, open and…

Weaviate Infrastructure March 19, 2026

Securing Enterprise AI with Weaviate

A complete guide on how to secure Weaviate enterprise deployments with OIDC, RBAC, and multi-tenant isolation.

Windsurf (Codeium) Infrastructure March 18, 2026

Introducing our new Windsurf pricing plans

We're simplifying Windsurf pricing across Free, Pro, and Teams alongside launching a new Max plan for our power users. The new plans replace the current credit-based…

NVIDIA Nemotron Infrastructure March 11, 2026

New NVIDIA Nemotron 3 Super Delivers 5x Higher Throughput for Agentic AI

Launched today, NVIDIA Nemotron 3 Super is a 120‑billion‑parameter open model with 12 billion active parameters designed to run complex agentic AI systems at scale. Available…

Windsurf (Codeium) Infrastructure March 5, 2026

GPT-5.4 is now available in Windsurf

GPT-5.4 is now available in Windsurf with multiple reasoning effort levels. For a limited time, self-serve users enjoy promotional pricing starting at 1x credits.

Pinecone Infrastructure March 5, 2026

Garbage Day: How Pinecone Safely Deletes Billions of Objects at Scale

Infrastructure 335 stories