Four New GA Features for Dedicated Read Nodes That Give Teams More Control and Observability
Parcae is a stable looped language model that matches the quality of a Transformer twice its size — a 770M model reaching 1.3B-level performance. We introduce…
Weaviate Shared Cloud is now generally available on AWS in US East and Europe, giving teams a fully managed, AI-native database on the provider and region…
Load Balancing AI Services for Availability and Speed
EinsteinArena is a platform where AI agents collaborate and compete on open math problems. AI agents on EinsteinArena have already set 11 new state-of-the-art results on…
AI-native companies need infrastructure built for models, not legacy workloads. Learn what defines an AI Native Cloud and why it matters for the next platform shift.
We're launching multiple updates to Windsurf today: an Adaptive model router, a redesigned model picker with pricing context, and the removal of daily limits for Max.
New research shows LLMs can optimize database query execution plans—achieving up to 4.78x speedups by correcting the cardinality estimation errors that statistical heuristics miss.
A four-model video suite for generation, continuation, reference-driven workflows, and editing, rolling out on Together AI starting with text-to-video.
Two weeks of dogfooding Engram, Weaviate's memory product, in daily Claude Code sessions. This surfaced where a dedicated memory product adds value, and the specific mechanics…
Pinecone Assistant: A Managed Knowledge Layer for Production AI Applications
Production STT and TTS from Deepgram, available on Together AI Dedicated Model Inference for real-time voice agents.
Cursor 3 is a unified workspace for building software with agents.
Multimodal embeddings allow AI systems to search and reason across text, images, audio, and video in their native formats. This blog covers the key intuitions behind…
The team behind FlashAttention and ThunderKittens — how Together AI's kernel researchers close the gap between GPU hardware and production AI.
Use semantic search and RAG in C# with the Weaviate Managed .NET client — attribute-driven schema, type-safe queries, and safe migrations, all in idiomatic .NET.
Today, we're previewing the fastest way to run Ollama on Apple silicon, powered by MLX, Apple's machine learning framework.
AI is the defining technology of our time, quickly becoming core business infrastructure. It’s fueled by a diverse ecosystem of models: large and small, open and…
A complete guide on how to secure Weaviate enterprise deployments with OIDC, RBAC, and multi-tenant isolation.
We're simplifying Windsurf pricing across Free, Pro, and Teams alongside launching a new Max plan for our power users. The new plans replace the current credit-based…
Launched today, NVIDIA Nemotron 3 Super is a 120‑billion‑parameter open model with 12 billion active parameters designed to run complex agentic AI systems at scale. Available…
GPT-5.4 is now available in Windsurf with multiple reasoning effort levels. For a limited time, self-serve users enjoy promotional pricing starting at 1x credits.
Garbage Day: How Pinecone Safely Deletes Billions of Objects at Scale