RT F.O.L.A: Google Omni might be too powerful 🫥
RT Garry Tan: Thinking Machines is impressive. In a couple hours I just fine tuned my own Qwen3.5-397B model this afternoon. Fast usable multimodal is also going…
Added a DeepSeek Sparse Attention (DSA) from-scratch implementation to my LLMs-from-scratch repo thanks to an awesome new reader contrib. With motivation, overview, and GPT-style model reference…
The new White House policy requiring green card applicants to apply from outside the US is a capricious attack on legal immigration. It will hurt families,…
Harvard University just voted to limit the number of A grades given in undergraduate classes to about 20% of the class. I’m not in favor of…
RT Google DeepMind: Project Genie 🤝 @GoogleMaps Street View. You can now take real U.S. places and transform them into new, interactive worlds. 🌍
It's been *almost* a bit quiet around LLM architecture releases in the past two weeks 😅 Interesting tidbit is the parallel block design. Via the Cmd-A the…
New course: Build AI agents that generate images and videos -- an under-explored frontier. A key to performance is having the agent evaluate its own output,…
Collaborative AI runs on interactivity: machines and people, working in real time, across every modality. Solving it takes a community, join us. Thinking Machines: We are offering…
We would love to see more collaboration and research in the field of human-AI interactivity. Check it out! Thinking Machines: We are offering grants of $100,000 +…
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join…
I only recently read more about the concept of system accidents by Charles Perrow, very insightful and relatable.
New article: a visual tour of recent LLM architecture advances, from Gemma 4 to DeepSeek V4. I focus on long-context efficiency tweaks like KV sharing, per-layer embeddings,…
New course: Transformers in Practice. You'll get a practical view of how transformer-based LLMs work, so you can reason about their behavior, diagnose problems like slow…
More demos of Interaction Models collaboratively doing system design, reading papers, and fact-checking with live generative UI. Seongsik Kim: 1. (System design) - The Interaction Models see your…
A little talk on what we can learn from implementing LLM architectures from scratch in Python and PyTorch. And how I approach new open-weight models, compare…
Interesting paper. What I like about this is that it is a relatively low-commitment attention modification. I.e., one can use it during most of training, switch back…
Cluster magicians and GPU whisperers, come join us! We’re looking for supercomputing engineers to build the infrastructure behind real-time interactive models, Tinker, and large-scale training: scheduling, storage,…
RT Soumith Chintala: Cluster magicians and GPU whisperers, come join us! We’re looking for supercomputing engineers to build the infrastructure behind real-time interactive models, Tinker, and large-scale training:…
There will be no AI jobpocalypse. The story that AI will lead to massive unemployment is stoking unnecessary fear. AI, like any other technology, does…
RT Seongsik Kim: Re 2. (Real time fact checking) - The Interaction Models hear you speak and fact-check you in real time, like having a teammate…
RT Seongsik Kim: Re 1. (System design) - The Interaction Models see your screen and collaborate with you live. Here we're building a scalable system architecture together…
RT Scale Labs: Congrats to @thinkymachines on the release of TML-Interaction-Small and tying for the top spot on our Audio MC S2S leaderboard! 🥇 Their interaction model scores…
RT Horace He: In modern ML accelerators, FLOPS have absolutely exploded. Often though, the bottleneck is not FLOPS but memory bandwidth. Similarly, model intelligence has exploded, causing…