X / Twitter news · AI Feed

X · @demishassabis X / Twitter May 25, 2026

RT F.O.L.A: Google Omni might be too powerful 🫥

X · @soumithchintala X / Twitter May 24, 2026

RT Garry Tan: Thinking Machines is impressive. In a couple hours I just fine tuned my own Qwen3.5-397B model this afternoon. Fast usable multimodal is also going…

X · @rasbt X / Twitter May 23, 2026

Added a DeepSeek Sparse Attention (DSA) from-scratch implementation to my LLMs-from-scratch repo thanks to an awesome new reader contrib. With motivation, overview, and GPT-style model reference…

X · @AndrewYNg X / Twitter May 22, 2026

The new White House policy requiring green card applicants to apply from outside the US is a capricious attack on legal immigration. It will hurt families,…

X · @AndrewYNg X / Twitter May 22, 2026

Harvard University just voted to limit the number of A grades given in undergraduate classes to about 20% of the class. I’m not in favor of this. It …

X · @demishassabis X / Twitter May 22, 2026

RT Google DeepMind: Project Genie 🤝 @GoogleMaps Street View You can now take real U.S. places and transform them into new, interactive worlds. 🌍

X · @rasbt X / Twitter May 20, 2026

It's been *almost* a bit quiet around LLM architecture releases in the past two weeks 😅 Interesting tidbit is the parallel block design. Via the Cmd-A the…

X · @AndrewYNg X / Twitter May 20, 2026

New course: Build AI agents that generate images and videos -- an under-explored frontier. A key to performance is having the agent evaluate its own output,…

X · @miramurati X / Twitter May 19, 2026

Collaborative AI runs on interactivity: machines and people, working in real time, across every modality. Solving it takes a community, join us.
Thinking Machines: We are offering…

X · @lilianweng X / Twitter May 19, 2026

We would love to see more collaboration and research in the field of human-AI interactivity. Check it out!
Thinking Machines: We are offering grants of $100,000 +…

X · @karpathy X / Twitter May 19, 2026

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the…

X · @lilianweng X / Twitter May 18, 2026

I only recently read more about the concept of system accidents by Charles Perrow, very insightful and relatable.

X · @rasbt X / Twitter May 16, 2026

New article: a visual tour of recent LLM architecture advances, from Gemma 4 to DeepSeek V4. I focus on long-context efficiency tweaks like KV sharing, per-layer embeddings,…

X · @AndrewYNg X / Twitter May 14, 2026

New course: Transformers in Practice. You'll get a practical view of how transformer-based LLMs work, so you can reason about their behavior, diagnose problems like slow…

X · @soumithchintala X / Twitter May 13, 2026

more demos on Interaction Models collaboratively doing system design, reading papers, fact-checking with live generative UI
Seongsik Kim: 1. (System design) - The Interaction Models see your…

X · @rasbt X / Twitter May 13, 2026

A little talk on what we can learn from implementing LLM architectures from scratch in Python and PyTorch. And how I approach new open-weight models, compare…

X · @rasbt X / Twitter May 13, 2026

Interesting paper. What I like about this is that it is a relatively low-commitment attention modification. I.e., one can use it during most of training, switch back…

X · @soumithchintala X / Twitter May 12, 2026

Cluster magicians and GPU whisperers, come join us! We’re looking for supercomputing engineers to build the infrastructure behind real-time interactive models, Tinker, and large-scale training: scheduling, storage,…

X · @miramurati X / Twitter May 12, 2026

RT Soumith Chintala: Cluster magicians and GPU whisperers, come join us! We’re looking for supercomputing engineers to build the infrastructure behind real-time interactive models, Tinker, and large-scale training:…

X · @AndrewYNg X / Twitter May 12, 2026

There will be no AI jobpocalypse. The story that AI will lead to massive unemployment is stoking unnecessary fear. AI — like any other technology — does…

X · @lilianweng X / Twitter May 12, 2026

RT Seongsik Kim: Re 2. (Real time fact checking) – The Interaction Models hear you speak and fact-checks you in real time — like having a teammate wh…

X · @miramurati X / Twitter May 12, 2026

RT Seongsik Kim: Re 1. (System design) – The Interaction Models see your screen and collaborates with you live. Here we're building a scalable system architecture together…

X · @soumithchintala X / Twitter May 11, 2026

RT Scale Labs: Congrats to @thinkymachines on the release of TML-Interaction-Small and tying for the top spot on our Audio MC S2S leaderboard! 🥇 Their interaction model scores…

X · @soumithchintala X / Twitter May 11, 2026

RT Horace He: In modern ML accelerators, FLOPS have absolutely exploded. Often though, the bottleneck is not FLOPS but memory bandwidth. Similarly, model intelligence has exploded, causing…
