I don't like the results so now the benchmark sucks. LONGTERMONLY: @teortaxesTex You don't like the results so now the benchmark sucks?
I got one of these refurbished units last year. I have nothing but good things to say about it. It works great. It has heft to…
Every second of a professional basketball game now generates more than 20,000 data...
Article URL: https://www.cwu.org/press_release/wikipedia-workers-to-seek-union-recognition/ Comments URL: https://news.ycombinator.com/item?id=48663861 Points: 21 # Comments: 3
In this post, we walk through how Huntington built a scalable AWS solution to detect and redact Personally Identifiable Information (PII) and Payment Card Industry (PCI)…
Article URL: https://www.ycombinator.com/library/SF-how-to-get-your-first-10-customers Comments URL: https://news.ycombinator.com/item?id=48663819 Points: 55 # Comments: 16
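Do the Japanese stereotype Chinese beauties as "tall, small boobs, long legs"? And they're saying this as if it's something bad? Miss Todai 2026 No. 5 曾庭榕: Good evening 🌙 Here's me on a break from an experiment, and me playing golf ⛳️ How do you all usually relieve stress? 😊 When I have time, I go golfing with friends and recharge out in nature 🍃 But I'm just not getting any better… 😂 #東京大学 #ミスコン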
This is happening throughout the West. Stephen Miller: Change the voters, change the country.
In this post, you will learn how to build a voice agent that handles appointment reminder conversations using Amazon Nova 2 Sonic and Amazon Bedrock AgentCore.…
In this post, you will learn how to build an end-to-end integration between Snowflake semantic views and Amazon Quick. The sample data is user review data…
At the end of my last post, I presented an idea: what if I used the core of my last project, the cumulative matrix product, and…
GLM 5.2 is the best Chinese model on ARC-AGI-2, at 22.8% (is that high or max?), on par with Opus 4.5 (16K). …Whereas Grok 4.20 is…
RT Jo Kristian Bergum: I’m headed to San Francisco for AI Engineer World's Fair next week. Two things. I'm giving a talk Tuesday, June 30: "The unreasonable…
In the last few months, I've started to see [job applications] that were clearly cowritten by an LLM, link to an LLM-generated portfolio site, which then…
RT Derya Unutmaz, MD: I’m very excited about this article from @OpenAI on my attempt to use GPT-5 Pro to understand the results of an experiment we…
Big improvements to GPT-5.5 Instant, including being much more fun to talk to. Give it a try: OpenAI: We have a new version of GPT-5.5 Instant for…
Zyphra fits a scaling law for plasticity loss in continuously trained LLMs. What can we do to push the point of rigidity onset towards infinity? I…
Article URL: https://posthog.com/blog/sql-parser Comments URL: https://news.ycombinator.com/item?id=48663544 Points: 18 # Comments: 1
A much needed data release! Excited to tinker with the data. Richard Zhuang: How can we train small agentic models that are highly capable of terminal use…
We have a new version of GPT-5.5 Instant for you, and it's much more fun to talk to. Our most-used model is now better at understanding the…
Agentic data engineering is changing how pipelines are built. Daikin Applied Americas...
So Microsoft gives you GPT-4 for free in Copilot. They just don't give you an API for it. So I made one. It logs into your…
one of the two must stay in the locker: either Fable, or Dario. Dario is a weirdo, so Fable gets to walk. Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞): Life…
RT clem 🤗: Kog open-sourced on @huggingface the 2B model that they used to show a model running at 3,000+ tokens per second. Very cool work! https://huggingface.co/blog/kogai/kog-laneformer-2b-the-latency-first-model