AI Feed

HF Daily Papers Papers 10 hr ago

Semantic Browsing: Controllable Diversity for Image Generation

Modern text-to-image models excel in visual fidelity and prompt adherence. However, this strict adherence comes at the cost of diversity: generated samples tend to collapse into…

HF Daily Papers Papers 10 hr ago

AGORA: An Archive-Grounded Benchmark for Agentic Workplace Document Reasoning

Large language models are increasingly deployed as agents that reason over documents rather than answer from parametric knowledge. We study archive-grounded reasoning: locating sparse evidence across…

r/LocalLLaMA Communities 10 hr ago

OpenAI and Broadcom unveil LLM-optimized inference chip

https://openai.com/index/openai-broadcom-jalapeno-inference-chip/ Quoted from the start of the blog post: Early testing shows that the first-generation accelerator will deliver performance per watt substantially better than current state-of-the-art…

HF Daily Papers Papers 10 hr ago

ChartWalker: Benchmarking the Cross-Chart RAG Task

Cross-Chart Retrieval-Augmented Generation (RAG) is critical for complex multi-modal analytical tasks in scientific, business, and political domains. However, existing benchmarks either focus on tables, which are…

X · @ylecun X / Twitter 10 hr ago

RT Randall Balestriero: It's a bird, it's a plane, it's a JEPA! Congrats on that great work that brought SIGReg and JEPAs to the sky–in the real worl…

HF Daily Papers Papers 10 hr ago

QG-MIL: A Gated Transformer Aggregator for Domain-Agnostic Multiple Instance Learning in Medical Imaging

Attention-based Multiple Instance Learning aggregators in medical imaging are prone to attention concentration, producing overconfident and unstable predictions. We introduce QG-MIL, a gated transformer aggregator that…

HF Daily Papers Papers 10 hr ago

EventVLA: Event-Driven Visual Evidence Memory for Long-Horizon Vision-Language-Action Policies

Memory remains a critical bottleneck for long-horizon robotic manipulation, as standard Vision-Language-Action (VLA) policies often fail when task-relevant cues become occluded or unobservable over time. While…

r/LocalLLaMA Communities 10 hr ago

The Swiss Federal Supreme Court is evaluating Heretic

“Oh no, are they banning abliterated models now?!?” If that was your first thought when you read the title I can’t blame you. But that’s actually…

HF Daily Papers Papers 10 hr ago

LingxiDiagBench: A Multi-Agent Framework for Benchmarking LLMs in Chinese Psychiatric Consultation and Diagnosis

Mental disorders are highly prevalent worldwide, but the shortage of psychiatrists and the inherent subjectivity of interview-based diagnosis create substantial barriers to timely and consistent mental-health…

HF Daily Papers Papers 10 hr ago

FLAT: Feedforward Latent Triangle Splatting for Geometrically Accurate Scene Generation

Generating explorable 3D scenes from a single image requires strong generative priors and accurate geometric representations suitable for downstream use. Current video diffusion models offer high-quality…

HF Daily Papers Papers 10 hr ago

AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction

AI agents are driving a new software paradigm, with the ability to autonomously call tools, extract information, manage memory, and complete tasks that span applications and…

HF Daily Papers Papers 10 hr ago

Escaping the Self-Confirmation Trap: An Execute-Distill-Verify Paradigm for Agentic Experience Learning

Experience-driven self-evolution is critical for large language model (LLM) agents to improve through open-world interaction. However, existing experience learning methods mostly rely on single-agent loops, where…

HF Daily Papers Papers 10 hr ago

FlowR2A: Learning Reward-to-Action Distribution for Multimodal Driving Planning

Multimodal driving planning faces a long-standing tension between two paradigms: scoring-based methods benefit from dense reward supervision but are confined to a fixed action vocabulary, while…

Hacker News (front page) Communities 10 hr ago

Show HN: Nub – A Bun-like all-in-one toolkit for Node.js

Article URL: https://github.com/nubjs/nub Comments URL: https://news.ycombinator.com/item?id=48660267 Points: 11 # Comments: 2

HF Daily Papers Papers 10 hr ago

An Efficient Method for the Optimal Control of Microgrids Under Uncertainties using Local Reduction

The problem of optimal sizing and power scheduling in microgrids subject to uncertainties is well known to the control community. Commonly, the optimal control problem is…

HF Daily Papers Papers 10 hr ago

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

We introduce NatureBench, a cross-discipline benchmark of 90 tasks distilled from peer-reviewed Nature-family publications, designed to evaluate whether AI coding agents can move beyond reproduction toward…

HF Daily Papers Papers 10 hr ago

DREAM: Dense Retrieval Embeddings via Autoregressive Modeling

Dense retrieval embedding models are a fundamental component of modern retrieval-based AI systems. Most dense retrievers are trained with contrastive objectives, which require labeled positive and…

HF Daily Papers Papers 10 hr ago

FedOT: Ownership Verification and Leakage Tracing via Watermarks for Federated LDMs

Training Latent Diffusion Models (LDMs) within Federated Learning (FL) has attracted increasing attention due to its ability to combine the powerful generative capacity of LDMs with…

HF Daily Papers Papers 10 hr ago

ReMMD: Realistic Multilingual Multi-Image Agentic Verification for Multimodal Misinformation Detection

Multimodal misinformation detection is increasingly important because viral posts now combine long multilingual narratives, several images, mixed provenance, and subtle text-image framing errors. Existing benchmarks and…

HF Daily Papers Papers 10 hr ago

Holistic Data Scheduler for LLM Pre-training via Multi-Objective Reinforcement Learning

The composition of training data, governed by the diversity of sources and their mixing strategy, is a cornerstone of Large Language Model (LLM) pre-training. Online Data…

HF Daily Papers Papers 10 hr ago

Are Text-to-Image Models Inductivist Turkeys? A Counterfactual Benchmark for Causal Reasoning

Text-to-image (T2I) generation models have achieved remarkable progress in producing visually realistic images from natural language prompts. Yet it remains unclear whether their success reflects genuine…

X · @teortaxesTex X / Twitter 10 hr ago

Just realized that this is GRPO-brained, generally ORM-brained dense process reward signal, in theory, would let you progress even if you do not have "positive trajectories". Of…

HF Daily Papers Papers 10 hr ago

World Value Models for Robotic Manipulation

Generalist value models play a pivotal role in scaling robotic policy learning from large-scale, mixed-quality data. Mathematically, accurate value estimation demands deep temporal understanding, requiring models…

HF Daily Papers Papers 11 hr ago

Qwen-AgentWorld: Language World Models for General Agents

A world model predicts environment dynamics based on current observations and actions, serving as a core cognitive mechanism for reasoning and planning. In this work, we…
