News from arXiv cs.AI

arXiv cs.AI Papers 15 hr ago

Beyond Trajectory Imitation: Strategy-Guided Policy Optimization for LLM Reasoning

arXiv:2606.24064v1 Announce Type: new Abstract: Distilling reasoning capabilities from strong to weak language models typically involves imitating specific solution trajectories, effectively transferring what to answer rather…

arXiv cs.AI Papers 15 hr ago

QSignAI: Quantum-Randomness-Seeded Identity Signatures at the Intersection of AI for Science and Science for AI

arXiv:2605.27729v3 Announce Type: replace-cross Abstract: The 2024-2025 Nobel and Turing awards recognised AI and quantum science simultaneously. Yet no deployed system has brought these streams together…

arXiv cs.AI Papers 15 hr ago

Variational Model Merging for Pareto Front Estimation in Multitask Finetuning

arXiv:2412.08147v2 Announce Type: replace-cross Abstract: Pareto fronts are useful to find good task-mixing strategies for multitask finetuning, but they are also costly to compute. To reduce…

arXiv cs.AI Papers 15 hr ago

Exploring Academic Influence of Algorithms by Co-occurrence Network Based on Full-text of Academic Papers

arXiv:2606.24099v1 Announce Type: new Abstract: Algorithms have become central to scientific research in the era of artificial intelligence (AI). Although algorithm mentions in papers are often…

arXiv cs.AI Papers 15 hr ago

Breaking the Mirror: Activation-Based Mitigation of Self-Preference in LLM Evaluators

arXiv:2509.03647v2 Announce Type: replace-cross Abstract: Large language models (LLMs) increasingly serve as automated evaluators, yet they suffer from "self-preference bias": a tendency to favor their own…

arXiv cs.AI Papers 15 hr ago

SURGELLM: Rethinking Multi-Task Evaluation through Task-Aware Feature Gating with Class-Balanced Normalization

arXiv:2606.24259v1 Announce Type: cross Abstract: Fine-tuned encoders deployed across heterogeneous NLP tasks face three compounding problems: mismatched inductive biases, class-imbalance corruption of feature statistics, and no…

arXiv cs.AI Papers 15 hr ago

EG-VQA: Benchmarking Verifiable Video Question Answering with Grounded Temporal Evidence

arXiv:2606.24797v1 Announce Type: cross Abstract: Recent advances in Video Large Language Models (Video-LLMs) have yielded promising performance on video question answering (VideoQA). Nevertheless, existing benchmarks are…

arXiv cs.AI Papers 15 hr ago

ReMMD: Realistic Multilingual Multi-Image Agentic Verification for Multimodal Misinformation Detection

arXiv:2606.24112v1 Announce Type: new Abstract: Multimodal misinformation detection is increasingly important because viral posts now combine long multilingual narratives, several images, mixed provenance, and subtle text-image…

arXiv cs.AI Papers 15 hr ago

Subjective-Graph LLM Agents for Simulating Uncertainty in Classroom Social Perception

arXiv:2603.20750v2 Announce Type: replace Abstract: Social actors do not observe a common social world: each individual forms judgments from a partial and potentially distorted view of…

arXiv cs.AI Papers 15 hr ago

The African Language Tax: Quantifying the Cost, Latency, and Context Penalty of Tokenizing African Languages in Frontier LLMs

arXiv:2606.24460v1 Announce Type: cross Abstract: Commercial large language models bill, scale latency, and budget context per token. Yet tokenizers assign more subword tokens to the same…
