arXiv cs.AI
How Do Tool-Augmented LLM Agents Perform on Real-World Energy Analytics Tasks?
arXiv:2606.26346v1 Announce Type: new
Abstract: Agentic benchmarks have emerged across general-purpose and domain-specific settings, including finance, coding, law, and drug discovery, yet energy-domain evaluations remain largely limited to static knowledge recall. This is a critical gap for a sector that requires live