arXiv stat.ML
Semiparametric Off-Policy Inference for Optimal Policy Values under Possible Non-Uniqueness
arXiv:2505.13809v5 Announce Type: replace-cross
Abstract: Off-policy evaluation (OPE) constructs confidence intervals for the value of a target policy using data generated under a different behavior policy. Most existing inference methods focus on fixed target policies and may fail when the target policy is estimated a