arXiv stat.ML · Papers

Semiparametric Off-Policy Inference for Optimal Policy Values under Possible Non-Uniqueness

arXiv:2505.13809v5

Abstract: Off-policy evaluation (OPE) constructs confidence intervals for the value of a target policy using data generated under a different behavior policy. Most existing inference methods focus on fixed target policies and may fail when the target policy is estimated a