HF Daily Papers
Speaker Identity in Non-Verbal Vocalizations: Conditional Distillation and Mixture of Experts Approach
As expressive text-to-speech (TTS) and voice conversion (VC) systems increasingly generate non-verbal vocalizations (NVVs) to enhance naturalness, reliable speaker verification (SV) becomes essential to objectively assess identity consistency across both verbal and non-verbal segments. Yet current S