Ayushi Pandey

Trinity College Dublin

Ayushi Pandey is a PhD student at Trinity College Dublin, working under the supervision of Professor Naomi Harte and Dr Sebastien Le Maguer. In her ongoing PhD thesis, she explores the segmental properties of synthetic speech and their relationship to perceived naturalness. She has previously worked on phonemic contrast, resource creation and forced alignment for Indian languages.

Diving into divisions: segmental evaluation of Text-to-Speech Synthesizers

Segmental properties of Text-to-Speech (TTS) synthesizers have been studied for their influence on various perceived attributes of synthetic speech. However, they have received very limited attention for modern, neural vocoder-based TTS. In this talk, we will argue that segmental evaluation of neural TTS synthesizers can prove much more diagnostic than conventional methods of TTS evaluation. Specifically, we will focus on acoustic-phonetic measurements of obstruent consonants. First, from a production perspective, we will see how these measurements differ between human and neural TTS voices. Then, we will explore whether human listeners are sensitive to these differences.