Active learning for the prediction of prosodic phrase

Prosodic structure contributes to both speech production and comprehension. One of the crucial problems in achieving natural-sounding synthesized speech is predicting appropriate phrase boundaries. Unfortunately, obtaining human annotations of prosodic phrases to train a supervised system is laborious and costly. Active learning has proven effective in reducing the labeling effort required for supervised learning.

This study explores active learning techniques with the aim of reducing the amount of human-annotated data needed to attain a given level of performance. It presents an active learning approach to predicting prosodic phrase boundaries in unrestricted Chinese text. Experiments show that, in most of the cases considered, the active selection strategies for labeling prosodic phrase boundaries perform as well as or better than random data selection.
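
The contrast between active and random selection can be illustrated with a minimal sketch. The abstract does not say which selection strategies the study uses, so the least-confidence (uncertainty sampling) rule below is an assumption chosen for illustration, and the function names and probability values are hypothetical.

```python
import random


def least_confident(boundary_probs, k):
    """Return indices of the k pool items whose predicted boundary
    probability is closest to 0.5 (binary uncertainty sampling).

    Assumption: each item is a candidate word juncture and the current
    model outputs P(boundary) for it.
    """
    ranked = sorted(range(len(boundary_probs)),
                    key=lambda i: abs(boundary_probs[i] - 0.5))
    return ranked[:k]


def random_selection(pool_size, k, seed=0):
    """Baseline: return k distinct pool indices drawn uniformly at random."""
    return random.Random(seed).sample(range(pool_size), k)


# Hypothetical model outputs for six unlabeled junctures (illustrative only).
pool_probs = [0.97, 0.52, 0.08, 0.45, 0.81, 0.30]

# Active selection sends the most ambiguous junctures to the annotator.
to_annotate = least_confident(pool_probs, k=2)
print(to_annotate)  # [1, 3]

# Random selection ignores the model's confidence entirely.
baseline = random_selection(len(pool_probs), k=2)
```

In a full loop, the selected items would be labeled by a human, added to the training set, and the model retrained before the next round of selection.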
