A pitch-synchronous speech analysis and synthesis method for DNN-SPSS system

Jin Seob Kim, Young Sun Joo, Hong Goo Kang, Inseon Jang, Chunghyun Ahn, Jeongil Seo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

This paper proposes a pitch-synchronous deep neural network (DNN)-based statistical parametric speech synthesis (SPSS) system. Speech parameters are extracted from pitch-synchronous frames defined by the locations of glottal closure instants (GCIs), which significantly reduces coupling effects between the vocal tract and excitation signals. As a result, the distribution of spectral parameters within the same context of phonetic classes becomes more uniform, which improves model trainability, especially for a small-scale DNN framework. Although the effectiveness of the pitch-synchronous approach has been proven in other applications, it is not trivial to integrate the method into typical DNN-based SPSS systems, which have regularized structures, i.e., a fixed frame rate and a fixed feature dimension. In this paper, we design a new DNN-based SPSS system that trains and generates speech parameters pitch-synchronously. Objective and subjective test results verify the superiority of the proposed system over the conventional approach.
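To make the idea of GCI-defined frames concrete, the following is a minimal sketch of pitch-synchronous framing, not the authors' implementation. It assumes the GCI sample indices are already available from some external detector; the function name `pitch_synchronous_frames`, the two-period frame span, and the Hann window are illustrative choices that the abstract does not specify.

```python
import math


def pitch_synchronous_frames(signal, gcis):
    """Cut one two-period frame around each interior GCI.

    signal: list of float samples.
    gcis:   strictly increasing sample indices of glottal closure instants
            (assumed given; GCI detection itself is not shown here).
    Returns a list of (center_index, windowed_frame) pairs.
    """
    frames = []
    for prev, center, nxt in zip(gcis, gcis[1:], gcis[2:]):
        # The frame runs from the previous GCI to the next one.
        segment = signal[prev:nxt + 1]
        n = len(segment)
        # Hann window that tapers to zero at the two neighbouring GCIs.
        # (Simplification: its peak is at the segment midpoint, which is
        # only the centre GCI when both pitch periods are equal.)
        window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1))
                  for i in range(n)]
        frames.append((center, [s * w for s, w in zip(segment, window)]))
    return frames


# Toy input: a sine wave with hand-placed GCI indices whose spacing changes
# from 80 to 64 samples (100 Hz to 125 Hz at an 8 kHz sampling rate).
fs = 8000
gcis = [0, 80, 160, 240, 304, 368, 432]
signal = [math.sin(2 * math.pi * 100 * t / fs) for t in range(433)]

frames = pitch_synchronous_frames(signal, gcis)
frame_lengths = [len(frame) for _, frame in frames]
```

In this toy input the frame length follows the local pitch period (161, 161, 145, 129, 129 samples), so neither the frame rate nor the frame length is constant. That variability is the mismatch with fixed-rate, fixed-dimension DNN-based SPSS pipelines that the abstract refers to.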

Original language: English
Title of host publication: Proceedings - 2016 IEEE International Conference on Digital Signal Processing, DSP 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 408-411
Number of pages: 4
ISBN (Electronic): 9781509041657
Publication status: Published - 2016 Jul 2
Event: 2016 IEEE International Conference on Digital Signal Processing, DSP 2016 - Beijing, China
Duration: 2016 Oct 16 to 2016 Oct 18

Publication series

Name: International Conference on Digital Signal Processing, DSP
Volume: 0

Other

Other: 2016 IEEE International Conference on Digital Signal Processing, DSP 2016
Country/Territory: China
City: Beijing
Period: 16/10/16 to 16/10/18

All Science Journal Classification (ASJC) codes

  • Signal Processing
