Interspeech

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

Accepted to Interspeech 2024.

Masaya Kawamura, Ryuichi Yamamoto, Yuma Shirahata, Takuya Hasumi, Kentaro Tachibana

CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection

Accepted to Interspeech 2024.

Yongyi Zang, Jiatong Shi, You Zhang, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Shengyuan Xu, Wenxiao Zhao, Jing Guo, Tomoki Toda, Zhiyao Duan

Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment

Accepted to Interspeech 2024.

Takuto Igarashi, Yuki Saito, Kentaro Seki, Shinnosuke Takamichi, Ryuichi Yamamoto, Kentaro Tachibana, Hiroshi Saruwatari

SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark

Accepted to Interspeech 2024.

Yuki Saito, Takuto Igarashi, Kentaro Seki, Shinnosuke Takamichi, Ryuichi Yamamoto, Kentaro Tachibana, Hiroshi Saruwatari

Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data

Accepted to Interspeech 2024.

Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana

DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning

Accepted to Interspeech 2022.

Takaaki Saeki, Kentaro Tachibana, Ryuichi Yamamoto

TTS-by-TTS 2: Data-selective Augmentation for Neural Speech Synthesis Using Ranking Support Vector Machine with Variational Autoencoder

Accepted to Interspeech 2022.

Eunwoo Song, Ryuichi Yamamoto, Ohsung Kwon, Chan-Ho Song, Min-Jae Hwang, Suhyeon Oh, Hyun-Wook Yoon, Jin-Seob Kim, Jae-Min Kim

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

Accepted to Interspeech 2022.

Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song, Yuma Shirahata, Hyun-Wook Yoon, Jae-Min Kim, Kentaro Tachibana

A Unified Accent Estimation Method Based on Multi-Task Learning for Japanese Text-to-Speech

Accepted to Interspeech 2022.

Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana

Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems

Accepted to Interspeech 2022.

Hyun-Wook Yoon, Ohsung Kwon, Hoyeon Lee, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim, Min-Jae Hwang