Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data

Yuma Shirahata, Byeongseon Park, Ryuichi Yamamoto, Kentaro Tachibana
Jun 16, 2024
Interspeech

Ryuichi Yamamoto
Engineer/Researcher
I am an engineer/researcher passionate about speech synthesis. I love to write code and enjoy open-source collaboration on GitHub. Please feel free to reach out on Twitter and GitHub.

Related
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection
Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning