LESS IS MORE
Deep Learning
Improved Parallel WaveGAN with perceptually weighted spectrogram loss
Preprint: arXiv:2101.07412 (accepted to SLT 2021)
Eunwoo Song, Ryuichi Yamamoto, Min-Jae Hwang, Jin-Seob Kim, Ohsung Kwon, Jae-Min Kim
TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-Quality Speech Synthesis
Preprint: arXiv:2010.13421 (accepted to ICASSP 2021)
Min-Jae Hwang, Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim
Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators
Preprint: arXiv:2010.14151 (accepted to ICASSP 2021)
Ryuichi Yamamoto, Eunwoo Song, Min-Jae Hwang, Jae-Min Kim
NNSVS: A PyTorch-based singing voice synthesis library for research
Neural network based singing voice synthesis: https://github.com/r9y9/nnsvs
May 10, 2020
1 min read
Neural text-to-speech with a modeling-by-generation excitation vocoder
Preprint: arXiv:2008.00132, Published version: ISCA Archive Interspeech 2020
Eunwoo Song, Min-Jae Hwang, Ryuichi Yamamoto, Jin-Seob Kim, Ohsung Kwon, Jae-Min Kim
ESPnet-TTS: A toolkit to accelerate research on end-to-end speech synthesis @ ASJ 2020s
Mar 16, 2020 1:00 PM — 1:30 PM
Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Kazuya Takeda, Tomoki Toda, Shinji Watanabe
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Preprint: arXiv:1910.10909 (accepted to ICASSP 2020)
Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
Preprint: arXiv:1910.11480 (accepted to ICASSP 2020)
Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim
Probability Density Distillation with Generative Adversarial Networks for High-Quality Parallel Waveform Generation
Preprint: arXiv:1904.04472, Published version: ISCA Archive Interspeech 2019
Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim
I built a WaveNet-based TTS / Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions [arXiv:1712.05884]
Audio samples: https://r9y9.github.io/wavenet_vocoder/
May 20, 2018
3 min read