Evaluation metrics
Todo
Coming later
melcd(vec1, vec2)
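This page does not yet document what `melcd` computes. Assuming the name stands for mel-cepstral distortion, the following is a minimal NumPy sketch of the commonly used definition, averaged over frames. It is not the library's implementation: the function name `melcd_sketch`, the `(frames, dims)` input layout, and the choice to include every coefficient passed in are assumptions made for illustration.

```python
import numpy as np


def melcd_sketch(vec1, vec2):
    """Hypothetical mel-cepstral distortion in dB (not nnmnkwii's own code).

    Assumes both inputs are time-aligned mel-cepstra of identical shape,
    either (dims,) for one frame or (frames, dims) for a sequence.
    """
    x = np.atleast_2d(np.asarray(vec1, dtype=np.float64))
    y = np.atleast_2d(np.asarray(vec2, dtype=np.float64))
    if x.shape != y.shape:
        raise ValueError("inputs must have the same shape")

    # Squared Euclidean distance between the two vectors, per frame.
    squared_dist = np.sum((x - y) ** 2, axis=-1)

    # Common definition: (10 / ln 10) * sqrt(2 * sum_d (c1_d - c2_d)^2).
    per_frame_db = (10.0 / np.log(10.0)) * np.sqrt(2.0 * squared_dist)

    # Report the mean over frames as a single scalar.
    return float(np.mean(per_frame_db))


# Two random sequences of 100 frames with 24 coefficients each.
rng = np.random.default_rng(0)
a = rng.standard_normal((100, 24))
b = rng.standard_normal((100, 24))
distortion = melcd_sketch(a, b)
```

In practice the 0th (energy) coefficient is often dropped before the comparison, for example by passing `a[:, 1:]` and `b[:, 1:]`. Whether the library does this is not stated here.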