0.1.0

Notes

  • General design documentation
    • The underlying design philosophy
    • Background
    • Goal
    • So what do we provide?
    • Design decisions
    • Development guidelines
  • Design documentation (Japanese)
    • The underlying design philosophy
    • Background
    • Goal
    • So what do we provide?
    • Design decisions
    • Development guidelines
  • A quick start guide
    • Playing with audio and its alignment file
      • Load wav file
      • Acoustic features
      • Load alignment file
      • Cut silence frames
      • Linguistic features
    • Playing with datasets
      • Get example file sources
      • Load data
      • Utterance-wise iteration
        • Memory cache iteration
      • Frame-wise iteration

Tutorials

  • DNN text-to-speech synthesis (en)
    • Data
      • Data specification
      • File data sources
      • Utterance lengths
      • What does the data look like?
      • Statistics
      • Combine datasets and normalize
    • Model
    • Train
      • Configurations
      • Training loop
      • Define models
      • Training duration model
      • Training acoustic model
    • Test
      • Parameter generation utilities
      • Listen to generated audio
  • Bidirectional-LSTM based RNNs for text-to-speech synthesis (en)
    • Data
      • Data specification
      • File data sources
      • Utterance lengths
      • What does the data look like?
      • Statistics
      • Combine datasets and normalize
    • Model
    • Train
      • Configurations
      • Training loop
      • Define models
      • Training duration model
      • Training acoustic model
    • Test
      • Parameter generation utilities
      • Listen to generated audio
  • Bidirectional-LSTM based RNNs for text-to-speech synthesis with OpenJTalk (ja)
    • Data
      • Data specification
      • File data sources
      • Utterance lengths
      • What does the data look like?
      • Statistics
      • Combine datasets and normalize
    • Model
    • Train
      • Configurations
      • Training loop
      • Define models
      • Training duration model
      • Training acoustic model
    • Test
      • Parameter generation utilities
      • Listen to generated audio
      • TTS using OpenJTalk frontend
  • GMM-based voice conversion (en)
    • Data
      • Data specification
      • File data sources
      • Convert dataset to arrays
      • What does the data look like?
      • Align source and target features
      • What does the parallel data look like?
      • Append delta features
      • Finally, we get the joint feature matrix
    • Model
      • Visualize model
        • Means
        • Covariances
    • Test
      • Listen to results
      • How different?

Package references

  • Autograd
    • Functional interface
      • nnmnkwii.autograd.mlpg
      • nnmnkwii.autograd.unit_variance_mlpg
      • nnmnkwii.autograd.modspec
    • Function classes
  • Baseline
    • GMM voice conversion
  • Datasets
    • Interface
    • Implementation
      • Dataset that supports utterance-wise iteration
      • Dataset that supports frame-wise iteration
    • Builtin data sources
    • CMU Arctic (en)
    • VCTK (en)
    • LJ-Speech (en)
    • Voice Conversion Challenge (VCC) 2016 (en)
    • Voice statistics (ja)
    • JSUT (ja)
    • JVS (ja)
  • Frontend
    • Merlin frontend
      • nnmnkwii.frontend.merlin.linguistic_features
      • nnmnkwii.frontend.merlin.duration_features
  • Functions
  • IO
    • HTS IO
      • nnmnkwii.io.hts.load
      • nnmnkwii.io.hts.load_question_set
      • nnmnkwii.io.hts.write_audacity_labels
      • nnmnkwii.io.hts.write_textgrid
  • Evaluation metrics
    • nnmnkwii.metrics.melcd
    • nnmnkwii.metrics.mean_squared_error
    • nnmnkwii.metrics.lf0_mean_squared_error
    • nnmnkwii.metrics.vuv_error
  • Parameter generation
    • nnmnkwii.paramgen.build_win_mats
    • nnmnkwii.paramgen.mlpg
    • nnmnkwii.paramgen.mlpg_grad
    • nnmnkwii.paramgen.unit_variance_mlpg_matrix
    • nnmnkwii.paramgen.reshape_means
  • Post-filters
    • nnmnkwii.postfilters.merlin_post_filter
  • Pre-processing
    • Generic
      • Utterance-wise operations
        • nnmnkwii.preprocessing.mulaw
        • nnmnkwii.preprocessing.inv_mulaw
        • nnmnkwii.preprocessing.mulaw_quantize
        • nnmnkwii.preprocessing.inv_mulaw_quantize
        • nnmnkwii.preprocessing.preemphasis
        • nnmnkwii.preprocessing.inv_preemphasis
        • nnmnkwii.preprocessing.delta_features
        • nnmnkwii.preprocessing.trim_zeros_frames
        • nnmnkwii.preprocessing.remove_zeros_frames
        • nnmnkwii.preprocessing.adjust_frame_length
        • nnmnkwii.preprocessing.adjust_frame_lengths
        • nnmnkwii.preprocessing.scale
        • nnmnkwii.preprocessing.inv_scale
        • nnmnkwii.preprocessing.minmax_scale_params
        • nnmnkwii.preprocessing.minmax_scale
        • nnmnkwii.preprocessing.inv_minmax_scale
        • nnmnkwii.preprocessing.modspec
        • nnmnkwii.preprocessing.inv_modspec
        • nnmnkwii.preprocessing.modspec_smoothing
      • Dataset-wise operations
        • nnmnkwii.preprocessing.meanvar
        • nnmnkwii.preprocessing.meanstd
        • nnmnkwii.preprocessing.minmax
    • F0
      • nnmnkwii.preprocessing.f0.interp1d
    • Alignment
  • Utilities
    • Function utilities
      • nnmnkwii.util.apply_each2d_padded
      • nnmnkwii.util.apply_each2d_trim
    • Files
      • nnmnkwii.util.example_label_file
      • nnmnkwii.util.example_audio_file
      • nnmnkwii.util.example_question_file
      • nnmnkwii.util.example_file_data_sources_for_duration_model
      • nnmnkwii.util.example_file_data_sources_for_acoustic_model
    • Linear algebra
      • nnmnkwii.util.linalg.cholesky_inv
      • nnmnkwii.util.linalg.cholesky_inv_banded

Meta information

  • Change log
    • v0.1.0 <2021-08-11>
    • v0.0.23 <2021-05-15>
    • v0.0.22 <2020-12-25>
    • v0.0.21 <2020-08-13>
    • v0.0.20 <2020-03-02>
    • v0.0.19 <2019-07-06>
    • v0.0.18 <2019-05-31>
    • v0.0.17 <2018-12-25>
    • v0.0.16 <2018-08-23>
    • v0.0.15 <2018-07-12>
    • v0.0.14 <2018-06-06>
    • v0.0.13 <2018-01-24>
    • v0.0.12 <2018-01-04>
    • v0.0.11 <2017-12-22>
    • v0.0.10 <2017-12-05>
    • v0.0.9 <2017-11-14>
    • v0.0.8 <2017-10-25>
    • v0.0.7 <2017-10-09>
    • v0.0.6 <2017-10-01>
    • v0.0.5 <2017-09-19>
    • v0.0.4 <2017-09-01>
    • v0.0.3 <2017-08-26>
    • v0.0.2 <2017-08-18>
    • v0.0.1 <2017-08-14>

Python Module Index

n
nnmnkwii
    nnmnkwii.autograd
    nnmnkwii.baseline.gmm
    nnmnkwii.datasets
    nnmnkwii.frontend.merlin
    nnmnkwii.io.hts
    nnmnkwii.metrics
    nnmnkwii.paramgen
    nnmnkwii.postfilters
    nnmnkwii.preprocessing
    nnmnkwii.preprocessing.alignment
    nnmnkwii.preprocessing.f0
    nnmnkwii.util
    nnmnkwii.util.linalg

© Copyright 2017, Ryuichi Yamamoto.
