0.1.0
nnmnkwii.util.linalg.cholesky_inv_banded
nnmnkwii.util.linalg.cholesky_inv_banded(L, width=3)
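This page gives only the signature, so the following is a minimal sketch of the general technique the name suggests, not the library's implementation. It assumes that `L` is the lower Cholesky factor of a symmetric positive definite banded matrix and that `width=3` refers to three non-zero diagonals (a tridiagonal matrix); both readings are assumptions. The helper `inv_from_cholesky` is a hypothetical name used for illustration and relies on NumPy alone.

```python
import numpy as np


def inv_from_cholesky(L):
    """Return inv(A) for A = L @ L.T, given the lower Cholesky factor L.

    Hypothetical helper for illustration; not part of nnmnkwii.
    """
    # Solve L @ X = I for X = inv(L). np.linalg.solve does not exploit the
    # triangular or banded structure, so this is for clarity, not speed.
    L_inv = np.linalg.solve(L, np.eye(L.shape[0]))
    # inv(A) = inv(L.T) @ inv(L) = inv(L).T @ inv(L)
    return L_inv.T @ L_inv


# Build a small symmetric positive definite tridiagonal matrix
# (three non-zero diagonals; strictly diagonally dominant).
T = 5
A = 2.0 * np.eye(T) - 0.5 * np.eye(T, k=1) - 0.5 * np.eye(T, k=-1)

L = np.linalg.cholesky(A)      # lower triangular factor, A = L @ L.T
A_inv = inv_from_cholesky(L)   # inverse recovered from the factor alone
```

For a tridiagonal `A` the factor `L` is itself banded (only the main diagonal and first subdiagonal are non-zero), which is the structure a dedicated banded routine can exploit to avoid dense computation.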