Pre-processing
Feature transformation, feature alignment and feature normalization.
Generic
Utterance-wise operations
Mu-law companding.
Inverse of mu-law companding (mu-law expansion).
Mu-law companding followed by quantization.
Inverse of mu-law companding followed by quantization.
Pre-emphasis.
Inverse of pre-emphasis.
Compute delta features and combine them.
Remove leading and/or trailing zero frames.
Remove zero frames.
Adjust the frame length of a feature vector or matrix.
Adjust the frame lengths of two feature vectors or matrices.
Mean/variance scaling.
Inverse transform of mean/variance scaling.
Compute the parameters required for min/max scaling.
Min/max scaling for a single data array.
Inverse transform of min/max scaling for a single data array.
Modulation spectrum (MS) computation.
Inverse transform of modulation spectrum computation.
Parameter trajectory smoothing by removing high-frequency bands of the MS.
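To make the mu-law entries in the list above concrete, here is a minimal sketch of companding, expansion and the quantized variants for a single sample, written with the standard library only. The helper names (compand, expand, compand_quantize, expand_dequantize) and the default mu=255 are assumptions made for illustration; they are not the signatures of the library functions, which operate on arrays.

```python
import math


def compand(x, mu=255):
    """Compress one sample x in [-1, 1] with the mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=255):
    """Expand a companded sample y in [-1, 1] back to the linear domain."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def compand_quantize(x, mu=255):
    """Compand x, then map [-1, 1] onto the integers 0..mu (hypothetical helper)."""
    return int((compand(x, mu) + 1) / 2 * mu)


def expand_dequantize(q, mu=255):
    """Map an integer in 0..mu back to [-1, 1], then expand (hypothetical helper)."""
    return expand(2 * q / mu - 1, mu)


# Round trip: quantization is lossy, so x_hat only approximates x.
x = 0.5
q = compand_quantize(x)
x_hat = expand_dequantize(q)
print(q, x_hat)
```

Companding without quantization is invertible up to floating-point error, whereas the quantized round trip loses a small amount of precision that grows with the magnitude of the sample.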
F0
F0-specific pre-processing algorithms.
Continuous F0 interpolation from a discontinuous F0 trajectory.
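The idea behind continuous F0 interpolation can be sketched as follows, assuming that unvoiced frames are marked with an F0 value of 0, that gaps are filled linearly, and that the first and last voiced values are held at the edges. The function name interpolate_f0 is hypothetical and the sketch is not the library's implementation, which may use a different interpolation scheme.

```python
def interpolate_f0(f0):
    """Fill unvoiced frames (value 0) by linear interpolation between voiced frames."""
    voiced = [i for i, v in enumerate(f0) if v > 0]
    if not voiced:
        return list(f0)  # no voiced frame to interpolate from

    out = [float(v) for v in f0]

    # Hold the first/last voiced value over leading/trailing unvoiced frames.
    for i in range(voiced[0]):
        out[i] = out[voiced[0]]
    for i in range(voiced[-1] + 1, len(out)):
        out[i] = out[voiced[-1]]

    # Interpolate linearly inside each gap between consecutive voiced frames.
    for left, right in zip(voiced, voiced[1:]):
        for i in range(left + 1, right):
            w = (i - left) / (right - left)
            out[i] = (1 - w) * out[left] + w * out[right]
    return out


# Frames 0, 2, 3 and 5 are unvoiced in this toy trajectory (values in Hz).
print(interpolate_f0([0, 100, 0, 0, 130, 0]))
```

Voiced frames are returned unchanged, so the result agrees with the input wherever F0 was observed.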
Alignment
Alignment algorithms. These are typically useful for creating parallel data in statistical voice conversion.
Currently, there are only high-level APIs that take a tuple of
unnormalized padded data arrays (N x T x D) as input
and return padded aligned arrays with the same shape. If you are interested
in aligning a single pair of feature matrices (not a dataset), use fastdtw
directly instead.
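For intuition about what aligning a single pair of feature matrices involves, the sketch below implements classic dynamic time warping with a full dynamic-programming table and a Euclidean frame distance. This is deliberately not fastdtw, which computes an approximate alignment more efficiently; the function name dtw_path and the list-of-lists frame representation are assumptions made for illustration.

```python
import math


def dtw_path(x, y):
    """Return (total_cost, path) aligning frame sequences x and y with exact DTW."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(x[i - 1], y[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover the warping path as (i, j) frame pairs.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Two toy sequences of 1-dimensional frames with different lengths.
x = [[0.0], [1.0], [2.0]]
y = [[0.0], [1.0], [1.0], [2.0]]
total, path = dtw_path(x, y)
x_aligned = [x[i] for i, _ in path]
y_aligned = [y[j] for _, j in path]
print(total, path)
```

Indexing both sequences with the warping path yields two sequences of equal length, which is the property needed to build parallel data.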
class nnmnkwii.preprocessing.alignment.DTWAligner(dist=<function DTWAligner.<lambda>>, radius=1, verbose=0)
    Align feature matrices using fastdtw.

    dist
        Distance function. Default is numpy.linalg.norm().
        Type: function

    Examples
    >>> from nnmnkwii.util import example_file_data_sources_for_duration_model
    >>> from nnmnkwii.datasets import FileSourceDataset
    >>> from nnmnkwii.preprocessing.alignment import DTWAligner
    >>> _, X = example_file_data_sources_for_duration_model()
    >>> X = FileSourceDataset(X).asarray()
    >>> X.shape
    (3, 40, 5)
    >>> Y = X.copy()
    >>> X_aligned, Y_aligned = DTWAligner().transform((X, Y))
    >>> X_aligned.shape
    (3, 40, 5)
    >>> Y_aligned.shape
    (3, 40, 5)

class nnmnkwii.preprocessing.alignment.IterativeDTWAligner(n_iter=3, dist=<function IterativeDTWAligner.<lambda>>, radius=1, max_iter_gmm=100, n_components_gmm=16, verbose=0)
    Align feature matrices iteratively using GMM-based feature conversion.

    dist
        Distance function.
        Type: function

    Examples
    >>> from nnmnkwii.util import example_file_data_sources_for_duration_model
    >>> from nnmnkwii.datasets import FileSourceDataset
    >>> from nnmnkwii.preprocessing.alignment import IterativeDTWAligner
    >>> _, X = example_file_data_sources_for_duration_model()
    >>> X = FileSourceDataset(X).asarray()
    >>> X.shape
    (3, 40, 5)
    >>> Y = X.copy()
    >>> X_aligned, Y_aligned = IterativeDTWAligner(n_iter=1).transform((X, Y))
    >>> X_aligned.shape
    (3, 40, 5)
    >>> Y_aligned.shape
    (3, 40, 5)