Pre-processing

Feature transformation, feature alignment and feature normalization.

Generic

Utterance-wise operations

`mulaw`(x[, mu])	Mu-Law companding
`inv_mulaw`(y[, mu])	Inverse of mu-law companding (mu-law expansion)
`mulaw_quantize`(x[, mu])	Mu-Law companding + quantize
`inv_mulaw_quantize`(y[, mu])	Inverse of mu-law companding + quantize
`preemphasis`(x[, coef])	Pre-emphasis
`inv_preemphasis`(x[, coef])	Inverse operation of pre-emphasis
`delta_features`(x, windows)	Compute delta features and combine them.
`trim_zeros_frames`(x[, eps])	Remove trailing zero frames.
`remove_zeros_frames`(x[, eps])	Remove zero frames.
`adjust_frame_length`(x[, pad, divisible_by])	Adjust frame length given a feature vector or matrix.
`adjust_frame_lengths`(x, y[, pad, …])	Adjust frame lengths given two feature vectors or matrices.
`scale`(x, data_mean, data_std)	Mean/variance scaling.
`inv_scale`(x, data_mean, data_std)	Inverse transform of mean/variance scaling.
`minmax_scale_params`(data_min, data_max[, …])	Compute parameters required to perform min/max scaling.
`minmax_scale`(x[, data_min, data_max, …])	Min/max scaling of a single data array.
`inv_minmax_scale`(x[, data_min, data_max, …])	Inverse transform of min/max scaling of a single data array.
`modspec`(x[, n, norm, return_phase])	Modulation spectrum (MS) computation
`inv_modspec`(ms, phase[, norm])	Inverse transform of modulation spectrum computation
`modspec_smoothing`(x, modfs[, n, norm, …])	Parameter trajectory smoothing by removing high frequency bands of MS.
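
As a rough illustration of what the mu-law pair in the table computes, the following pure-Python sketch applies the textbook companding formula to scalar samples in [-1, 1]. The names `mulaw_sketch` and `inv_mulaw_sketch` and the choice of mu = 255 are assumptions made for this example; the library functions may differ in defaults and input types.

```python
import math


def mulaw_sketch(x, mu=255):
    """Compress a sample x in [-1, 1]: sign(x) * ln(1 + mu|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def inv_mulaw_sketch(y, mu=255):
    """Expand a companded sample y: sign(y) * ((1 + mu)^|y| - 1) / mu."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


# Hypothetical input samples, chosen only for illustration.
samples = [-1.0, -0.5, 0.0, 0.01, 0.5, 1.0]
companded = [mulaw_sketch(s) for s in samples]
restored = [inv_mulaw_sketch(c) for c in companded]
```

Companding boosts small amplitudes (the sample 0.01 maps to a noticeably larger value), and expansion recovers the original samples up to floating-point error.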

Dataset-wise operations

`meanvar`(dataset[, lengths, mean_, var_, …])	Mean/variance computation given an iterable dataset
`meanstd`(dataset[, lengths, mean_, var_, …])	Mean/std-deviation computation given an iterable dataset
`minmax`(dataset[, lengths])	Min/max computation given an iterable dataset
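
To make the dataset-wise idea concrete, here is a minimal pure-Python sketch that accumulates per-dimension statistics over every frame of every utterance and then applies mean/variance scaling. The helper name `meanvar_sketch`, the toy data, and the use of the population variance are assumptions for illustration, not the library's implementation.

```python
import math


def meanvar_sketch(dataset):
    """Per-dimension mean and population variance over all frames."""
    total, total_sq, n_frames = None, None, 0
    for utterance in dataset:          # utterances may differ in length
        for frame in utterance:        # each frame is a list of D floats
            if total is None:
                total = [0.0] * len(frame)
                total_sq = [0.0] * len(frame)
            for d, value in enumerate(frame):
                total[d] += value
                total_sq[d] += value * value
            n_frames += 1
    mean = [s / n_frames for s in total]
    var = [sq / n_frames - m * m for sq, m in zip(total_sq, mean)]
    return mean, var


# Two hypothetical utterances with 2 and 1 frames, 2 dimensions each.
dataset = [[[1.0, 10.0], [3.0, 30.0]], [[5.0, 50.0]]]
mean, var = meanvar_sketch(dataset)
std = [math.sqrt(v) for v in var]

# Mean/variance scaling of every frame with the dataset statistics.
scaled = [
    [[(v - m) / s for v, m, s in zip(frame, mean, std)] for frame in utt]
    for utt in dataset
]
```

Accumulating sums frame by frame keeps memory use constant, which is the reason such statistics are computed over an iterable rather than a single stacked array.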

F0

F0-specific pre-processing algorithms.

`interp1d`(f0[, kind])	Continuous F0 interpolation from a discontinuous F0 trajectory
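
The sketch below shows one way such an interpolation can work, assuming unvoiced frames are marked with 0 and using linear interpolation between voiced neighbours. The helper name `interp_f0_sketch` and the edge handling (copying the nearest voiced value) are assumptions for this example and may not match the library's behavior.

```python
def interp_f0_sketch(f0):
    """Fill unvoiced (zero) frames of an F0 contour by linear interpolation."""
    voiced = [i for i, value in enumerate(f0) if value > 0]
    if not voiced:
        return list(f0)  # nothing to interpolate from
    out = list(f0)
    # Leading and trailing unvoiced frames take the nearest voiced value.
    for i in range(voiced[0]):
        out[i] = f0[voiced[0]]
    for i in range(voiced[-1] + 1, len(f0)):
        out[i] = f0[voiced[-1]]
    # Interpolate linearly inside each gap between voiced frames.
    for left, right in zip(voiced, voiced[1:]):
        for i in range(left + 1, right):
            out[i] = f0[left] + (f0[right] - f0[left]) * (i - left) / (right - left)
    return out


# Hypothetical contour in Hz; zeros mark unvoiced frames.
f0 = [0.0, 100.0, 0.0, 0.0, 130.0, 0.0]
continuous = interp_f0_sketch(f0)
```

Voiced frames are left untouched, so the result differs from the input only where the input was zero.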

Alignment

Alignment algorithms. This is typically useful for creating parallel data in statistical voice conversion.

Currently, there are only high-level APIs, which take a tuple of unnormalized padded data arrays (N x T x D) as input and return padded, aligned arrays of the same shape. If you want to align a single pair of feature matrices (not a dataset), use fastdtw directly instead.
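
For a single pair of feature matrices, the underlying idea can be sketched with classic dynamic time warping. Note that this is plain quadratic-time DTW written in pure Python, not the approximate algorithm implemented by fastdtw, and the helper name `dtw_path_sketch` is hypothetical.

```python
import math


def dtw_path_sketch(x, y):
    """Return (total cost, warping path) between two sequences of frames."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(x[i - 1], y[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy 1-dimensional feature matrices of different lengths.
x = [[0.0], [1.0], [2.0]]
y = [[0.0], [0.0], [1.0], [2.0]]
distance, path = dtw_path_sketch(x, y)

# Index both sequences along the path to obtain equal-length aligned frames.
x_aligned = [x[i] for i, _ in path]
y_aligned = [y[j] for _, j in path]
```

Indexing both inputs along the returned path yields two sequences of equal length, which is the form needed for frame-wise parallel data.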

class nnmnkwii.preprocessing.alignment.DTWAligner(dist=<function DTWAligner.<lambda>>, radius=1, verbose=0)

Align feature matrices using fastdtw.

dist: function – Distance function. Default is numpy.linalg.norm().

radius: int – Radius parameter in fastdtw.

verbose: int – Verbose flag. Default is 0.

Examples

>>> from nnmnkwii.util import example_file_data_sources_for_duration_model
>>> from nnmnkwii.datasets import FileSourceDataset
>>> from nnmnkwii.preprocessing.alignment import DTWAligner
>>> _, X = example_file_data_sources_for_duration_model()
>>> X = FileSourceDataset(X).asarray()
>>> X.shape
(3, 40, 5)
>>> Y = X.copy()
>>> X_aligned, Y_aligned = DTWAligner().transform((X, Y))
>>> X_aligned.shape
(3, 40, 5)
>>> Y_aligned.shape
(3, 40, 5)

class nnmnkwii.preprocessing.alignment.IterativeDTWAligner(n_iter=3, dist=<function IterativeDTWAligner.<lambda>>, radius=1, max_iter_gmm=100, n_components_gmm=16, verbose=0)

Align feature matrices iteratively using GMM-based feature conversion.

n_iter: int – Number of iterations.

dist: function – Distance function.

radius: int – Radius parameter in fastdtw.

verbose: int – Verbose flag. Default is 0.

max_iter_gmm: int – Maximum number of iterations for training the GMM.

n_components_gmm: int – Number of mixture components in the GMM.

Examples

>>> from nnmnkwii.util import example_file_data_sources_for_duration_model
>>> from nnmnkwii.datasets import FileSourceDataset
>>> from nnmnkwii.preprocessing.alignment import IterativeDTWAligner
>>> _, X = example_file_data_sources_for_duration_model()
>>> X = FileSourceDataset(X).asarray()
>>> X.shape
(3, 40, 5)
>>> Y = X.copy()
>>> X_aligned, Y_aligned = IterativeDTWAligner(n_iter=1).transform((X, Y))
>>> X_aligned.shape
(3, 40, 5)
>>> Y_aligned.shape
(3, 40, 5)