Pre-processing

Feature transformation, feature alignment and feature normalization.

Generic

Utterance-wise operations

mulaw(x[, mu])

Mu-Law companding

inv_mulaw(y[, mu])

Inverse of mu-law companding (mu-law expansion)

mulaw_quantize(x[, mu])

Mu-Law companding + quantize

inv_mulaw_quantize(y[, mu])

Inverse of mu-law companding + quantize
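
To illustrate what the four mu-law functions above compute, here is a minimal NumPy sketch. It is not the library's implementation: the `*_sketch` names are hypothetical, mu=255 is an assumed default, and the rounding rule used for quantization is one common choice among several.

```python
import numpy as np

def mulaw_sketch(x, mu=255):
    """Compress x in [-1, 1] with the mu-law curve; output is also in [-1, 1]."""
    x = np.asarray(x, dtype=np.float64)
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def inv_mulaw_sketch(y, mu=255):
    """Expand a mu-law compressed signal back to the linear domain."""
    y = np.asarray(y, dtype=np.float64)
    return np.sign(y) * ((1.0 + mu) ** np.abs(y) - 1.0) / mu

def mulaw_quantize_sketch(x, mu=255):
    """Compress, then map [-1, 1] onto integer classes 0..mu (assumed rounding)."""
    y = mulaw_sketch(x, mu)
    return np.floor((y + 1.0) / 2.0 * mu + 0.5).astype(np.int64)

def inv_mulaw_quantize_sketch(q, mu=255):
    """Map integer classes 0..mu back to [-1, 1], then expand."""
    y = 2.0 * np.asarray(q, dtype=np.float64) / mu - 1.0
    return inv_mulaw_sketch(y, mu)

x = np.linspace(-1.0, 1.0, 11)
x_roundtrip = inv_mulaw_sketch(mulaw_sketch(x))  # lossless up to float error
q = mulaw_quantize_sketch(x)                     # integer classes
x_dequantized = inv_mulaw_quantize_sketch(q)     # lossy reconstruction
```

Companding alone is invertible; the quantized variant loses precision, with finer resolution near zero than near the extremes.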

preemphasis(x[, coef])

Pre-emphasis

inv_preemphasis(x[, coef])

Inverse operation of pre-emphasis
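
Pre-emphasis is usually a first-order high-pass filter, and its inverse is the matching recursive filter. The following sketch shows that pairing under stated assumptions: the `*_sketch` names are hypothetical, coef=0.97 is an assumed default, and the first sample is passed through unchanged.

```python
import numpy as np

def preemphasis_sketch(x, coef=0.97):
    """First-order high-pass: y[n] = x[n] - coef * x[n-1], with y[0] = x[0]."""
    x = np.asarray(x, dtype=np.float64)
    y = x.copy()
    y[1:] -= coef * x[:-1]
    return y

def inv_preemphasis_sketch(y, coef=0.97):
    """Undo the filter recursively: x[n] = y[n] + coef * x[n-1]."""
    y = np.asarray(y, dtype=np.float64)
    x = np.empty_like(y)
    prev = 0.0
    for n, value in enumerate(y):
        prev = value + coef * prev
        x[n] = prev
    return x

signal = np.sin(2 * np.pi * 5 * np.arange(200) / 200.0)
emphasized = preemphasis_sketch(signal)
restored = inv_preemphasis_sketch(emphasized)
```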

delta_features(x, windows)

Compute delta features and combine them.
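
The sketch below shows one way delta features can be computed and stacked with the static features. It assumes each window is a `(left width, right width, coefficients)` tuple and repeats edge frames at the boundaries; both the window format and the boundary handling are assumptions, and `delta_features_sketch` is a hypothetical name, not the library function.

```python
import numpy as np

# Assumed window format: (left width, right width, coefficients).
windows = [
    (0, 0, np.array([1.0])),             # static
    (1, 1, np.array([-0.5, 0.0, 0.5])),  # delta
    (1, 1, np.array([1.0, -2.0, 1.0])),  # delta-delta
]

def delta_features_sketch(x, windows):
    """Apply each window along time and stack the results along the feature axis."""
    x = np.asarray(x, dtype=np.float64)
    T, D = x.shape
    out = []
    for left, right, coefs in windows:
        # Repeat edge frames so every output frame sees a full window.
        padded = np.pad(x, ((left, right), (0, 0)), mode="edge")
        feat = np.zeros((T, D))
        for k, c in enumerate(coefs):
            feat += c * padded[k:k + T]
        out.append(feat)
    return np.hstack(out)

static = np.arange(10, dtype=np.float64).reshape(5, 2)  # T=5 frames, D=2
combined = delta_features_sketch(static, windows)       # shape (5, 6)
```

With three windows, a (T x D) input becomes (T x 3D): static, delta and delta-delta blocks side by side.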

trim_zeros_frames(x[, eps, trim])

Remove leading and/or trailing all-zero frames.

remove_zeros_frames(x[, eps])

Remove all-zero frames.
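
The difference between trimming and removing is easiest to see on a small example. This is a hedged sketch with hypothetical `*_sketch` names; it treats a frame as zero when the sum of its absolute values is at most `eps`, which is an assumed criterion.

```python
import numpy as np

def trim_zeros_frames_sketch(x, eps=1e-7):
    """Drop all-zero frames at the start and end, keep interior ones."""
    x = np.asarray(x)
    nonzero = np.flatnonzero(np.abs(x).sum(axis=1) > eps)
    if nonzero.size == 0:
        return x[:0]
    return x[nonzero[0]:nonzero[-1] + 1]

def remove_zeros_frames_sketch(x, eps=1e-7):
    """Drop every all-zero frame, wherever it occurs."""
    x = np.asarray(x)
    return x[np.abs(x).sum(axis=1) > eps]

frames = np.array([[0.0, 0.0], [1.0, 2.0], [0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
trimmed = trim_zeros_frames_sketch(frames)    # keeps the interior zero frame
removed = remove_zeros_frames_sketch(frames)  # keeps only non-zero frames
```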

adjust_frame_length(x[, pad, divisible_by])

Adjust frame length given a feature vector or matrix.

adjust_frame_lengths(x, y[, pad, …])

Adjust frame lengths given two feature vectors or matrices.
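
Frame-length adjustment is typically needed before batching or before comparing two sequences frame by frame. The sketch below covers only the zero-padding case; the `*_sketch` names are hypothetical, and the behaviour of the library's `pad` flag is not reproduced here.

```python
import numpy as np

def adjust_frame_length_sketch(x, divisible_by=2):
    """Zero-pad along time so the frame count is a multiple of divisible_by."""
    x = np.asarray(x)
    extra = (-x.shape[0]) % divisible_by
    pad_width = [(0, extra)] + [(0, 0)] * (x.ndim - 1)
    return np.pad(x, pad_width, mode="constant")

def adjust_frame_lengths_sketch(x, y, divisible_by=1):
    """Zero-pad both inputs to a common length that is a multiple of divisible_by."""
    target = max(len(x), len(y))
    target += (-target) % divisible_by

    def pad_to(a):
        a = np.asarray(a)
        pad_width = [(0, target - a.shape[0])] + [(0, 0)] * (a.ndim - 1)
        return np.pad(a, pad_width, mode="constant")

    return pad_to(x), pad_to(y)

a = np.ones((5, 3))
b = np.ones((7, 3))
a_single = adjust_frame_length_sketch(a, divisible_by=4)         # 5 -> 8 frames
a_pair, b_pair = adjust_frame_lengths_sketch(a, b, divisible_by=2)  # both -> 8 frames
```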

scale(x, data_mean, data_std)

Mean/variance scaling.

inv_scale(x, data_mean, data_std)

Inverse transform of mean/variance scaling.
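
Mean/variance scaling standardizes each feature dimension, and the inverse restores the original scale. A minimal sketch of the arithmetic, with hypothetical `*_sketch` names and statistics computed from the example data itself:

```python
import numpy as np

def scale_sketch(x, data_mean, data_std):
    """Standardize: subtract the mean and divide by the standard deviation."""
    return (x - data_mean) / data_std

def inv_scale_sketch(x, data_mean, data_std):
    """Undo the standardization."""
    return x * data_std + data_mean

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(100, 4))  # (T, D) features
data_mean = x.mean(axis=0)
data_std = x.std(axis=0)

x_scaled = scale_sketch(x, data_mean, data_std)
x_restored = inv_scale_sketch(x_scaled, data_mean, data_std)
```

In practice the statistics would come from a training set rather than from the array being scaled.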

minmax_scale_params(data_min, data_max[, …])

Compute parameters required to perform min/max scaling.

minmax_scale(x[, data_min, data_max, …])

Min/max scaling for a single data array.

inv_minmax_scale(x[, data_min, data_max, …])

Inverse transform of min/max scaling for a single data array.
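
Min/max scaling maps each dimension linearly from its observed range onto a target range. The sketch below assumes a `feature_range` tuple as the target and hypothetical `*_sketch` names; it does not guard against dimensions where the minimum equals the maximum.

```python
import numpy as np

def minmax_scale_sketch(x, data_min, data_max, feature_range=(0.0, 1.0)):
    """Map [data_min, data_max] linearly onto feature_range, per dimension."""
    lo, hi = feature_range
    return (x - data_min) / (data_max - data_min) * (hi - lo) + lo

def inv_minmax_scale_sketch(x, data_min, data_max, feature_range=(0.0, 1.0)):
    """Map feature_range back onto [data_min, data_max]."""
    lo, hi = feature_range
    return (x - lo) / (hi - lo) * (data_max - data_min) + data_min

x = np.array([[1.0, 10.0], [2.0, 30.0], [4.0, 20.0]])
data_min = x.min(axis=0)
data_max = x.max(axis=0)

x_scaled = minmax_scale_sketch(x, data_min, data_max, feature_range=(0.01, 0.99))
x_restored = inv_minmax_scale_sketch(x_scaled, data_min, data_max,
                                     feature_range=(0.01, 0.99))
```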

modspec(x[, n, norm, return_phase])

Modulation spectrum (MS) computation

inv_modspec(ms, phase[, norm])

Inverse transform of modulation spectrum computation

modspec_smoothing(x, modfs[, n, norm, …])

Parameter trajectory smoothing by removing high frequency bands of MS.
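
The modulation spectrum treats each feature dimension as a signal over time and looks at its spectrum. The sketch below is an assumption-laden illustration rather than the library's code: it uses the power spectrum from a real FFT along the time axis, keeps the phase so the trajectory can be rebuilt, and smooths by zeroing bins above a hypothetical `cutoff` in Hz. All `*_sketch` names and the chosen frame rate are illustrative.

```python
import numpy as np

def modspec_sketch(x, n=64):
    """Power spectrum and unit-magnitude phase of each trajectory (FFT over time)."""
    spec = np.fft.rfft(x, n=n, axis=0)
    return np.abs(spec) ** 2, np.exp(1j * np.angle(spec))

def inv_modspec_sketch(ms, phase, n=64):
    """Rebuild the (zero-padded) trajectory from power spectrum and phase."""
    return np.fft.irfft(np.sqrt(ms) * phase, n=n, axis=0)

def modspec_smoothing_sketch(x, modfs, cutoff, n=64):
    """Zero every modulation-frequency bin above cutoff (Hz), then invert."""
    spec = np.fft.rfft(x, n=n, axis=0)
    freqs = np.fft.rfftfreq(n, d=1.0 / modfs)
    spec[freqs > cutoff] = 0.0
    return np.fft.irfft(spec, n=n, axis=0)[: len(x)]

modfs = 200.0  # frame rate in Hz (5 ms frame shift), an assumed value
t = np.arange(64) / modfs
slow = np.sin(2 * np.pi * 12.5 * t)        # component to keep
fast = 0.3 * np.sin(2 * np.pi * 75.0 * t)  # component to remove
traj = (slow + fast)[:, None]              # shape (T, D) = (64, 1)

ms, phase = modspec_sketch(traj)
reconstructed = inv_modspec_sketch(ms, phase)[: len(traj)]
smoothed = modspec_smoothing_sketch(traj, modfs, cutoff=50.0)
```

Here `modfs` is the frame rate of the trajectory, so the highest representable modulation frequency is `modfs / 2`.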

Dataset-wise operations

meanvar(dataset[, lengths, mean_, var_, …])

Mean/variance computation given an iterable dataset

meanstd(dataset[, lengths, mean_, var_, …])

Mean/std-deviation computation given an iterable dataset

minmax(dataset[, lengths])

Min/max computation given an iterable dataset
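
Dataset-wise statistics can be accumulated one utterance at a time, so the whole dataset never has to be concatenated in memory. The sketch below illustrates that idea with hypothetical `*_sketch` functions over a list of variable-length (T x D) arrays; it does not model the `lengths` argument, and the sum-of-squares formula is a simple choice that can lose precision on very large datasets.

```python
import numpy as np

def meanvar_sketch(dataset):
    """Accumulate per-dimension sums over all frames of all utterances."""
    count, total, total_sq = 0, None, None
    for x in dataset:
        x = np.asarray(x, dtype=np.float64)
        if total is None:
            total = np.zeros(x.shape[1])
            total_sq = np.zeros(x.shape[1])
        count += x.shape[0]
        total += x.sum(axis=0)
        total_sq += (x ** 2).sum(axis=0)
    mean = total / count
    var = total_sq / count - mean ** 2
    return mean, var

def minmax_sketch(dataset):
    """Track the running per-dimension minimum and maximum."""
    data_min, data_max = None, None
    for x in dataset:
        x = np.asarray(x, dtype=np.float64)
        lo, hi = x.min(axis=0), x.max(axis=0)
        data_min = lo if data_min is None else np.minimum(data_min, lo)
        data_max = hi if data_max is None else np.maximum(data_max, hi)
    return data_min, data_max

rng = np.random.default_rng(0)
utterances = [rng.normal(size=(T, 3)) for T in (5, 8, 3)]  # variable lengths

mean, var = meanvar_sketch(utterances)
std = np.sqrt(var)
data_min, data_max = minmax_sketch(utterances)
```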

F0

F0-specific pre-processing algorithms.

interp1d(f0[, kind])

Continuous F0 interpolation from a discontinuous F0 trajectory
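
An F0 trajectory is discontinuous because unvoiced frames carry no pitch and are commonly stored as zeros. The sketch below fills those frames by linear interpolation between neighbouring voiced frames. It assumes the zero-for-unvoiced convention, handles only the linear case (not other `kind` values), and `interp_f0_sketch` is a hypothetical name.

```python
import numpy as np

def interp_f0_sketch(f0):
    """Fill unvoiced (zero) frames by linear interpolation between voiced frames."""
    f0 = np.asarray(f0, dtype=np.float64)
    voiced = f0 > 0
    if not voiced.any():
        return f0.copy()
    idx = np.arange(len(f0))
    # np.interp holds the first/last voiced value at the edges.
    return np.interp(idx, idx[voiced], f0[voiced])

f0 = np.array([0.0, 100.0, 0.0, 0.0, 130.0, 0.0])
continuous_f0 = interp_f0_sketch(f0)
# -> [100., 100., 110., 120., 130., 130.]
```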

Alignment

Alignment algorithms. This is typically useful for creating parallel data in statistical voice conversion.

Currently, there are only high-level APIs that take a tuple of unnormalized padded data arrays (N x T x D) as input and return padded aligned arrays with the same shape. If you want to align a single pair of feature matrices (rather than a dataset), use fastdtw directly instead.
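
For a single pair of feature matrices, the idea behind DTW alignment can be sketched without extra dependencies. The code below is an exact, quadratic-time DTW written in NumPy as a stand-in: it is not fastdtw (an approximation controlled by a radius parameter) and not what the aligner classes run internally. `dtw_path_sketch` is a hypothetical name and the Euclidean frame distance is an assumption.

```python
import numpy as np

def dtw_path_sketch(x, y):
    """Exact DTW between x (Tx, D) and y (Ty, D) using Euclidean frame distance."""
    Tx, Ty = len(x), len(y)
    cost = np.full((Tx + 1, Ty + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, Tx + 1):
        for j in range(1, Ty + 1):
            d = np.linalg.norm(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the last cell to recover the warping path.
    i, j = Tx, Ty
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
        path.append((i - 1, j - 1))
    return cost[Tx, Ty], path[::-1]

x = np.arange(6.0)[:, None]                                        # 6 frames
y = np.array([0.0, 0.0, 1.0, 2.0, 3.0, 3.0, 4.0, 5.0])[:, None]    # stretched copy

dist, path = dtw_path_sketch(x, y)
x_aligned = x[[i for i, _ in path]]
y_aligned = y[[j for _, j in path]]
```

Indexing both inputs with the path gives two sequences of equal length whose frames correspond one to one, which is the form parallel training data needs.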

class nnmnkwii.preprocessing.alignment.DTWAligner(dist=<function DTWAligner.<lambda>>, radius=1, verbose=0)

Align feature matrices using fastdtw.

dist

Distance function. Default is numpy.linalg.norm().

Type

function

radius

Radius parameter in fastdtw.

Type

int

verbose

Verbose flag. Default is 0.

Type

int

Examples

>>> from nnmnkwii.util import example_file_data_sources_for_duration_model
>>> from nnmnkwii.datasets import FileSourceDataset
>>> from nnmnkwii.preprocessing.alignment import DTWAligner
>>> _, X = example_file_data_sources_for_duration_model()
>>> X = FileSourceDataset(X).asarray()
>>> X.shape
(3, 40, 5)
>>> Y = X.copy()
>>> X_aligned, Y_aligned = DTWAligner().transform((X, Y))
>>> X_aligned.shape
(3, 40, 5)
>>> Y_aligned.shape
(3, 40, 5)

class nnmnkwii.preprocessing.alignment.IterativeDTWAligner(n_iter=3, dist=<function IterativeDTWAligner.<lambda>>, radius=1, max_iter_gmm=100, n_components_gmm=16, verbose=0)

Align feature matrices iteratively using GMM-based feature conversion.

n_iter

Number of iterations.

Type

int

dist

Distance function.

Type

function

radius

Radius parameter in fastdtw.

Type

int

verbose

Verbose flag. Default is 0.

Type

int

max_iter_gmm

Maximum number of iterations for GMM training.

Type

int

n_components_gmm

Number of mixture components in GMM.

Type

int

Examples

>>> from nnmnkwii.util import example_file_data_sources_for_duration_model
>>> from nnmnkwii.datasets import FileSourceDataset
>>> from nnmnkwii.preprocessing.alignment import IterativeDTWAligner
>>> _, X = example_file_data_sources_for_duration_model()
>>> X = FileSourceDataset(X).asarray()
>>> X.shape
(3, 40, 5)
>>> Y = X.copy()
>>> X_aligned, Y_aligned = IterativeDTWAligner(n_iter=1).transform((X, Y))
>>> X_aligned.shape
(3, 40, 5)
>>> Y_aligned.shape
(3, 40, 5)