Baseline

Generic baseline algorithms that can be used as building blocks.

GMM voice conversion

class nnmnkwii.baseline.gmm.MLPG(gmm, windows=None, swap=False, diff=False)

Maximum Likelihood Parameter Generation (MLPG) for GMM-based voice conversion [1].

Notes

Source speaker’s feature: X = {x_t}, 0 <= t < T
Target speaker’s feature: Y = {y_t}, 0 <= t < T

where T is the number of time frames.

See paper [1] for details.

The code was adapted from https://gist.github.com/r9y9/88bda659c97f46f42525.

Parameters

gmm (sklearn.mixture.GaussianMixture) – Gaussian mixture model of the joint source and target features.
windows (list) – List of windows. See nnmnkwii.functions.mlpg() for details.
swap (bool) – If True, source -> target, otherwise target -> source.
diff (bool) – Convert GMM -> DIFFGMM if True.
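To make the window format concrete, here is a minimal sketch of how a list of (left width, right width, coefficients) windows can turn static features into stacked static and dynamic features. The helper name `apply_windows` is hypothetical and not part of nnmnkwii, and the zero padding at the sequence edges and the [static, delta, delta-delta] layout are assumptions made for illustration.

```python
import numpy as np


def apply_windows(static, windows):
    """Apply each window to a (T, D) static sequence and stack the results.

    windows: list of (left_width, right_width, coefficients) tuples.
    Frames outside the sequence are treated as zeros (an assumption).
    """
    blocks = []
    for _, _, coef in windows:
        # Correlate every feature dimension with the window coefficients.
        filtered = np.stack(
            [np.correlate(static[:, d], coef, mode="same")
             for d in range(static.shape[1])],
            axis=1,
        )
        blocks.append(filtered)
    # Resulting shape: (T, D * len(windows)).
    return np.concatenate(blocks, axis=1)


windows = [
    (0, 0, np.array([1.0])),             # static
    (1, 1, np.array([-0.5, 0.0, 0.5])),  # delta
    (1, 1, np.array([1.0, -2.0, 1.0])),  # delta-delta
]
T, D = 10, 2
# A linear ramp in every dimension, so the interior delta should be 1.
static = np.tile(np.arange(T, dtype=float)[:, None], (1, D))
features = apply_windows(static, windows)
```

With these windows, the first D columns reproduce the static features, and the remaining columns hold first- and second-order differences. This matches the feature dimension `static_dim * len(windows)` expected by the example in this section.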

num_mixtures

The number of Gaussian mixture components

Type: int

weights

shape (num_mixtures), weights for each Gaussian

Type: array

src_means

shape (num_mixtures, order of spectral feature) means of GMM for a source speaker

Type: array

tgt_means

shape (num_mixtures, order of spectral feature) means of GMM for a target speaker

Type: array

covarXX

shape (num_mixtures, order of spectral feature, order of spectral feature) covariance matrix of source speaker’s spectral feature

Type: array

covarXY

shape (num_mixtures, order of spectral feature, order of spectral feature) covariance matrix of source and target speaker’s spectral feature

Type: array

covarYX

shape (num_mixtures, order of spectral feature, order of spectral feature) covariance matrix of target and source speaker’s spectral feature

Type: array

covarYY

shape (num_mixtures, order of spectral feature, order of spectral feature) covariance matrix of target speaker’s spectral feature

Type: array

D

shape (num_mixtures, order of spectral feature, order of spectral feature) covariance matrices of target static spectral features

Type: array

px

Gaussian Mixture Models of source speaker’s features

Type: sklearn.mixture.GaussianMixture
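As a rough illustration of how the mean and covariance attributes above relate to a joint GMM, the sketch below slices stand-in parameter arrays into source and target blocks. It assumes full covariance matrices and a joint feature layout of [source, target]. The random arrays stand in for the fitted model's `means_` and `covariances_`, and the helper name `split_joint_params` is hypothetical; this is not necessarily how the class is implemented.

```python
import numpy as np


def split_joint_params(means, covariances):
    """Split joint GMM parameters into source and target blocks.

    Assumes means has shape (M, 2 * D) and covariances has shape
    (M, 2 * D, 2 * D), with source features in the first D dimensions.
    """
    D = means.shape[1] // 2
    return {
        "src_means": means[:, :D],
        "tgt_means": means[:, D:],
        "covarXX": covariances[:, :D, :D],
        "covarXY": covariances[:, :D, D:],
        "covarYX": covariances[:, D:, :D],
        "covarYY": covariances[:, D:, D:],
    }


rng = np.random.default_rng(0)
M, D = 4, 6  # number of mixtures, per-speaker feature dimension
means = rng.standard_normal((M, 2 * D))
# Build symmetric positive-definite stand-in covariance matrices.
A = rng.standard_normal((M, 2 * D, 2 * D))
covariances = A @ A.transpose(0, 2, 1) + 1e-3 * np.eye(2 * D)

blocks = split_joint_params(means, covariances)
```

Because each joint covariance matrix is symmetric, the `covarXY` block of a mixture is the transpose of its `covarYX` block.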

Examples

>>> from sklearn.mixture import GaussianMixture
>>> from nnmnkwii.baseline.gmm import MLPG
>>> import numpy as np
>>> static_dim, T = 24, 10
>>> windows = [
...     (0, 0, np.array([1.0])),
...     (1, 1, np.array([-0.5, 0.0, 0.5])),
...     (1, 1, np.array([1.0, -2.0, 1.0])),
... ]
>>> src = np.random.rand(T, static_dim * len(windows))
>>> tgt = np.random.rand(T, static_dim * len(windows))
>>> XY = np.concatenate((src, tgt), axis=-1) # pseudo parallel data
>>> gmm = GaussianMixture(n_components=4)
>>> _ = gmm.fit(XY)
>>> paramgen = MLPG(gmm, windows=windows)
>>> generated = paramgen.transform(src)
>>> assert generated.shape == (T, static_dim)