nnmnkwii.paramgen.mlpg

nnmnkwii.paramgen.mlpg(mean_frames, variance_frames, windows)

Maximum Likelihood Parameter Generation (MLPG)

Function f: (T, D) -> (T, static_dim), where D = static_dim * len(windows).

It performs the Maximum Likelihood Parameter Generation (MLPG) algorithm to generate static features from static + dynamic features over time frames, dimension by dimension.

Let \(\mu\) (T x 1) be the input mean sequence of a particular dimension and \(y\) (T x 1) be the static feature sequence we want to compute. The MLPG formula is then written as:

\[y = A^{-1} b\]

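where

\[A = \sum_{l} W_{l}^{T} P_{l} W_{l}\]

\[b = \sum_{l} W_{l}^{T} P_{l} \mu_{l}\]

\(W_{l}\) is the l-th window matrix (T x T), \(\mu_{l}\) (T x 1) is the mean sequence of the l-th window's features (static, delta, ...), and \(P_{l}\) (T x T) is the corresponding precision matrix, given by the inverse of the (diagonal) variance matrix.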
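To make the linear system concrete, here is a minimal dense NumPy sketch for a single feature dimension. It is not the library's implementation (which relies on banded matrices); `window_matrix` and `mlpg_dense_1d` are hypothetical helper names, and the sketch assumes that window rows are simply truncated at the sequence edges and that the right-hand side accumulates \(W_{l}^{T} P_{l} \mu_{l}\) over all windows.

```python
import numpy as np


def window_matrix(l, u, coeff, T):
    """Dense (T x T) matrix for one window (hypothetical helper).

    Row t holds the window coefficients centred on frame t; coefficients
    that would fall outside [0, T) are dropped (an assumption of this sketch).
    """
    W = np.zeros((T, T))
    for t in range(T):
        for k, c in zip(range(-l, u + 1), coeff):
            if 0 <= t + k < T:
                W[t, t + k] = c
    return W


def mlpg_dense_1d(means, variances, windows):
    """Solve A y = b for one static dimension.

    means, variances: arrays of shape (T, num_windows), one column per window.
    """
    T = means.shape[0]
    A = np.zeros((T, T))
    b = np.zeros(T)
    for i, (l, u, coeff) in enumerate(windows):
        W = window_matrix(l, u, coeff, T)
        P = np.diag(1.0 / variances[:, i])  # diagonal precision matrix
        A += W.T @ P @ W
        b += W.T @ P @ means[:, i]
    return np.linalg.solve(A, b)


windows = [
    (0, 0, np.array([1.0])),             # static
    (1, 1, np.array([-0.5, 0.0, 0.5])),  # delta
    (1, 1, np.array([1.0, -2.0, 1.0])),  # delta-delta
]
T = 6
ramp = np.arange(T, dtype=float)
# Means that are exactly consistent with a ramp trajectory.
means = np.stack([window_matrix(l, u, c, T) @ ramp for l, u, c in windows], axis=1)
variances = np.ones_like(means)
y = mlpg_dense_1d(means, variances, windows)
```

When the static and dynamic means are mutually consistent, as in this toy input, the solution recovers the underlying trajectory. A dense solve costs O(T^3), which is why a banded solver is preferable for real sequences.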

The implementation is heavily inspired by [1] and uses bandmat for efficient computation.

[1] M. Shannon, supervised by W. Byrne (2014). Probabilistic acoustic modelling for parametric speech synthesis. PhD thesis, University of Cambridge, UK.

Parameters
  • mean_frames (2darray) – The input features (static + delta). In statistical speech synthesis, these are means of Gaussian distributions predicted by neural networks or decision trees.

  • variance_frames (2d or 1darray) – Variances (static + delta) of Gaussian distributions over time frames (2d) or global variances (1d). If global variances are given, they are expanded over frames.

  • windows (list) – A sequence of (l, u, win_coeff) triples, where l and u are non-negative integers specifying the left and right extents of the window and win_coeff is an array specifying the window coefficients.
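The parameter conventions above can be sketched as follows. This is only an illustration: `expand_variances` and `validate_windows` are hypothetical names, not part of the documented API, and the check `len(win_coeff) == l + u + 1` is an assumption inferred from the example windows rather than something stated on this page.

```python
import numpy as np


def expand_variances(variance_frames, T):
    """Repeat 1d global variances over T frames; pass 2d input through."""
    variance_frames = np.asarray(variance_frames)
    if variance_frames.ndim == 1:
        return np.tile(variance_frames, (T, 1))
    return variance_frames


def validate_windows(windows):
    """Check each (l, u, win_coeff) triple against the assumed convention."""
    for l, u, win_coeff in windows:
        if l < 0 or u < 0:
            raise ValueError("l and u must be non-negative")
        if len(win_coeff) != l + u + 1:
            raise ValueError("win_coeff must have l + u + 1 entries")


windows = [
    (0, 0, np.array([1.0])),             # static
    (1, 1, np.array([-0.5, 0.0, 0.5])),  # delta
]
validate_windows(windows)

T, static_dim = 10, 24
global_variances = np.random.rand(static_dim * len(windows))
variance_frames = expand_variances(global_variances, T)
```

After expansion, `variance_frames` has the same (T, D) shape as `mean_frames`, so both arrays can be indexed frame by frame in the same way.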

Returns

Generated static features over time

Examples

>>> import numpy as np
>>> from nnmnkwii import paramgen as G
>>> windows = [
...         (0, 0, np.array([1.0])),            # static
...         (1, 1, np.array([-0.5, 0.0, 0.5])), # delta
...         (1, 1, np.array([1.0, -2.0, 1.0])), # delta-delta
...     ]
>>> T, static_dim = 10, 24
>>> mean_frames = np.random.rand(T, static_dim * len(windows))
>>> variance_frames = np.random.rand(T, static_dim * len(windows))
>>> static_features = G.mlpg(mean_frames, variance_frames, windows)
>>> assert static_features.shape == (T, static_dim)