nnmnkwii.functions.mlpg
nnmnkwii.functions.mlpg(mean_frames, variance_frames, windows)
Maximum Likelihood Parameter Generation (MLPG)
Function f: (T, D) -> (T, static_dim), where D = static_dim * len(windows).
It performs the Maximum Likelihood Parameter Generation (MLPG) algorithm to generate static features from static + dynamic features over time frames. The implementation was heavily inspired by [R5] and uses bandmat for efficient computation.
[R5] M. Shannon, supervised by W. Byrne (2014), Probabilistic acoustic modelling for parametric speech synthesis, PhD thesis, University of Cambridge, UK.
Parameters:
- mean_frames (2darray) – The input features (static + delta). In statistical speech synthesis, these are the means of Gaussian distributions predicted by neural networks or decision trees.
- variance_frames (2d or 1darray) – Variances (static + delta) of Gaussian distributions over time frames (2d) or global variances (1d). If global variances are given, they are expanded over frames.
- windows (list) – A sequence of (l, u, win_coeff) triples, where l and u are non-negative integers specifying the left and right extents of the window and win_coeff is an array specifying the window coefficients.
Returns: Generated static features over time
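To make the (l, u, win_coeff) triples and the "static + delta" layout concrete, the following sketch builds such features from a static trajectory. The helper name apply_windows is hypothetical and not part of nnmnkwii. The sketch assumes that frames outside the sequence count as zero and that the features for each window are stacked as consecutive blocks of width static_dim.

```python
import numpy as np


def apply_windows(static, windows):
    """Hypothetical helper: stack one feature block per window.

    static: (T, static_dim) array. Returns (T, static_dim * len(windows)).
    """
    T, static_dim = static.shape
    blocks = []
    for l, u, win_coeff in windows:
        # A window spans l frames to the left and u frames to the right.
        assert len(win_coeff) == l + u + 1
        # Assumption: frames outside the sequence are treated as zeros.
        padded = np.pad(static, ((l, u), (0, 0)), mode="constant")
        block = np.zeros((T, static_dim))
        for k, c in enumerate(win_coeff):
            # Coefficient k applies to the frame at offset (k - l).
            block += c * padded[k:k + T]
        blocks.append(block)
    return np.hstack(blocks)


windows = [
    (0, 0, np.array([1.0])),             # static
    (1, 1, np.array([-0.5, 0.0, 0.5])),  # delta
    (1, 1, np.array([1.0, -2.0, 1.0])),  # delta-delta
]
# A linear ramp: its interior delta is 1 and its interior delta-delta is 0.
static = np.arange(10, dtype=float)[:, None] * np.ones((1, 2))
features = apply_windows(static, windows)
```

With these windows, the delta at frame t is 0.5 * (x[t+1] - x[t-1]) and the delta-delta is x[t-1] - 2 * x[t] + x[t+1].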
Examples
>>> import numpy as np
>>> from nnmnkwii import functions as F
>>> windows = [
...     (0, 0, np.array([1.0])),             # static
...     (1, 1, np.array([-0.5, 0.0, 0.5])),  # delta
...     (1, 1, np.array([1.0, -2.0, 1.0])),  # delta-delta
... ]
>>> T, static_dim = 10, 24
>>> mean_frames = np.random.rand(T, static_dim * len(windows))
>>> variance_frames = np.random.rand(T, static_dim * len(windows))
>>> static_features = F.mlpg(mean_frames, variance_frames, windows)
>>> assert static_features.shape == (T, static_dim)
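For intuition about what the solver computes, here is a minimal dense NumPy sketch of MLPG. It is not the library's implementation, which uses bandmat banded matrices. The names window_matrix and mlpg_dense are hypothetical, and the block-wise feature layout and zero treatment of out-of-range frames are assumptions. For each static dimension it solves the normal equations (sum_i W_i^T P_i W_i) c = sum_i W_i^T P_i mu_i, where W_i is the matrix form of window i and P_i is the diagonal matrix of precisions (inverse variances).

```python
import numpy as np


def window_matrix(l, u, win_coeff, T):
    """Dense (T, T) matrix that applies one window to a length-T trajectory."""
    W = np.zeros((T, T))
    for t in range(T):
        for k, c in enumerate(win_coeff):
            s = t - l + k  # source frame index for coefficient k
            if 0 <= s < T:  # assumption: out-of-range frames contribute nothing
                W[t, s] = c
    return W


def mlpg_dense(mean_frames, variance_frames, windows):
    """Hypothetical dense MLPG sketch: (T, D) -> (T, static_dim)."""
    T, D = mean_frames.shape
    if variance_frames.ndim == 1:
        # Expand global variances over frames.
        variance_frames = np.tile(variance_frames, (T, 1))
    static_dim = D // len(windows)
    win_mats = [window_matrix(l, u, w, T) for l, u, w in windows]
    out = np.zeros((T, static_dim))
    for d in range(static_dim):
        P = np.zeros((T, T))
        b = np.zeros(T)
        for i, W in enumerate(win_mats):
            col = i * static_dim + d  # assumed block-wise feature layout
            prec = 1.0 / variance_frames[:, col]
            P += W.T @ (prec[:, None] * W)
            b += W.T @ (prec * mean_frames[:, col])
        out[:, d] = np.linalg.solve(P, b)
    return out
```

The dense solve costs O(T^3) per dimension. Because each W_i is banded, the same system can be solved in time linear in T with a banded solver, which is presumably why the library relies on bandmat.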
See also