Autograd

Differentiable functions for PyTorch. This may be extended to support other autograd frameworks.

Functional interface

mlpg(means, variances, windows)

Maximum Likelihood Parameter Generation (MLPG).

unit_variance_mlpg(R, means)

Special case of MLPG assuming data is normalized to have unit variance.

modspec(y[, n, norm])

Modulation spectrum computation.
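
A minimal usage sketch of the functional interface is given below. The window format (start offset, end offset, coefficient array) follows the convention used elsewhere in nnmnkwii.paramgen and is an assumption here, as is passing n by keyword to modspec():

    import numpy as np
    import torch
    from nnmnkwii.autograd import mlpg, modspec

    # HTS-style windows: (left offset, right offset, coefficients) -- assumed format
    windows = [
        (0, 0, np.array([1.0])),             # static
        (1, 1, np.array([-0.5, 0.0, 0.5])),  # delta
    ]
    T, static_dim = 10, 2
    D = static_dim * len(windows)

    means = torch.rand(T, D, requires_grad=True)   # (T, D)
    variances = torch.ones(T, D)                   # (T, D)

    y = mlpg(means, variances, windows)            # (T, static_dim)
    ms = modspec(y, n=64)                          # (64 // 2 + 1, static_dim)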

Function classes

class nnmnkwii.autograd.MLPG[source]

Generic MLPG as an autograd function.

f : (T, D) -> (T, static_dim).

This is meant to be used for Minimum Generation Error (MGE) training for speech synthesis and voice conversion. See [1] and [2] for details.

It relies on nnmnkwii.paramgen.mlpg() and nnmnkwii.paramgen.mlpg_grad() for forward and backward computation, respectively.

[1] Wu, Zhizheng, and Simon King. “Minimum trajectory error training for deep neural networks, combined with stacked bottleneck features.” INTERSPEECH, 2015.

[2] Xie, Feng-Long, et al. “Sequence error (SE) minimization training of neural network for voice conversion.” Fifteenth Annual Conference of the International Speech Communication Association (INTERSPEECH), 2014.

Parameters

Warning

The function is generic but cannot run on CUDA. For faster differentiable MLPG, see UnitVarianceMLPG.
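
A minimal MGE-style sketch, assuming the predicted means come from a network (a random tensor stands in for the network output here) and that the windows follow the nnmnkwii.paramgen convention. The generated trajectory is compared against target static features so the error backpropagates through the parameter generation step:

    import numpy as np
    import torch
    import torch.nn.functional as F
    from nnmnkwii.autograd import mlpg

    windows = [
        (0, 0, np.array([1.0])),
        (1, 1, np.array([-0.5, 0.0, 0.5])),
    ]
    T, static_dim = 10, 2
    D = static_dim * len(windows)

    predicted_means = torch.rand(T, D, requires_grad=True)  # stand-in for a network output
    variances = torch.ones(T, D)
    target = torch.rand(T, static_dim)                      # stand-in for ground-truth statics

    y_hat = mlpg(predicted_means, variances, windows)       # (T, static_dim)
    loss = F.mse_loss(y_hat, target)
    loss.backward()                                         # gradients flow back to predicted_means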

static backward(ctx, grad_output)[source]

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, means, variances, windows)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.
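
The forward()/backward() contract described above is the standard torch.autograd.Function interface. A minimal generic sketch (a toy scaling function, not the MLPG implementation) showing how the two methods fit together:

    import torch

    class Scale(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x, alpha):
            # Non-tensor state can be stored on ctx for the backward pass.
            ctx.alpha = alpha
            return x * alpha

        @staticmethod
        def backward(ctx, grad_output):
            # One return value per forward() input; alpha is not a tensor,
            # so its gradient is None.
            return grad_output * ctx.alpha, None

    x = torch.ones(3, requires_grad=True)
    y = Scale.apply(x, 2.0)
    y.sum().backward()
    # x.grad is now tensor([2., 2., 2.])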

class nnmnkwii.autograd.UnitVarianceMLPG[source]

Special case of MLPG assuming data is normalized to have unit variance.

f : (T, D) -> (T, static_dim) or f : (T*num_windows, static_dim) -> (T, static_dim).

The function is theoretically a special case of MLPG; it assumes the input data is normalized to have unit variance in each dimension. This unit-variance property greatly simplifies the backward computation of MLPG.

Let \(\mu\) be the input mean sequence of shape (num_windows*T x static_dim) and \(W\) be the window matrix of shape (num_windows*T x T); MLPG can then be written as:

\[y = R \mu\]

where

\[R = (W^{T} W)^{-1} W^{T}\]

The matrix R can be computed by nnmnkwii.paramgen.unit_variance_mlpg_matrix().

Parameters

R – Unit-variance MLPG matrix of shape (T x num_windows*T). This should be created with nnmnkwii.paramgen.unit_variance_mlpg_matrix().
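
A minimal sketch, assuming unit_variance_mlpg_matrix(windows, T) returns the R matrix as a numpy array (hence the torch.from_numpy conversion) and that the windows follow the nnmnkwii.paramgen convention:

    import numpy as np
    import torch
    from nnmnkwii.paramgen import unit_variance_mlpg_matrix
    from nnmnkwii.autograd import unit_variance_mlpg

    windows = [
        (0, 0, np.array([1.0])),
        (1, 1, np.array([-0.5, 0.0, 0.5])),
    ]
    T, static_dim = 10, 2

    # Defensive dtype cast so R and means share the same precision.
    R = torch.from_numpy(unit_variance_mlpg_matrix(windows, T).astype(np.float32))  # (T, num_windows*T)
    means = torch.rand(T, static_dim * len(windows), requires_grad=True)            # (T, D)

    y = unit_variance_mlpg(R, means)  # (T, static_dim)
    y.sum().backward()                # gradients flow back to means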

static backward(ctx, grad_output)[source]

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, means, R)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.

class nnmnkwii.autograd.ModSpec[source]

Modulation spectrum computation f : (T, D) -> (N//2+1, D).
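
A minimal sketch via the functional interface, assuming n is the DFT length (so the output has n // 2 + 1 rows) and norm is the FFT normalization option:

    import torch
    from nnmnkwii.autograd import modspec

    T, D = 16, 2
    y = torch.rand(T, D, requires_grad=True)

    ms = modspec(y, n=32)   # (32 // 2 + 1, D) == (17, 2)
    ms.sum().backward()     # gradients flow back to y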

Parameters

static backward(ctx, grad_output)[source]

Defines a formula for differentiating the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many outputs as forward() returned, and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs its gradient computed w.r.t. the output.

static forward(ctx, y, n, norm)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store tensors that can be then retrieved during the backward pass.