Autograd
Differentiable functions for PyTorch. This may be extended to support other autograd frameworks.
Currently none of the functions have CUDA implementations; this should be addressed later.
Functional interface
mlpg(mean_frames, variance_frames, windows) | Maximum Likelihood Parameter Generation (MLPG).
modspec(y[, n, norm]) | Modulation spectrum computation.
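A minimal usage sketch of the functional interface above. The window tuple format, tensor shapes, and use of requires_grad below are illustrative assumptions, not a definitive recipe:

import numpy as np
import torch
from nnmnkwii import autograd as AF

T, static_dim = 10, 2                     # frames and static dimension (assumed)
# Hypothetical static + delta window set; the tuple format is an assumption.
windows = [
    (0, 0, np.array([1.0])),
    (1, 1, np.array([-0.5, 0.0, 0.5])),
]
D = static_dim * len(windows)

mean_frames = torch.rand(T, D, requires_grad=True)
variance_frames = torch.ones(T, D)

# MLPG: (T, D) -> (T, static_dim)
y = AF.mlpg(mean_frames, variance_frames, windows)

# Modulation spectrum: (T, static_dim) -> (n//2 + 1, static_dim)
ms = AF.modspec(y, n=256)

# Both operations are differentiable, so gradients flow back to mean_frames.
ms.sum().backward()
print(mean_frames.grad.shape)  # (T, D)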
Function classes
class nnmnkwii.autograd.MLPG(static_dim, variance_frames, windows)
MLPG as an autograd function f : (T, D) -> (T, static_dim).
This is meant to be used for Minimum Generation Error (MGE) training for speech synthesis and voice conversion. See [R1] for details.
[R1] Wu, Zhizheng, and Simon King. “Minimum trajectory error training for deep neural networks, combined with stacked bottleneck features.” INTERSPEECH. 2015.
Let \(d\) be the index of static features and \(l\) be the index of windows. The gradients \(g_{d,l}\) can be computed by:
\[g_{d,l} = (\sum_{l} W_{l}^{T}P_{d,l}W_{l})^{-1} W_{l}^{T}P_{d,l}\]
where \(W_{l}\) is a banded window matrix and \(P_{d,l}\) is a diagonal precision matrix.
Assuming the variances are diagonal, MLPG can be performed efficiently dimension-by-dimension.
Let \(o_{d}\) be the \(T\)-dimensional back-propagated gradients; the resulting gradients \(g'_{d,l}\) to be propagated are computed as follows:
\[g'_{d,l} = o_{d}^{T} g_{d,l}\]
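A small NumPy sketch of this per-dimension computation for a single static dimension \(d\). The window coefficients, precision values, and the helper banded_window_matrix() are illustrative assumptions, not the library's internal implementation:

import numpy as np

T = 5
# Hypothetical static and delta window coefficients.
window_coefs = [np.array([1.0]), np.array([-0.5, 0.0, 0.5])]

def banded_window_matrix(coefs, T):
    # Build the (T, T) banded matrix W_l from 1-D window coefficients.
    W = np.zeros((T, T))
    half = len(coefs) // 2
    for t in range(T):
        for k, c in enumerate(coefs):
            tau = t + k - half
            if 0 <= tau < T:
                W[t, tau] = c
    return W

rng = np.random.default_rng(0)
Ws = [banded_window_matrix(c, T) for c in window_coefs]
Ps = [np.diag(rng.uniform(0.5, 1.5, T)) for _ in window_coefs]  # P_{d,l}

# R = sum_l W_l^T P_{d,l} W_l
R = sum(W.T @ P @ W for W, P in zip(Ws, Ps))
# g_{d,l} = R^{-1} W_l^T P_{d,l}
gs = [np.linalg.solve(R, W.T @ P) for W, P in zip(Ws, Ps)]

# Back-propagated gradients o_d from upstream, then g'_{d,l} = o_d^T g_{d,l}
o = rng.standard_normal(T)
g_primes = [o @ g for g in gs]
print([g.shape for g in g_primes])  # each is (T,)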
static_dim
int – number of static dimensions
variance_frames
torch.FloatTensor – Variances, same as in nnmnkwii.functions.mlpg().
windows
list – same as in nnmnkwii.functions.mlpg().
Todo
CUDA implementation
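A minimal sketch of using the class in an MGE-style objective, following the constructor signature documented above. The shapes, window format, MSE loss, and the instance-call style (which may differ across nnmnkwii/PyTorch versions; the functional mlpg() above is an alternative) are assumptions:

import numpy as np
import torch
from nnmnkwii.autograd import MLPG

T, static_dim = 10, 2
# Hypothetical static + delta windows (format assumed).
windows = [
    (0, 0, np.array([1.0])),
    (1, 1, np.array([-0.5, 0.0, 0.5])),
]
D = static_dim * len(windows)

# Predicted mean trajectories (e.g. acoustic model outputs) and fixed variances.
means = torch.rand(T, D, requires_grad=True)
variances = torch.ones(T, D)

# Generate static parameter trajectories: (T, D) -> (T, static_dim)
y_hat = MLPG(static_dim, variances, windows)(means)

# MGE-style loss against reference static features.
y_ref = torch.rand(T, static_dim)
loss = ((y_hat - y_ref) ** 2).mean()
loss.backward()                  # gradients reach `means` via g'_{d,l}
print(means.grad.shape)          # (T, D)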
class nnmnkwii.autograd.ModSpec(n=2048, norm=None)
Modulation spectrum computation f : (T, D) -> (N//2+1, D).
n
int – DFT length.
norm
bool – Normalize DFT output or not. See numpy.fft.fft.
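A minimal sketch of using ModSpec, again assuming the instance-call style; the shapes and DFT length below are illustrative:

import torch
from nnmnkwii.autograd import ModSpec

T, D = 100, 24
y = torch.rand(T, D, requires_grad=True)   # e.g. predicted parameter trajectories

# DFT length n determines the output shape (n//2 + 1, D).
ms = ModSpec(n=128, norm=None)(y)
print(ms.shape)      # (65, 24)

# The computation is differentiable, so it can enter a training objective
# (e.g. modulation-spectrum-constrained training).
ms.sum().backward()
print(y.grad.shape)  # (T, D)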