ttslearn.wavenet

A module for WaveNet-based speech synthesis.

TTS

The TTS functionality is accessible from ttslearn.wavenet.*

class ttslearn.wavenet.tts.WaveNetTTS(model_dir=None, device='cpu')

WaveNet-based text-to-speech

Parameters

model_dir (str) – model directory. A pre-trained model (ID: wavenettts) is used if None.
device (str) – cpu or cuda.

Examples

>>> from ttslearn.wavenet import WaveNetTTS
>>> engine = WaveNetTTS()
>>> wav, sr = engine.tts("今日もいい天気ですね。")

set_device(device)

Set device for the TTS models

Parameters: device (str) – cpu or cuda.

tts(text, tqdm=<class 'tqdm.std.tqdm'>)

Run TTS

Parameters

text (str) – Input text
tqdm (object, optional) – tqdm object used to display progress. Defaults to tqdm.tqdm, as in the signature above.

Returns

audio array (np.int16) and sampling rate (int)

Return type

tuple
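
Because tts returns 16-bit integer samples together with the sampling rate, the result can be written to a WAV file with the standard library alone. The following is a minimal sketch, not part of ttslearn: write_int16_wav is a hypothetical helper, and it assumes mono audio and a little-endian host. It accepts any object with a tobytes() method, such as a NumPy int16 array; a synthetic tone in array.array stands in for the output of tts, and the 16000 Hz rate is only a placeholder.

```python
import array
import io
import math
import wave


def write_int16_wav(samples, sr, fileobj):
    """Write mono 16-bit PCM samples to a path or file-like object.

    Hypothetical helper (not part of ttslearn). `samples` must expose
    tobytes() and hold native-endian int16 values; a little-endian host
    is assumed, since WAV stores little-endian PCM.
    """
    with wave.open(fileobj, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes = 16 bits per sample
        f.setframerate(sr)
        f.writeframes(samples.tobytes())


# Stand-in for `wav, sr = engine.tts(...)`: 0.1 s of a 440 Hz tone.
sr = 16000  # placeholder sampling rate
wav = array.array(
    "h", (int(10000 * math.sin(2 * math.pi * 440 * n / sr)) for n in range(sr // 10))
)

buf = io.BytesIO()
write_int16_wav(wav, sr, buf)
```

Passing a file name instead of the BytesIO buffer writes the audio to disk.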

Upsampling networks

Repeat upsampling

class ttslearn.wavenet.upsample.RepeatUpsampling(upsample_scales)

Repeat upsampling

Parameters: upsample_scales (list) – list of scales to upsample

forward(c)

Forward step

Parameters: c (torch.Tensor) – input features
Returns: upsampled features
Return type: torch.Tensor
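
The idea behind repeat upsampling can be illustrated without PyTorch. The sketch below assumes that the total upsampling factor is the product of upsample_scales and that each frame is simply repeated that many times along the time axis; repeat_upsample is a hypothetical helper operating on plain lists, not the class above.

```python
from math import prod


def repeat_upsample(frames, upsample_scales):
    """Repeat every frame prod(upsample_scales) times along the time axis.

    Hypothetical pure-Python illustration; `frames` is a list of
    per-frame feature values (or vectors).
    """
    factor = prod(upsample_scales)
    return [frame for frame in frames for _ in range(factor)]


frames = [0.1, 0.5, 0.9]
upsampled = repeat_upsample(frames, [2, 3])  # 3 frames -> 18 time steps
```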

Nearest neighbor upsampling

class ttslearn.wavenet.upsample.UpsampleNetwork(upsample_scales)

Upsample by nearest neighbor

Parameters: upsample_scales (list) – list of scales to upsample

forward(c)

Forward step

Parameters: c (torch.Tensor) – input features
Returns: upsampled features
Return type: torch.Tensor
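
A rough sketch of nearest-neighbor upsampling applied in stages, one stage per entry of upsample_scales. It only demonstrates how the time axis grows; any smoothing or learned layers the class may apply between stages are left out, and both function names are hypothetical.

```python
from math import prod


def nearest_upsample(frames, scale):
    """Nearest-neighbor upsampling of a 1-D sequence by an integer scale."""
    return [frames[i // scale] for i in range(len(frames) * scale)]


def staged_upsample(frames, upsample_scales):
    """Apply nearest-neighbor upsampling once per entry of upsample_scales."""
    for scale in upsample_scales:
        frames = nearest_upsample(frames, scale)
    return frames


frames = [1.0, 2.0, 3.0]
staged = staged_upsample(frames, [2, 4])
single = nearest_upsample(frames, prod([2, 4]))
```

Without extra layers between stages, the staged result equals a single upsampling by the product of the scales.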

Conv1d + nearest neighbor upsampling

class ttslearn.wavenet.upsample.ConvInUpsampleNetwork(upsample_scales, cin_channels, aux_context_window)

Conv1d + UpsampleNetwork

Parameters

upsample_scales (list) – list of scales to upsample
cin_channels (int) – number of input channels
aux_context_window (int) – size of the auxiliary context window

forward(c)

Forward step

Parameters: c (torch.Tensor) – input features
Returns: upsampled features
Return type: torch.Tensor
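
The relationship between the number of input frames and the number of output time steps can be estimated as follows. This is a sketch under assumptions that this page does not confirm: the input Conv1d is taken to be unpadded with kernel size 2 * aux_context_window + 1, and the upsampling factor is taken to be the product of upsample_scales. conditioning_length is a hypothetical helper.

```python
from math import prod


def conditioning_length(num_frames, upsample_scales, aux_context_window):
    """Time steps produced from `num_frames` conditioning frames.

    Assumes (not confirmed by this page) an unpadded Conv1d with kernel
    size 2 * aux_context_window + 1 followed by upsampling by
    prod(upsample_scales).
    """
    kernel_size = 2 * aux_context_window + 1
    frames_after_conv = num_frames - (kernel_size - 1)
    if frames_after_conv <= 0:
        raise ValueError("input is shorter than the context window")
    return frames_after_conv * prod(upsample_scales)


steps = conditioning_length(num_frames=104, upsample_scales=[10, 8], aux_context_window=2)
```

Under these assumptions, 104 frames lose 4 frames to the context window and the remaining 100 frames are stretched by a factor of 80.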

Convolution block

class ttslearn.wavenet.modules.ResSkipBlock(residual_channels, gate_channels, kernel_size, skip_out_channels, dilation=1, cin_channels=80, *args, **kwargs)

Convolution block with residual and skip connections.

Parameters

residual_channels (int) – Residual connection channels.
gate_channels (int) – Gated activation channels.
kernel_size (int) – Kernel size of convolution layers.
skip_out_channels (int) – Skip connection channels.
dilation (int) – Dilation factor.
cin_channels (int) – Local conditioning channels.
args (list) – Additional arguments for Conv1d.
kwargs (dict) – Additional arguments for Conv1d.

forward(x, c)

Forward step

Parameters

x (torch.Tensor) – Input signal.
c (torch.Tensor) – Local conditioning signal.

Returns

Tuple of output signal and skip connection signal

Return type

tuple
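
For orientation, the gated activation commonly used in WaveNet-style blocks is tanh(a) * sigmoid(b), where a and b are the two halves of the gate channels. The sketch below shows it for a single time step with plain Python lists. Whether ResSkipBlock splits gate_channels in exactly this way is an assumption, and gated_activation is a hypothetical name.

```python
import math


def gated_activation(gate_vector):
    """Apply tanh(a) * sigmoid(b) to the two halves of one gate vector.

    Hypothetical single-time-step illustration; `gate_vector` has
    gate_channels entries and the result has gate_channels // 2.
    """
    half = len(gate_vector) // 2
    a, b = gate_vector[:half], gate_vector[half:]
    # sigmoid(b) = 1 / (1 + exp(-b))
    return [math.tanh(ai) / (1.0 + math.exp(-bi)) for ai, bi in zip(a, b)]


out = gated_activation([0.0, 2.0, 0.0, 0.0])
```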

incremental_forward(x, c)

Incremental forward

Parameters

x (torch.Tensor) – Input signal.
c (torch.Tensor) – Local conditioning signal.

Returns

Tuple of output signal and skip connection signal

Return type

tuple

clear_buffer()

Clear the input buffer.

WaveNet

class ttslearn.wavenet.wavenet.WaveNet(out_channels=256, layers=30, stacks=3, residual_channels=64, gate_channels=128, skip_out_channels=64, kernel_size=2, cin_channels=80, upsample_scales=None, aux_context_window=0)

Parameters

out_channels (int) – the number of output channels
layers (int) – the number of layers
stacks (int) – the number of residual stacks
residual_channels (int) – the number of residual channels
gate_channels (int) – the number of channels for the gating function
skip_out_channels (int) – the number of channels in the skip output
kernel_size (int) – the size of the convolutional kernel
cin_channels (int) – the number of input channels for local conditioning
upsample_scales (list) – the list of scales to upsample the local conditioning features
aux_context_window (int) – the number of context frames

forward(x, c)

Forward step

Parameters

x (torch.Tensor) – the input waveform
c (torch.Tensor) – the local conditioning feature

Returns

the output waveform

Return type

torch.Tensor

inference(c, num_time_steps=100, tqdm=<function WaveNet.<lambda>>)

Inference step

Parameters

c (torch.Tensor) – the local conditioning feature
num_time_steps (int) – the number of time steps to generate
tqdm (lambda) – a tqdm function to track progress

Returns

the output waveform

Return type

torch.Tensor
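
Inference in an autoregressive model generates one sample at a time and feeds each result back as input for the next step. The loop below is a minimal sketch of that pattern, not the method's implementation: predict_next is a hypothetical stand-in for one network step, and progress plays the role of the tqdm argument, assumed here to be a pass-through wrapper by default.

```python
def generate(predict_next, num_time_steps, initial=0, progress=lambda x: x):
    """Autoregressive loop: each new sample depends on the previous ones.

    `predict_next` is a hypothetical stand-in for one network step; it
    receives the samples generated so far and returns the next one.
    `progress` wraps the step iterator, so tqdm.tqdm can be passed in.
    """
    samples = [initial]
    for _ in progress(range(num_time_steps)):
        samples.append(predict_next(samples))
    return samples[1:]  # drop the initial seed


# Toy predictor: next value is the previous value plus one, modulo 256.
waveform = generate(lambda s: (s[-1] + 1) % 256, num_time_steps=5)
```

Because every step waits for the previous one, generation time grows linearly with num_time_steps, which is why a progress wrapper is useful here.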

clear_buffer()

Clear the internal buffer.

remove_weight_norm_()

Remove weight normalization from the model.

Generation utility

gen_waveform

Generate a waveform from a WaveNet model.

Utility

receptive_field_size

Compute the receptive field size of WaveNet.
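
For stacked dilated convolutions, the receptive field is (kernel_size - 1) times the sum of all dilation factors, plus one. The sketch below applies that formula under the assumption that the dilation doubles with each layer (1, 2, 4, ...) and resets at the start of every stack. receptive_field is a hypothetical stand-in written for illustration; it is not the signature of the utility listed above.

```python
def receptive_field(layers, stacks, kernel_size):
    """Receptive field of stacked dilated convolutions, in samples.

    Assumes dilation doubles per layer (1, 2, 4, ...) and resets at the
    start of each stack; this is a sketch, not ttslearn's function.
    """
    if layers % stacks != 0:
        raise ValueError("layers must be divisible by stacks")
    layers_per_stack = layers // stacks
    dilations = [2 ** (i % layers_per_stack) for i in range(layers)]
    return (kernel_size - 1) * sum(dilations) + 1


# Default WaveNet arguments from this page: layers=30, stacks=3, kernel_size=2.
size = receptive_field(layers=30, stacks=3, kernel_size=2)  # 3070
```

With the defaults, each stack of 10 layers contributes dilations summing to 1023, so three stacks give 3 * 1023 + 1 = 3070 samples.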