ttslearn.wavenet

A module for WaveNet speech synthesis.

TTS

The TTS functionality is accessible from ttslearn.wavenet.*

class ttslearn.wavenet.tts.WaveNetTTS(model_dir=None, device='cpu')

WaveNet-based text-to-speech

Parameters
  • model_dir (str) – model directory. A pre-trained model (ID: wavenettts) is used if None.

  • device (str) – cpu or cuda.

Examples

>>> from ttslearn.wavenet import WaveNetTTS
>>> engine = WaveNetTTS()
>>> wav, sr = engine.tts("今日もいい天気ですね。")

set_device(device)

Set device for the TTS models

Parameters

device (str) – cpu or cuda.

tts(text, tqdm=<class 'tqdm.std.tqdm'>)

Run TTS

Parameters
  • text (str) – Input text

  • tqdm (object, optional) – tqdm-compatible progress bar class or callable. Defaults to tqdm.tqdm, as shown in the signature.

Returns

audio array (np.int16) and sampling rate (int)

Return type

tuple
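The returned pair can be written to disk with any WAV writer. The following is a minimal sketch using only the Python standard library. The sine wave stands in for the samples returned by tts(), and the sampling rate of 16000 and the helper name save_int16_wav are assumptions made for illustration; neither comes from ttslearn.

```python
import io
import math
import struct
import wave


def save_int16_wav(fileobj, samples, sr):
    """Write mono 16-bit PCM samples to a path or file-like object."""
    # WAV stores PCM as little-endian signed 16-bit integers.
    frames = struct.pack("<%dh" % len(samples), *samples)
    with wave.open(fileobj, "wb") as f:
        f.setnchannels(1)    # mono
        f.setsampwidth(2)    # 2 bytes per sample = int16
        f.setframerate(sr)
        f.writeframes(frames)


# Stand-in for `wav, sr = engine.tts(...)`: 0.1 s of a 440 Hz tone.
sr = 16000  # assumed value, for illustration only
wav = [int(0.3 * 32767 * math.sin(2 * math.pi * 440 * n / sr))
       for n in range(sr // 10)]

buf = io.BytesIO()  # pass a file path such as "out.wav" to write to disk
save_int16_wav(buf, wav, sr)
```

Because the audio is already int16, no rescaling is needed before writing.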

Upsampling networks

Repeat upsampling

class ttslearn.wavenet.upsample.RepeatUpsampling(upsample_scales)

Repeat upsampling

Parameters

upsample_scales (list) – list of scales to upsample

forward(c)

Forward step

Parameters

c (torch.Tensor) – input features

Returns

upsampled features

Return type

torch.Tensor
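To illustrate the idea of repeat upsampling, here is a framework-free sketch on a plain Python list. It assumes that the overall factor is the product of upsample_scales and that each frame is simply repeated that many times along the time axis. The function name repeat_upsample is hypothetical, and the actual class operates on torch.Tensor inputs.

```python
import math


def repeat_upsample(frames, upsample_scales):
    """Repeat every frame prod(upsample_scales) times along the time axis."""
    factor = math.prod(upsample_scales)  # assumed: total factor is the product
    return [frame for frame in frames for _ in range(factor)]


# Three frames of a 1-dimensional feature, upsampled by 2 * 3 = 6.
c = [0.1, 0.5, 0.9]
up = repeat_upsample(c, [2, 3])
print(len(up))  # 18
print(up[:7])   # [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.5]
```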

Nearest neighbor upsampling

class ttslearn.wavenet.upsample.UpsampleNetwork(upsample_scales)

Upsample by nearest neighbor

Parameters

upsample_scales (list) – list of scales to upsample

forward(c)

Forward step

Parameters

c (torch.Tensor) – input features

Returns

upsampled features

Return type

torch.Tensor

Conv1d + nearest neighbor upsampling

class ttslearn.wavenet.upsample.ConvInUpsampleNetwork(upsample_scales, cin_channels, aux_context_window)

Conv1d + UpsampleNetwork

Parameters
  • upsample_scales (list) – list of scales to upsample

  • cin_channels (int) – number of input channels

  • aux_context_window (int) – size of the auxiliary context window

forward(c)

Forward step

Parameters

c (torch.Tensor) – input features

Returns

upsampled features

Return type

torch.Tensor
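The effect of aux_context_window on the output length can be sketched with simple arithmetic. This sketch rests on two assumptions that are not confirmed by the text above: the input Conv1d is unpadded with kernel size 2 * aux_context_window + 1, and the upsampling factor is the product of upsample_scales. The function name upsampled_length is hypothetical.

```python
import math


def upsampled_length(num_frames, upsample_scales, aux_context_window):
    """Output length under the stated assumptions (not read from ttslearn)."""
    # Assumed: an unpadded Conv1d with kernel size 2 * aux_context_window + 1
    # consumes aux_context_window frames on each side of the input.
    frames_after_conv = num_frames - 2 * aux_context_window
    if frames_after_conv <= 0:
        raise ValueError("not enough frames for the context window")
    # Assumed: the total upsampling factor is the product of the scales.
    return frames_after_conv * math.prod(upsample_scales)


print(upsampled_length(100, [10, 8], aux_context_window=2))  # 7680
print(upsampled_length(100, [10, 8], aux_context_window=0))  # 8000
```

If these assumptions hold, the conditioning features need aux_context_window extra frames on each side so that the upsampled features line up with the waveform.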

Convolution block

class ttslearn.wavenet.modules.ResSkipBlock(residual_channels, gate_channels, kernel_size, skip_out_channels, dilation=1, cin_channels=80, *args, **kwargs)

Convolution block with residual and skip connections.

Parameters
  • residual_channels (int) – Residual connection channels.

  • gate_channels (int) – Gated activation channels.

  • kernel_size (int) – Kernel size of convolution layers.

  • skip_out_channels (int) – Skip connection channels.

  • dilation (int) – Dilation factor.

  • cin_channels (int) – Local conditioning channels.

  • args (list) – Additional arguments for Conv1d.

  • kwargs (dict) – Additional arguments for Conv1d.

forward(x, c)

Forward step

Parameters
  • x (torch.Tensor) – Input signal.

  • c (torch.Tensor) – Local conditioning signal.

Returns

Tuple of output signal and skip connection signal

Return type

tuple
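The gated activation at the heart of a WaveNet block can be sketched for a single time step as follows. The sketch assumes the common convention that the gate_channels values are split in half along the channel axis, with one half passed through tanh and the other through a sigmoid. The function name gated_activation is hypothetical, and the conditioning, residual, and skip projections are left out.

```python
import math


def gated_activation(z):
    """Split z in half along the channel axis and return tanh(a) * sigmoid(b)."""
    half = len(z) // 2
    a, b = z[:half], z[half:]
    # sigmoid(b) = 1 / (1 + exp(-b)), so tanh(a) * sigmoid(b) is:
    return [math.tanh(ai) / (1.0 + math.exp(-bi)) for ai, bi in zip(a, b)]


# Four gate values for one time step produce two output values.
out = gated_activation([0.0, 2.0, 5.0, 0.0])
print(out)
```

Under this convention the defaults gate_channels=128 and residual_channels=64 are consistent: the gate halves the channel count before the output projections.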

incremental_forward(x, c)

Incremental forward

Parameters
  • x (torch.Tensor) – Input signal.

  • c (torch.Tensor) – Local conditioning signal.

Returns

Tuple of output signal and skip connection signal

Return type

tuple

clear_buffer()

Clear input buffer.
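The relationship between forward, incremental_forward, and clear_buffer can be illustrated with a toy single-channel causal convolution. This is a sketch of the general buffering technique, not the ttslearn implementation. The class IncrementalCausalConv, the helper full_forward, and the fixed kernel size of 2 are assumptions made for illustration.

```python
from collections import deque


class IncrementalCausalConv:
    """Single-channel causal convolution, kernel size 2, fed one sample at a time."""

    def __init__(self, w_past, w_now, dilation=1):
        self.w_past, self.w_now = w_past, w_now
        self.dilation = dilation
        self.clear_buffer()

    def clear_buffer(self):
        # Forget past inputs: the buffer holds the last `dilation` samples
        # and starts out as zeros, matching zero left padding.
        self.buffer = deque([0.0] * self.dilation, maxlen=self.dilation)

    def incremental_forward(self, x_t):
        x_past = self.buffer[0]      # sample from `dilation` steps ago
        self.buffer.append(x_t)      # the oldest sample drops out automatically
        return self.w_past * x_past + self.w_now * x_t


def full_forward(x, w_past, w_now, dilation=1):
    """Same convolution applied to a whole sequence at once."""
    padded = [0.0] * dilation + list(x)
    return [w_past * padded[t] + w_now * padded[t + dilation]
            for t in range(len(x))]


conv = IncrementalCausalConv(0.5, 1.0, dilation=2)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
step_by_step = [conv.incremental_forward(v) for v in x]
print(step_by_step == full_forward(x, 0.5, 1.0, dilation=2))  # True
```

Clearing the buffer before each new utterance keeps samples from the previous one from leaking into the first outputs.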

WaveNet

class ttslearn.wavenet.wavenet.WaveNet(out_channels=256, layers=30, stacks=3, residual_channels=64, gate_channels=128, skip_out_channels=64, kernel_size=2, cin_channels=80, upsample_scales=None, aux_context_window=0)

Parameters
  • out_channels (int) – the number of output channels

  • layers (int) – the number of layers

  • stacks (int) – the number of residual stacks

  • residual_channels (int) – the number of residual channels

  • gate_channels (int) – the number of channels for the gating function

  • skip_out_channels (int) – the number of channels in the skip output

  • kernel_size (int) – the size of the convolutional kernel

  • cin_channels (int) – the number of input channels for local conditioning

  • upsample_scales (list) – the list of scales to upsample the local conditioning features

  • aux_context_window (int) – the number of context frames
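How layers and stacks interact is easiest to see from the dilation schedule. The sketch below assumes the usual WaveNet arrangement, in which dilations double from layer to layer and reset at the start of each stack. This arrangement is assumed rather than taken from the ttslearn source, and the function name dilation_schedule is hypothetical.

```python
def dilation_schedule(layers=30, stacks=3):
    """Dilations that double within a stack and reset at each new stack."""
    if layers % stacks != 0:
        raise ValueError("layers must be divisible by stacks")
    layers_per_stack = layers // stacks
    return [2 ** (i % layers_per_stack) for i in range(layers)]


dilations = dilation_schedule()
print(dilations[:10])  # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
print(dilations[10])   # 1
```

With the defaults, each of the 3 stacks contains 10 layers, and each layer would correspond to one convolution block with the given dilation.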

forward(x, c)

Forward step

Parameters
  • x (torch.Tensor) – the input waveform

  • c (torch.Tensor) – the local conditioning feature

Returns

the output waveform

Return type

torch.Tensor

inference(c, num_time_steps=100, tqdm=<function WaveNet.<lambda>>)

Inference step

Parameters
  • c (torch.Tensor) – the local conditioning feature

  • num_time_steps (int) – the number of time steps to generate

  • tqdm (lambda) – a tqdm function to track progress

Returns

the output waveform

Return type

torch.Tensor

clear_buffer()

Clear the internal buffer.

remove_weight_norm_()

Remove weight normalization of the model

Generation utility

gen_waveform

Generate waveform from WaveNet.
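A typical last step in WaveNet waveform generation is converting predicted class indices back to audio samples. The sketch below assumes that the 256 output channels are mu-law classes with mu = 255. This is a common choice but is not stated in this reference, and the function name mulaw_expand is hypothetical.

```python
import math


def mulaw_expand(index, mu=255):
    """Map a class index in [0, mu] back to a sample in [-1, 1]."""
    y = 2.0 * index / mu - 1.0  # index -> companded value in [-1, 1]
    # Inverse mu-law: sign(y) * ((1 + mu) ** |y| - 1) / mu
    return math.copysign((math.pow(1.0 + mu, abs(y)) - 1.0) / mu, y)


print(mulaw_expand(0))    # -1.0
print(mulaw_expand(255))  # 1.0

# Scale to the int16 range, matching the dtype documented for tts().
sample_int16 = int(round(mulaw_expand(200) * 32767))
```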

Utility

receptive_field_size

Compute receptive field size of WaveNet
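For a stack of dilated convolutions, the receptive field is commonly computed as (kernel_size - 1) * sum(dilations) + 1. The sketch below applies that formula, assuming dilations that double within a stack and reset at each new stack. The function name receptive_field and its signature are hypothetical and may differ from the library utility.

```python
def receptive_field(layers=30, stacks=3, kernel_size=2):
    """Number of input samples (including the current one) seen by one output."""
    layers_per_stack = layers // stacks
    # Assumed schedule: 1, 2, 4, ... within each stack, restarting per stack.
    dilations = [2 ** (i % layers_per_stack) for i in range(layers)]
    return (kernel_size - 1) * sum(dilations) + 1


print(receptive_field())  # 3070
```

With the default WaveNet arguments listed above, each stack contributes 1023 samples, giving 3 * 1023 + 1 = 3070.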