nnmnkwii.preprocessing.f0.interp1d

nnmnkwii.preprocessing.f0.interp1d(f0, kind='slinear')

Continuous F0 interpolation from a discontinuous F0 trajectory.

This function generates a continuous f0 trajectory from a discontinuous one using scipy.interpolate.interp1d(). It is meant for continuous f0 modeling in statistical speech synthesis (see, e.g., [1], [2]).

If kind = 'slinear', this does the same thing as Merlin.

Parameters
  • f0 (ndarray) – F0 or log-f0 trajectory

  • kind (str) – Kind of interpolation, as supported by scipy.interpolate.interp1d(). Default is 'slinear', a first-order spline, i.e. linear interpolation.

Returns

Interpolated continuous f0 trajectory.

Return type

1d array of shape (T,) or 2d array of shape (T x 1)

Examples

>>> from nnmnkwii.preprocessing import interp1d
>>> import numpy as np
>>> from nnmnkwii.util import example_audio_file
>>> from scipy.io import wavfile
>>> import pyworld
>>> fs, x = wavfile.read(example_audio_file())
>>> f0, timeaxis = pyworld.dio(x.astype(np.float64), fs, frame_period=5)
>>> continuous_f0 = interp1d(f0, kind="slinear")
>>> assert f0.shape == continuous_f0.shape
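
To make the idea of filling unvoiced frames more concrete, the following is a minimal sketch built directly on scipy.interpolate.interp1d(). It is not nnmnkwii's implementation: the helper name fill_unvoiced is hypothetical, and the sketch assumes that unvoiced frames are marked with 0, that the input is 1d with at least two frames, and that leading and trailing unvoiced frames should take the nearest voiced value.

```python
import numpy as np
from scipy.interpolate import interp1d as scipy_interp1d


def fill_unvoiced(f0, kind="slinear"):
    """Hypothetical helper: interpolate over unvoiced (zero) frames.

    Assumptions (not taken from the nnmnkwii source): unvoiced frames
    are marked with 0, and the input is 1d with at least two frames.
    """
    out = np.asarray(f0, dtype=np.float64).copy()
    voiced = np.flatnonzero(out > 0)
    if voiced.size == 0:
        # Entirely unvoiced: nothing to interpolate from.
        return out

    # Pin both ends to the nearest voiced value so that every unvoiced
    # frame lies inside the interpolation range.
    out[0] = out[voiced[0]]
    out[-1] = out[voiced[-1]]

    # Fit the interpolant on voiced frames, then evaluate it at the
    # unvoiced frame indices only.
    voiced = np.flatnonzero(out > 0)
    interp = scipy_interp1d(voiced, out[voiced], kind=kind)
    unvoiced = np.flatnonzero(out <= 0)
    out[unvoiced] = interp(unvoiced)
    return out


# Toy trajectory: two voiced frames (100 Hz and 130 Hz) among unvoiced ones.
toy_f0 = np.array([0.0, 0.0, 100.0, 0.0, 0.0, 130.0, 0.0])
filled = fill_unvoiced(toy_f0)
```

With the default 'slinear' kind, the two-frame gap between 100 and 130 is filled linearly (110 and 120), while the leading and trailing unvoiced frames are held at 100 and 130 respectively. The output keeps the shape of the input.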

[1] Yu, Kai, and Steve Young. “Continuous F0 modeling for HMM based statistical parametric speech synthesis.” IEEE Transactions on Audio, Speech, and Language Processing 19.5 (2011): 1071-1079.

[2] Takamichi, Shinnosuke, et al. “The NAIST text-to-speech system for the Blizzard Challenge 2015.” Proc. Blizzard Challenge workshop. 2015.