WORLD.jl

WORLD - Module.

A lightweight Julia wrapper for WORLD, a high-quality speech analysis, manipulation and synthesis system. WORLD provides a way to decompose a speech signal into:

  • Fundamental frequency (F0)
  • spectral envelope
  • aperiodicity

and re-synthesize a speech signal from these parameters. Please see the project page for more details on WORLD.

Note

WORLD.jl is based on a fork of WORLD (r9y9/World-cmake).

https://github.com/r9y9/WORLD.jl

Usage

In the following examples, suppose x::Vector{Float64} is an input monaural speech signal sampled at fs Hz, and period is the frame period in ms.
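If no recording is at hand, a synthetic signal is enough to try the calls below. The following is a minimal sketch in plain Julia; the 16 kHz sampling frequency, the 5 ms frame period, and the 220 Hz harmonic tone are illustrative assumptions, not defaults of WORLD.jl:

```julia
# Illustrative setup only: fs, period, and the test tone are assumptions.
fs = 16000                 # sampling frequency in Hz
period = 5.0               # frame period in ms
duration = 1.0             # signal length in seconds

# One second of a 220 Hz tone with a few decaying harmonics.
t = (0:round(Int, duration * fs) - 1) ./ fs
x = zeros(Float64, length(t))
for k in 1:5
    x .+= sin.(2π * 220.0 * k .* t) ./ k
end
x ./= maximum(abs, x)      # normalize to the range [-1, 1]
```

A real speech recording loaded as a Vector{Float64} can be substituted for x without changing the rest of the examples.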

F0 estimation

Harvest

opt = HarvestOption(71.0, 800.0, period)
f0, timeaxis = harvest(x, fs, opt)

Dio

opt = DioOption(f0floor=71.0, f0ceil=800.0, channels_in_octave=2.0,
        period=period, speed=1)
f0, timeaxis = dio(x, fs, opt)

StoneMask

f0 = stonemask(x, fs, timeaxis, f0)

Spectral envelope estimation by CheapTrick

spectrogram = cheaptrick(x, fs, timeaxis, f0)

Aperiodicity ratio estimation by D4C

aperiodicity = d4c(x, fs, timeaxis, f0)

Synthesis

y = synthesis(f0, spectrogram, aperiodicity, period, fs, length(x))

Compact speech parameterization

The raw spectrum envelope and aperiodicity spectrum are relatively high dimensional (often 513 or 1025 dimensions, or more), so one might want a more compact representation. To handle this situation, WORLD provides coding/decoding APIs for the spectrum envelope and aperiodicity. Additionally, WORLD.jl provides conversions from spectrum envelope to mel-cepstrum and vice versa. You can choose any of the coding/decoding APIs depending on your purpose.

Spectrum envelope to mel-cepstrum

mc = sp2mc(spectrogram, order, α) # e.g. order=40, α=0.41

where order is the order of mel-cepstrum (except for 0th) and α is a frequency warping parameter.

Mel-cepstrum to spectrum envelope

approximate_spectrogram = mc2sp(mc, α, get_fftsize_for_cheaptrick(fs))

Code aperiodicity

coded_aperiodicity = code_aperiodicity(aperiodicity, fs)

Decode aperiodicity

decoded_aperiodicity = decode_aperiodicity(coded_aperiodicity, fs)

For the complete code of visualizations shown above, please check the IJulia notebook.

Exports

Reference

WORLD.cheaptrick - Method.
cheaptrick(x, fs, timeaxis, f0; opt)

CheapTrick calculates the spectrogram that consists of spectral envelopes estimated by CheapTrick.

Parameters

  • x : Input signal
  • fs : Sampling frequency
  • timeaxis : Time axis
  • f0 : F0 contour
  • opt : CheapTrick option

Returns

  • spectrogram : Spectrogram estimated by CheapTrick.
code_aperiodicity(aperiodicity, fs)
code_aperiodicity(aperiodicity, fs, fftsize)

CodeAperiodicity codes the aperiodicity. The number of dimensions is determined by fs.

Parameters

  • aperiodicity : Aperiodicity before coding
  • fs : Sampling frequency
  • fftsize : FFT size (default : get_fftsize_for_cheaptrick(fs))

Returns

  • coded_aperiodicity : Coded aperiodicity
code_spectral_envelope(spectrogram, fs, fftsize, number_of_dimentions)

CodeSpectralEnvelope codes the spectral envelope.

Parameters

  • spectrogram : spectrogram (time sequence of spectral envelope)
  • fs : Sampling frequency
  • fftsize : FFT size
  • number_of_dimentions : Number of dimensions for the coded spectral envelope

Returns

  • coded_spectral_envelope : Coded spectral envelope
WORLD.d4c - Method.
d4c(x, fs, timeaxis, f0; opt)

D4C calculates the aperiodicity estimated by D4C.

Parameters

  • x : Input signal
  • fs : Sampling frequency
  • timeaxis : Time axis
  • f0 : F0 contour
  • opt : D4C option

Returns

  • aperiodicity : Aperiodicity estimated by D4C.
decode_aperiodicity(coded_aperiodicity, fs)
decode_aperiodicity(coded_aperiodicity, fs, fftsize)

DecodeAperiodicity decodes the coded aperiodicity.

Parameters

  • coded_aperiodicity : Coded aperiodicity
  • fs : Sampling frequency
  • fftsize : FFT size (default : get_fftsize_for_cheaptrick(fs))

Returns

  • aperiodicity : Decoded aperiodicity
decode_spectral_envelope(coded_spectral_envelope, fs, fftsize)

DecodeSpectralEnvelope decodes the spectral envelope.

Parameters

  • coded_spectral_envelope : Coded spectral envelope
  • fs : Sampling frequency
  • fftsize : FFT size

Returns

  • spectrogram : decoded spectral envelope
WORLD.dio - Function.
dio(x, fs)
dio(x, fs, opt)

Dio estimates the F0 trajectory of a monaural input signal.

Parameters

  • x : Input signal
  • fs : Sampling frequency
  • opt : DioOption

Returns

  • f0 : F0 contour.
  • timeaxis : Temporal positions.
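The returned time axis is spaced by the frame period. The sketch below reproduces that layout in plain Julia. The frame-count formula is an assumption based on the upstream WORLD C++ sources, and expected_timeaxis is a hypothetical helper that is not part of WORLD.jl:

```julia
# Hypothetical helper (not part of WORLD.jl): frame positions in seconds
# for a signal of `len` samples at `fs` Hz, analyzed every `period` ms.
# Assumption: the frame count is floor(1000 * len / fs / period) + 1.
function expected_timeaxis(len::Integer, fs::Real, period::Real)
    nframes = floor(Int, 1000.0 * len / fs / period) + 1
    return [(i - 1) * period / 1000.0 for i in 1:nframes]
end

timeaxis = expected_timeaxis(16000, 16000, 5.0)
length(timeaxis)   # 201 frames for 1 s of audio at a 5 ms period
timeaxis[2]        # 0.005 (seconds)
```

Comparing length(f0) against such an estimate can be a quick sanity check that the period passed to the option struct is in milliseconds rather than seconds.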
get_fftsize_for_cheaptrick(fs)
get_fftsize_for_cheaptrick(fs, opt)

GetFFTSizeForCheapTrick calculates the FFT size based on the sampling frequency and the lower limit of F0 (the lower limit is defined in world.h).

Parameters

  • fs : Sampling frequency
  • opt : CheapTrickOption

Returns

  • fftsize : FFT size
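To give a feel for the resulting sizes, here is a plain-Julia sketch of the calculation. The formula and the default F0 floor of 71.0 Hz are assumptions taken from the upstream WORLD C++ sources, and fftsize_sketch is a hypothetical name; in practice, call get_fftsize_for_cheaptrick(fs):

```julia
# Hypothetical re-implementation for illustration only.
# Assumption: fftsize = 2^(1 + floor(log2(3 * fs / f0floor + 1))).
function fftsize_sketch(fs::Real; f0floor::Real = 71.0)
    return 2^(1 + floor(Int, log2(3.0 * fs / f0floor + 1)))
end

for fs in (16000, 44100, 48000)
    n = fftsize_sketch(fs)
    # Each frame of the spectral envelope has n ÷ 2 + 1 frequency bins.
    println("fs = $fs: fftsize = $n, bins per frame = $(n ÷ 2 + 1)")
end
```

Under these assumptions, 16 kHz input gives 513 bins per frame, while 44.1 kHz and 48 kHz input give 1025.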
get_number_of_aperiodicities(fs)

GetNumberOfAperiodicities provides the number of dimensions for aperiodicity coding. It is determined only by fs.

Parameters

  • fs : Sampling frequency

Returns

  • n : Number of aperiodicities
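The dependence on fs can be sketched in plain Julia as follows. The two constants and the formula are assumptions taken from the upstream WORLD C++ sources, and number_of_aperiodicities_sketch is a hypothetical name; in practice, call get_number_of_aperiodicities(fs):

```julia
# Hypothetical re-implementation for illustration only.
# Assumed constants from upstream WORLD:
const UPPER_LIMIT_HZ = 15000.0        # highest frequency covered by the bands
const FREQUENCY_INTERVAL_HZ = 3000.0  # width of one aperiodicity band

function number_of_aperiodicities_sketch(fs::Real)
    usable = min(UPPER_LIMIT_HZ, fs / 2 - FREQUENCY_INTERVAL_HZ)
    return floor(Int, usable / FREQUENCY_INTERVAL_HZ)
end

number_of_aperiodicities_sketch(16000)  # 1
number_of_aperiodicities_sketch(48000)  # 5
```

This shows why coded aperiodicity is so much more compact than the raw aperiodicity spectrum: a handful of band values replaces hundreds of frequency bins per frame.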
WORLD.harvest - Function.
harvest(x, fs)
harvest(x, fs, opt)

Harvest estimates the F0 trajectory of a monaural input signal.

Parameters

  • x : Input signal
  • fs : Sampling frequency
  • opt : HarvestOption

Returns

  • f0 : F0 contour.
  • timeaxis : Temporal positions.
WORLD.interp1! - Method.
interp1!(x, y, xi, yi)

In-place version of interp1: the interpolated values are written into yi.

Parameters

  • x : Input vector (Time axis)
  • y : Values at x[n]
  • xi : Required vector
  • yi : Interpolated vector
WORLD.interp1 - Method.
interp1(x, y, xi)

interp1 interpolates to find yi, the values of the underlying function y at the points in the vector or array xi. x must be a vector. See http://www.mathworks.co.jp/help/techdoc/ref/interp1.html

Parameters

  • x : Input vector (Time axis)
  • y : Values at x[n]
  • xi : Required vector

Returns

  • yi : Interpolated vector
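For readers unfamiliar with the MATLAB function referenced above, the idea can be sketched in plain Julia. This assumes linear interpolation (the default method of MATLAB's interp1); linear_interp_sketch is a hypothetical helper and is not WORLD.jl's implementation:

```julia
# Hypothetical helper (not WORLD.jl's implementation): piecewise-linear
# interpolation of samples (x, y) at query points xi. Assumes x is sorted
# in increasing order and every xi lies within [x[1], x[end]].
function linear_interp_sketch(x::AbstractVector, y::AbstractVector, xi::AbstractVector)
    yi = similar(xi, Float64)
    for (n, q) in enumerate(xi)
        # Index of the last grid point <= q, clamped so that k + 1 stays valid.
        k = clamp(searchsortedlast(x, q), 1, length(x) - 1)
        w = (q - x[k]) / (x[k + 1] - x[k])   # position inside the segment, 0 to 1
        yi[n] = (1 - w) * y[k] + w * y[k + 1]
    end
    return yi
end

# Upsample an F0-like contour from a 10 ms grid to a 5 ms grid.
coarse_t = [0.0, 0.01, 0.02]
coarse_f0 = [100.0, 120.0, 110.0]
fine_t = [0.0, 0.005, 0.01, 0.015, 0.02]
fine_f0 = linear_interp_sketch(coarse_t, coarse_f0, fine_t)
# fine_f0 ≈ [100.0, 110.0, 120.0, 115.0, 110.0]
```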
WORLD.mc2sp - Method.
mc2sp(mc, α, fftlen)

mc2sp converts mel-cepstrum to power spectrum envelope.

$c_{\alpha}(m) \rightarrow |X(\omega)|^{2}$

Equivalent to exp(2real(MelGeneralizedCepstrums.mgc2sp(mc, α, 0.0, fftlen))). Note that MelGeneralizedCepstrums.mgc2sp returns the log magnitude spectrum.

WORLD.sp2mc - Method.
sp2mc(powerspec, order, α; fftlen)

sp2mc converts power spectrum envelope to mel-cepstrum.

$|X(\omega)|^{2} \rightarrow c_{\alpha}(m)$

WORLD.stonemask - Method.
stonemask(x, fs, timeaxis, f0)

StoneMask refines the F0 estimated by Dio.

Parameters

  • x : Input signal
  • fs : Sampling frequency
  • timeaxis : Temporal information
  • f0 : f0 contour

Returns

  • refined_f0 : Refined F0
WORLD.synthesis - Method.
synthesis(f0, spectrogram, aperiodicity, period, fs, len)

Synthesis synthesizes the voice based on f0, spectrogram and aperiodicity (not excitation signal).

Parameters

  • f0 : f0 contour
  • spectrogram : Spectrogram estimated by CheapTrick
  • aperiodicity : Aperiodicity spectrogram based on D4C
  • period : Temporal period used for the analysis
  • fs : Sampling frequency
  • len : Length of the output signal

Returns

  • y : Calculated speech

CheapTrick options

Fields

  • q1

  • f0floor

  • fftsize

WORLD.D4COption - Type.

D4C options (nothing for now, but for future changes)

Fields

  • threshold
WORLD.DioOption - Type.

DioOption represents the set of options used in DIO, a fundamental frequency (F0) analysis method.

Fields

  • f0floor

  • f0ceil

  • channels_in_octave

  • period

    frame period in ms

  • speed

  • allowed_range

    added in v0.2.1-2 (WORLD 0.2.0_2)


HarvestOption represents the set of options used in Harvest, a fundamental frequency (F0) analysis method.

Fields

  • f0floor

  • f0ceil

  • period

    frame period in ms
