WORLD.jl
WORLD (Module)
A lightweight Julia wrapper for WORLD, a high-quality speech analysis, manipulation and synthesis system. WORLD provides a way to decompose a speech signal into:
- Fundamental frequency (F0)
- spectral envelope
- aperiodicity
and re-synthesize a speech signal from these parameters. Please see the project page for more details on WORLD.
WORLD.jl is based on a fork of WORLD (r9y9/World-cmake).
https://github.com/r9y9/WORLD.jl
Usage
In the following examples, suppose x::Vector{Float64} is an input monaural speech signal sampled at fs Hz, and period is the frame period in ms.
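The snippets in this section use x, fs and period without defining them. A minimal sketch of a stand-in input follows, assuming a synthetic harmonic tone rather than real speech. The values of fs, period and f0_true are illustrative assumptions; a recorded utterance would normally be loaded from a file instead.

```julia
# Illustrative stand-in for a recorded utterance: one second of a harmonic tone.
# All numeric values here are assumptions chosen for the example.
fs = 16000             # sampling frequency in Hz
period = 5.0           # frame period in ms
f0_true = 150.0        # fundamental frequency of the toy signal in Hz
t = (0:fs-1) ./ fs     # sample times in seconds

x = zeros(Float64, length(t))
for k in 1:10
    # add the k-th harmonic with a 1/k amplitude roll-off
    x .+= (0.5 / k) .* sin.(2π * k * f0_true .* t)
end
x ./= maximum(abs, x)  # normalize the peak amplitude to 1
```

A pure tone like this is only useful for checking that the calls run; F0 and envelope estimates are meaningful on real speech.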
F0 estimation
Harvest
opt = HarvestOption(71.0, 800.0, period)
f0, timeaxis = harvest(x, fs, opt)
Dio
opt = DioOption(f0floor=71.0, f0ceil=800.0, channels_in_octave=2.0,
period=period, speed=1)
f0, timeaxis = dio(x, fs, opt)
StoneMask
f0 = stonemask(x, fs, timeaxis, f0)
Spectral envelope estimation by CheapTrick
spectrogram = cheaptrick(x, fs, timeaxis, f0)
Aperiodicity ratio estimation by D4C
aperiodicity = d4c(x, fs, timeaxis, f0)
Synthesis
y = synthesis(f0, spectrogram, aperiodicity, period, fs, length(x))
Compact speech parameterization
The raw spectrum envelope and aperiodicity spectrum are relatively high dimensional (often 513 or 1025 dimensions per frame, or more), so one might want a more compact representation. For this purpose, WORLD provides coding/decoding APIs for the spectrum envelope and aperiodicity. Additionally, WORLD.jl provides conversions from spectrum envelope to mel-cepstrum and vice versa. You can choose any of the coding/decoding APIs depending on your purpose.
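As a rough illustration of the size reduction, the arithmetic below compares the per-frame dimension of a raw envelope with that of a mel-cepstrum. It assumes an FFT size of 1024 and a mel-cepstrum order of 40; both values are examples, not defaults taken from the package.

```julia
# Assumed example values, not package defaults.
fftsize = 1024
order = 40

raw_dim = fftsize ÷ 2 + 1   # one-sided spectrum bins per frame
mc_dim = order + 1          # mel-cepstrum coefficients, including the 0th
reduction = raw_dim / mc_dim

println("raw: ", raw_dim, ", mel-cepstrum: ", mc_dim)
```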
spectrum envelope to mel-cepstrum
mc = sp2mc(spectrogram, order, α) # e.g. order=40, α=0.41
where order is the order of the mel-cepstrum (excluding the 0th coefficient) and α is a frequency warping parameter.
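To give a feel for what α does, the sketch below evaluates the phase response of a first-order all-pass filter, the frequency warping commonly associated with mel-cepstral analysis. That sp2mc applies exactly this warping is an assumption, and warp_sketch is a hypothetical helper, not part of WORLD.jl.

```julia
# Hypothetical helper: warped frequency (radians) for a normalized angular
# frequency ω in [0, π] and warping parameter α with |α| < 1.
warp_sketch(ω, α) = ω + 2 * atan(α * sin(ω) / (1 - α * cos(ω)))

α = 0.41
ωs = range(0.0, Float64(π); length=5)
warped = warp_sketch.(ωs, α)

# For α > 0 the map stretches low frequencies: interior points move upward,
# while the endpoints 0 and π stay fixed.
for (ω, w) in zip(ωs, warped)
    println(round(ω, digits=3), " -> ", round(w, digits=3))
end
```

With α = 0 the map is the identity, so the mel-cepstrum reduces to an ordinary cepstrum on a linear frequency axis.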
mel-cepstrum to spectrum envelope
approximate_spectrogram = mc2sp(mc, α, get_fftsize_for_cheaptrick(fs))
Code aperiodicity
coded_aperiodicity = code_aperiodicity(aperiodicity, fs)
Decode aperiodicity
decoded_aperiodicity = decode_aperiodicity(coded_aperiodicity, fs)
For the complete code of visualizations shown above, please check the IJulia notebook.
Exports
Index
WORLD.cheaptrick
WORLD.code_aperiodicity
WORLD.code_spectral_envelope
WORLD.d4c
WORLD.decode_aperiodicity
WORLD.decode_spectral_envelope
WORLD.dio
WORLD.get_fftsize_for_cheaptrick
WORLD.get_number_of_aperiodicities
WORLD.harvest
WORLD.interp1
WORLD.interp1!
WORLD.mc2sp
WORLD.sp2mc
WORLD.stonemask
WORLD.synthesis
WORLD.CheapTrickOption
WORLD.D4COption
WORLD.DioOption
WORLD.HarvestOption
Reference
WORLD.cheaptrick (Method)
cheaptrick(x, fs, timeaxis, f0; opt)
CheapTrick calculates the spectrogram that consists of spectral envelopes estimated by CheapTrick.
Parameters
- x: Input signal
- fs: Sampling frequency
- timeaxis: Time axis
- f0: F0 contour
- opt: CheapTrick option
Returns
- spectrogram: Spectrogram estimated by CheapTrick.
WORLD.code_aperiodicity (Function)
code_aperiodicity(aperiodicity, fs)
code_aperiodicity(aperiodicity, fs, fftsize)
CodeAperiodicity codes the aperiodicity. The number of dimensions is determined by fs.
Parameters
- aperiodicity: Aperiodicity before coding
- fs: Sampling frequency
- fftsize: FFT size (default: get_fftsize_for_cheaptrick(fs))
Returns
- coded_aperiodicity: Coded aperiodicity
WORLD.code_spectral_envelope (Method)
code_spectral_envelope(spectrogram, fs, fftsize, number_of_dimentions)
CodeSpectralEnvelope codes the spectral envelope.
Parameters
- spectrogram: Spectrogram (time sequence of spectral envelopes)
- fs: Sampling frequency
- fftsize: FFT size
- number_of_dimentions: Number of dimensions for the coded spectral envelope
Returns
- coded_spectral_envelope: Coded spectral envelope
WORLD.d4c (Method)
d4c(x, fs, timeaxis, f0; opt)
D4C calculates the aperiodicity estimated by D4C.
Parameters
- x: Input signal
- fs: Sampling frequency
- timeaxis: Time axis
- f0: F0 contour
- opt: D4C option
Returns
- aperiodicity: Aperiodicity estimated by D4C.
WORLD.decode_aperiodicity (Function)
decode_aperiodicity(coded_aperiodicity, fs)
decode_aperiodicity(coded_aperiodicity, fs, fftsize)
DecodeAperiodicity decodes the coded aperiodicity.
Parameters
- coded_aperiodicity: Coded aperiodicity
- fs: Sampling frequency
- fftsize: FFT size (default: get_fftsize_for_cheaptrick(fs))
Returns
- aperiodicity: Decoded aperiodicity
WORLD.decode_spectral_envelope (Method)
decode_spectral_envelope(coded_spectral_envelope, fs, fftsize)
DecodeSpectralEnvelope decodes the spectral envelope.
Parameters
- coded_spectral_envelope: Coded spectral envelope
- fs: Sampling frequency
- fftsize: FFT size
Returns
- spectrogram: Decoded spectral envelope
WORLD.dio (Function)
dio(x, fs)
dio(x, fs, opt)
Dio estimates the F0 trajectory of a monaural input signal.
Parameters
- x: Input signal
- fs: Sampling frequency
- opt: DioOption
Returns
- f0: F0 contour.
- timeaxis: Temporal positions.
WORLD.get_fftsize_for_cheaptrick (Function)
get_fftsize_for_cheaptrick(fs)
get_fftsize_for_cheaptrick(fs, opt)
GetFFTSizeForCheapTrick calculates the FFT size based on the sampling frequency and the lower limit of F0 (defined in world.h).
Parameters
- fs: Sampling frequency
- opt: CheapTrickOption
Returns
- fftsize: FFT size
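The dependence on the sampling frequency and the F0 floor can be sketched in plain Julia. The formula and the 71 Hz default floor below are assumptions recalled from the upstream WORLD C++ source, not taken from this page; fftsize_sketch is a hypothetical name, and the wrapped function remains the authority.

```julia
# Hypothetical re-implementation for illustration only.
# Assumed rule: the smallest power of two large enough to cover three
# periods of the lowest allowed F0.
function fftsize_sketch(fs; f0floor=71.0)
    return 2^(1 + floor(Int, log2(3.0 * fs / f0floor + 1)))
end

for fs in (16000, 22050, 44100, 48000)
    n = fftsize_sketch(fs)
    println(fs, " Hz -> fftsize ", n, " (", n ÷ 2 + 1, " bins per frame)")
end
```

Under these assumptions a lower F0 floor increases the FFT size, which is why the floor appears in CheapTrickOption.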
WORLD.get_number_of_aperiodicities (Method)
get_number_of_aperiodicities(fs)
GetNumberOfAperiodicities provides the number of dimensions for aperiodicity coding. It is determined only by fs.
Parameters
- fs: Sampling frequency
Returns
- n: Number of aperiodicities
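How fs alone can fix the number of coded bands is sketched below. The 3000 Hz band interval and the 15000 Hz upper limit are assumptions recalled from the upstream WORLD C++ source; n_aperiodicities_sketch is a hypothetical helper, so treat its output as indicative rather than authoritative.

```julia
# Hypothetical re-implementation for illustration only.
# Assumed constants from upstream WORLD: 3000 Hz bands, capped at 15000 Hz.
function n_aperiodicities_sketch(fs; interval=3000.0, upper=15000.0)
    return floor(Int, min(upper, fs / 2 - interval) / interval)
end

for fs in (16000, 22050, 44100, 48000)
    println(fs, " Hz -> ", n_aperiodicities_sketch(fs), " band(s)")
end
```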
WORLD.harvest (Function)
harvest(x, fs)
harvest(x, fs, opt)
Harvest estimates the F0 trajectory of a monaural input signal.
Parameters
- x: Input signal
- fs: Sampling frequency
- opt: HarvestOption
Returns
- f0: F0 contour.
- timeaxis: Temporal positions.
WORLD.interp1! (Method)
interp1!(x, y, xi, yi)
In-place version of interp1.
Parameters
- x: Input vector (time axis)
- y: Values at x[n]
- xi: Required vector
- yi: Interpolated vector
WORLD.interp1 (Method)
interp1(x, y, xi)
interp1 interpolates to find yi, the values of the underlying function Y at the points in the vector or array xi. x must be a vector.
Reference: http://www.mathworks.co.jp/help/techdoc/ref/interp1.html
Parameters
- x: Input vector (time axis)
- y: Values at x[n]
- xi: Required vector
Returns
- yi: Interpolated vector
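The behaviour described above can be sketched in plain Julia, assuming linear interpolation (the default method of the MATLAB function referenced) and a sorted x. linear_interp_sketch is a hypothetical stand-in and does not call WORLD; in particular, its linear extrapolation outside the range of x is a choice made for this sketch only.

```julia
# Hypothetical linear interpolation, for illustration only.
# Assumes x is sorted in increasing order and has at least two elements.
function linear_interp_sketch(x::AbstractVector, y::AbstractVector, xi::AbstractVector)
    yi = Vector{Float64}(undef, length(xi))
    for (n, q) in enumerate(xi)
        # last knot not greater than q, clamped so that k+1 is a valid index
        k = clamp(searchsortedlast(x, q), 1, length(x) - 1)
        w = (q - x[k]) / (x[k+1] - x[k])   # position inside the segment
        yi[n] = (1 - w) * y[k] + w * y[k+1]
    end
    return yi
end

x_knots = [0.0, 1.0, 2.0]
y_knots = [0.0, 10.0, 20.0]
yi = linear_interp_sketch(x_knots, y_knots, [0.5, 1.5, 2.0])
```

A typical use in this setting is resampling an F0 contour from one time axis onto another.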
WORLD.mc2sp (Method)
mc2sp(mc, α, fftlen)
mc2sp converts mel-cepstrum to power spectrum envelope.
$c_{\alpha}(m) \rightarrow |X(\omega)|^{2}$
equivalent: exp(2real(MelGeneralizedCepstrums.mgc2sp(mc, α, 0.0, fftlen)))
Note that MelGeneralizedCepstrums.mgc2sp returns log magnitude spectrum.
WORLD.sp2mc (Method)
sp2mc(powerspec, order, α; fftlen)
sp2mc converts power spectrum envelope to mel-cepstrum.
$|X(\omega)|^{2} \rightarrow c_{\alpha}(m)$
WORLD.stonemask (Method)
stonemask(x, fs, timeaxis, f0)
StoneMask refines the F0 estimated by Dio.
Parameters
- x: Input signal
- fs: Sampling frequency
- timeaxis: Temporal information
- f0: F0 contour
Returns
- refined_f0: Refined F0
WORLD.synthesis (Method)
synthesis(f0, spectrogram, aperiodicity, period, fs, len)
Synthesis synthesizes the voice based on f0, spectrogram and aperiodicity (not an excitation signal).
Parameters
- f0: F0 contour
- spectrogram: Spectrogram estimated by CheapTrick
- aperiodicity: Aperiodicity spectrogram based on D4C
- period: Temporal period used for the analysis
- fs: Sampling frequency
- len: Length of the output signal
Returns
- y: Calculated speech
WORLD.CheapTrickOption (Type)
CheapTrick options
Fields
- q1
- f0floor
- fftsize
WORLD.D4COption (Type)
D4C options (nothing for now, but for future changes)
Fields
- threshold
WORLD.DioOption (Type)
DioOption represents the set of options used by Dio, a fundamental frequency analysis method.
Fields
- f0floor
- f0ceil
- channels_in_octave
- period: frame period in ms
- speed
- allowed_range: added in v0.2.1-2 (WORLD 0.2.0_2)
WORLD.HarvestOption (Type)
HarvestOption represents the set of options used by Harvest, a fundamental frequency analysis method.
Fields
- f0floor
- f0ceil
- period: frame period in ms
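To make the frame period concrete, the sketch below relates the signal length, sampling frequency and period to the number of analysis frames and their time positions. The frame count rule and the use of seconds for the time axis are assumptions recalled from upstream WORLD; nframes_sketch and timeaxis_sketch are hypothetical helpers, not WORLD.jl functions.

```julia
# Hypothetical helpers for illustration only.
# Assumed rule: one frame every `period` ms, plus the frame at time zero.
nframes_sketch(len, fs, period) = floor(Int, 1000.0 * len / fs / period) + 1

# Assumed time axis: frame positions in seconds.
timeaxis_sketch(len, fs, period) =
    (0:nframes_sketch(len, fs, period)-1) .* (period / 1000.0)

# One second of audio at 16 kHz with a 5 ms frame period.
n = nframes_sketch(16000, 16000, 5.0)
ta = timeaxis_sketch(16000, 16000, 5.0)
println(n, " frames, last frame at ", ta[end], " s")
```

Halving the period roughly doubles the number of frames, and with it the size of the spectrogram and aperiodicity arrays.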