nnmnkwii.frontend.merlin.duration_features

nnmnkwii.frontend.merlin.duration_features(hts_labels, *args, **kwargs)

Duration features from HTS-style full-context labels.

The input full-context labels must carry phone-level or state-level alignments.

Note

The implementation is adapted from Merlin, with no changes to the internal algorithms. Unit tests check that it produces the same results as Merlin for several typical settings.

Parameters
  • hts_labels (hts.HTSLabelFile) – HTS label file.

  • feature_type (str) – numerical or binary. Default is numerical.

  • unit_size (str) – phoneme or state. Defaults to state for state-level alignments and to phoneme for phone-level alignments.

  • feature_size (str) – frame or phoneme. Default is phoneme. frame is only supported for state-level alignments.

  • frame_shift (int) – Frame shift of alignment in 100ns units.
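
Because frame_shift is expressed in 100ns units, a frame shift given in milliseconds has to be converted first. The following is a minimal sketch of that arithmetic only; the helper names ms_to_hts_units and hts_units_to_ms are hypothetical and not part of nnmnkwii, and the 5 ms value is just an illustrative choice.

```python
# 1 ms = 1,000,000 ns = 10,000 units of 100 ns
UNITS_PER_MS = 10_000


def ms_to_hts_units(ms):
    """Convert milliseconds to 100ns units (hypothetical helper)."""
    return int(round(ms * UNITS_PER_MS))


def hts_units_to_ms(units):
    """Convert 100ns units back to milliseconds (hypothetical helper)."""
    return units / UNITS_PER_MS


# Illustrative value: a 5 ms frame shift expressed in 100ns units
frame_shift = ms_to_hts_units(5)
print(frame_shift)                   # 50000
print(hts_units_to_ms(frame_shift))  # 5.0
```

The resulting integer is the kind of value the frame_shift parameter expects.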

Returns

numpy array representation of duration features.

Return type

numpy.ndarray

Examples

For state-level alignments

>>> from nnmnkwii.frontend import merlin as fe
>>> from nnmnkwii.io import hts
>>> from nnmnkwii.util import example_label_file
>>> labels = hts.load(example_label_file(phone_level=False))
>>> features = fe.duration_features(labels)
>>> features.shape
(40, 5)

For phone-level alignments

>>> from nnmnkwii.frontend import merlin as fe
>>> from nnmnkwii.io import hts
>>> from nnmnkwii.util import example_label_file
>>> labels = hts.load(example_label_file(phone_level=True))
>>> features = fe.duration_features(labels)
>>> features.shape
(40, 1)
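
The two shapes above can be read as one row per phoneme, with one column per state for state-level alignments and a single column for phone-level alignments. The sketch below illustrates that reading in plain Python. It is not the library's implementation: the segment times are made up, the 5 ms frame shift and the five states per phoneme are assumptions (the latter inferred from the five columns in the state-level example), and all function names are hypothetical.

```python
FRAME_SHIFT = 50000   # assumed: 5 ms in 100ns units
STATES_PER_PHONE = 5  # assumed: inferred from the (40, 5) shape above


def frames(start, end, frame_shift=FRAME_SHIFT):
    """Number of whole frames in a segment; times are in 100ns units."""
    return (end - start) // frame_shift


def state_durations_per_phone(segments, n_states=STATES_PER_PHONE):
    """One row per phoneme, one frame count per state."""
    counts = [frames(s, e) for s, e in segments]
    return [counts[i:i + n_states] for i in range(0, len(counts), n_states)]


def phone_durations(segments):
    """One row per phoneme holding a single frame count."""
    return [[frames(s, e)] for s, e in segments]


# Hypothetical state-level alignment: two phonemes, five states each
state_segments = [
    (0, 50000), (50000, 150000), (150000, 300000),
    (300000, 350000), (350000, 400000),
    (400000, 500000), (500000, 550000), (550000, 700000),
    (700000, 750000), (750000, 800000),
]
state_rows = state_durations_per_phone(state_segments)

# The same two phonemes with phone-level alignment only
phone_rows = phone_durations([(0, 400000), (400000, 800000)])

print(state_rows)  # [[1, 2, 3, 1, 1], [2, 1, 3, 1, 1]]
print(phone_rows)  # [[8], [8]]
```

In this toy data the state durations of each phoneme sum to its phone-level duration, which is why both views have the same number of rows.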