nnmnkwii.frontend.merlin.linguistic_features

nnmnkwii.frontend.merlin.linguistic_features(hts_labels, *args, **kwargs)

Linguistic features from HTS-style full-context labels.

This converts HTS-style full-context labels to their numeric representation, using feature extraction regexes that should be constructed from an HTS-style question set. The input full-context labels must carry phone-level or state-level alignment.
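To illustrate the idea of regex-driven feature extraction, here is a minimal, self-contained sketch. It does not use nnmnkwii's internal data structures: the question names, the regexes, the sample label string, and the use of -1.0 for a missing continuous value are all assumptions made for illustration.

```python
import re

# Hypothetical binary questions: the feature is 1.0 if the regex matches.
binary_questions = {
    "C-Vowel": re.compile(r"-(a|i|u|e|o)\+"),
    "C-Silence": re.compile(r"-(sil|pau)\+"),
}

# Hypothetical continuous question: the feature is the captured number.
continuous_questions = {
    "Pos_in_syllable": re.compile(r"@(\d+)_"),
}


def label_to_vector(label):
    """Convert one full-context label string to a list of floats."""
    binary = [
        1.0 if pattern.search(label) else 0.0
        for pattern in binary_questions.values()
    ]
    continuous = []
    for pattern in continuous_questions.values():
        match = pattern.search(label)
        # Assumption for this sketch: -1.0 marks a value that is not present.
        continuous.append(float(match.group(1)) if match else -1.0)
    return binary + continuous


# Hypothetical, heavily shortened full-context label.
vector = label_to_vector("sil^k-a+t=o@2_1")
```

Each question contributes one column, so the width of the resulting feature matrix equals the number of binary questions plus the number of continuous questions in the question set.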

Parameters:
  • hts_labels (hts.HTSLabelFile) – Input full-context label file.
  • binary_dict (dict) – Dictionary used to extract binary features.
  • continuous_dict (dict) – Dictionary used to extract continuous features.
  • subphone_features (str) – Type of sub-phone features to use.
  • add_frame_features (bool) – Whether to add frame-level features.
Returns:
  NumPy array representation of linguistic features.
Return type:
  ndarray

Examples

>>> from nnmnkwii.frontend import merlin as fe
>>> from nnmnkwii.io import hts
>>> from nnmnkwii.util import example_label_file, example_question_file
>>> labels = hts.load(example_label_file())
>>> binary_dict, continuous_dict = hts.load_question_set(example_question_file())
>>> features = fe.linguistic_features(labels, binary_dict, continuous_dict)
>>> features.shape
(40, 416)
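When frame-level features are requested, each aligned segment's feature row has to be spread over the frames that segment spans. The sketch below shows only that expansion step in plain Python, under stated assumptions: the helper name `expand_to_frames` is hypothetical, times are in HTS label units of 100 ns, the frame shift is taken to be 50000 units (5 ms), and no sub-phone position features are appended. It is not the library's implementation.

```python
def expand_to_frames(segments, frame_shift=50000):
    """Repeat each segment's feature row once per frame it covers.

    segments: list of (start, end, feature_row), with times in 100 ns units.
    frame_shift: assumed frame shift in the same units (50000 = 5 ms).
    """
    frames = []
    for start, end, row in segments:
        num_frames = (end - start) // frame_shift
        # Copy the row so later edits to one frame do not affect the others.
        frames.extend(list(row) for _ in range(num_frames))
    return frames


# Two hypothetical segments: 15 ms and 10 ms long.
segments = [
    (0, 150000, [1.0, 0.0]),
    (150000, 250000, [0.0, 1.0]),
]
frames = expand_to_frames(segments)
```

With these inputs the first segment yields three frames and the second yields two, so the number of rows grows from the number of segments to the number of frames while the number of columns stays the same.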