nnmnkwii.frontend.merlin.linguistic_features
nnmnkwii.frontend.merlin.linguistic_features(hts_labels, *args, **kwargs)
Linguistic features from HTS-style full-context labels.
This converts HTS-style full-context labels to their numeric representation, given feature extraction regexes constructed from an HTS-style question set. The input full-context labels must be aligned at the phone level or the state level.
Note
The implementation is adapted from Merlin, with no changes to the internal algorithms. Unit tests ensure that it produces the same results as Merlin for several typical settings.
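The regex-based conversion described above can be illustrated with a minimal, standard-library-only sketch. It is not the library's implementation: the question patterns, the label strings, the `label_to_vector` helper, and the choice of -1.0 as the placeholder for an unmatched numeric question are all assumptions made for illustration.

```python
import re

# Hypothetical question patterns. A real question set is loaded from an
# HTS-style question file and is far larger than this.
binary_questions = {
    0: [re.compile(r"-a\+")],                           # is the current phone "a"?
    1: [re.compile(r"-sil\+"), re.compile(r"-pau\+")],  # is it a silence or pause?
}
numeric_questions = {
    0: re.compile(r"@(\d+)_"),  # a made-up positional count field
}


def label_to_vector(label):
    """Map one full-context label string to a list of floats.

    Binary questions yield 1.0 if any of their patterns matches, else 0.0.
    Numeric questions yield the captured number, or -1.0 when nothing
    matches (an assumption of this sketch).
    """
    binary = [0.0] * len(binary_questions)
    for idx, patterns in binary_questions.items():
        if any(p.search(label) for p in patterns):
            binary[idx] = 1.0

    numeric = [-1.0] * len(numeric_questions)
    for idx, pattern in numeric_questions.items():
        match = pattern.search(label)
        if match:
            numeric[idx] = float(match.group(1))

    return binary + numeric


# Made-up labels in a simplified "ll^l-c+r=rr@pos_len" layout.
labels = [
    "x^sil-a+k=i@2_5",
    "x^x-sil+a=k@0_7",
    "a^k-i+x=x",
]
features = [label_to_vector(label) for label in labels]
```

Each label becomes one row, so the result has one row per label and one column per question, which mirrors the row-per-segment layout of the array returned by this function.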
- Parameters
hts_labels (hts.HTSLabelFile) – Input full-context label file
binary_dict (dict) – Dictionary used to extract binary features
numeric_dict (dict) – Dictionary used to extract continuous features
subphone_features (str) – Type of sub-phone features. According to Merlin's source code, None, full, state_only, frame_only, uniform_state, minimal_phoneme, and coarse_coding are supported. However, only None, full (for state alignment), and coarse_coding (for phone alignment) are tested in this library. Default is None.
add_frame_features (bool) – Whether to add frame-level features. Default is False.
frame_shift (int) – Frame shift of alignment in 100ns units.
- Returns
Numpy array representation of linguistic features.
- Return type
numpy.ndarray
Examples
For state-level labels
>>> from nnmnkwii.frontend import merlin as fe
>>> from nnmnkwii.io import hts
>>> from nnmnkwii.util import example_label_file, example_question_file
>>> labels = hts.load(example_label_file(phone_level=False))
>>> binary_dict, numeric_dict = hts.load_question_set(example_question_file())
>>> features = fe.linguistic_features(labels, binary_dict, numeric_dict,
...     subphone_features="full", add_frame_features=True)
>>> features.shape
(615, 425)
>>> features = fe.linguistic_features(labels, binary_dict, numeric_dict,
...     subphone_features=None, add_frame_features=False)
>>> features.shape
(40, 416)
For phone-level labels
>>> from nnmnkwii.frontend import merlin as fe
>>> from nnmnkwii.io import hts
>>> from nnmnkwii.util import example_label_file, example_question_file
>>> labels = hts.load(example_label_file(phone_level=True))
>>> binary_dict, numeric_dict = hts.load_question_set(example_question_file())
>>> features = fe.linguistic_features(labels, binary_dict, numeric_dict,
...     subphone_features="coarse_coding", add_frame_features=True)
>>> features.shape
(615, 420)
>>> features = fe.linguistic_features(labels, binary_dict, numeric_dict,
...     subphone_features=None, add_frame_features=False)
>>> features.shape
(40, 416)
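The shapes in the examples can be read with some simple arithmetic, sketched below without calling the library. The segment times, the 5 ms frame shift, and the `num_frames` helper are hypothetical and are not taken from the example label file. Only the shape tuples come from the examples above.

```python
def num_frames(start, end, frame_shift):
    """Number of whole frames in a segment; all values are in 100 ns units."""
    return (end - start) // frame_shift


# 5 ms expressed in 100 ns units: 5e-3 s / 1e-7 s = 50000 (assumed value).
frame_shift = 50000

# Hypothetical aligned segments as (start, end) pairs in 100 ns units.
segments = [(0, 500000), (500000, 1250000), (1250000, 1500000)]
frames_per_segment = [num_frames(s, e, frame_shift) for s, e in segments]
total_frames = sum(frames_per_segment)

# Column counts reported in the examples above.
base_dim = 416            # subphone_features=None, add_frame_features=False
full_dim = 425            # subphone_features="full", add_frame_features=True
coarse_coding_dim = 420   # subphone_features="coarse_coding", add_frame_features=True

extra_full = full_dim - base_dim
extra_coarse_coding = coarse_coding_dim - base_dim
```

With frame-level features enabled, the output has one row per frame (615 in the examples). With them disabled, it has 40 rows. The column differences show that the `full` and `coarse_coding` settings add 9 and 4 columns respectively in these examples.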