IO
IO operations for some speech-specific file formats. As of now, it only supports read operations for:
- HTS-style question file
- HTS-style full-context label file
HTS IO
load(path[, frame_shift_in_micro_sec]) | Load an HTS-style label file.
load_question_set(qs_file_name) | Load an HTS-style question file and convert it to binary/continuous feature extraction regexes.
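To illustrate what converting a question to a feature extraction regex can look like, here is a minimal, self-contained sketch. It does not use nnmnkwii and is not the implementation of load_question_set. The helper names (wildcard_to_regex, parse_question), the sample question, and the shortened label are all hypothetical. It assumes the common question syntax QS "name" {pattern,...}, where * matches any run of characters and ? matches a single character.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a wildcard pattern ('*' = any run, '?' = one char) into a regex."""
    # Escape everything first, then re-enable the two wildcard characters.
    escaped = re.escape(pattern)
    return re.compile(escaped.replace(r"\*", ".*").replace(r"\?", "."))


def parse_question(line):
    """Parse one 'QS "name" {pat1,pat2}' line into (name, [compiled regexes])."""
    match = re.match(r'\s*QS\s+"([^"]+)"\s*\{(.*)\}\s*$', line)
    if match is None:
        raise ValueError("not a QS line: %r" % line)
    name, body = match.groups()
    patterns = [p.strip() for p in body.split(",") if p.strip()]
    return name, [wildcard_to_regex(p) for p in patterns]


# Hypothetical question and shortened context label, for illustration only.
name, regexes = parse_question('QS "C-Vowel_a_or_i" {*-a+*,*-i+*}')
label = "x^x-a+b=c@1_2"

# A binary feature is 1 if any pattern of the question matches the whole label.
is_vowel = int(any(r.fullmatch(label) for r in regexes))
print(name, is_vowel)
```

Each question yields one binary feature per label, so a full question set maps a context label to a fixed-length feature vector.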
class nnmnkwii.io.hts.HTSLabelFile(frame_shift_in_micro_sec=50000)
In-memory representation of an HTS-style context label file.
Indexing is supported. It returns a tuple of (start_time, end_time, label).
frame_shift_in_ms (int) – Frame shift in microseconds.
start_times (ndarray) – Start times.
end_times (ndarray) – End times.
contexts (ndarray) – Contexts.
Examples
>>> from nnmnkwii.io import hts
>>> from nnmnkwii.util import example_label_file
>>> labels = hts.load(example_label_file())
>>> print(labels[0])
(0, 50000, 'x^x-sil+hh=iy@x_x/A:0_0_0/B:x-x-x@x-x&x-x#x-x$x-x!x-x;x-x|x/C:1+1+2/D:0_0/E:x+x@x+x&x+x#x+x/F:content_1/G:0_0/H:x=x@1=2|0/I:4=3/J:13+9-2[2]')
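The (start_time, end_time, label) tuples returned by indexing mirror the layout of a label file line. As a rough illustration of that layout, the following self-contained sketch parses such lines without nnmnkwii. The function name parse_label_lines and the shortened context strings are hypothetical. It assumes each line has the form "start end context", separated by whitespace.

```python
def parse_label_lines(lines):
    """Parse 'start end context' lines into (start_time, end_time, label) tuples."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first two whitespace runs only; the context is kept whole.
        start, end, context = line.split(None, 2)
        entries.append((int(start), int(end), context))
    return entries


# Shortened, hypothetical context strings for illustration.
example_lines = [
    "0 50000 x^x-sil+hh=iy@x_x",
    "50000 100000 x^sil-hh+iy=r@1_2",
]
parsed = parse_label_lines(example_lines)
print(parsed[0])
```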
set_durations(durations)
Set start/end times from duration features.
Todo: this should be refactored.
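One plausible way to derive start/end times from durations is a cumulative sum scaled by the frame shift. The sketch below is not the library's implementation. The function name times_from_durations and the offset parameter are hypothetical, and it assumes durations are given as a number of frames per label.

```python
from itertools import accumulate


def times_from_durations(durations, frame_shift=50000, offset=0):
    """Derive (start_times, end_times) from per-label durations in frames.

    Assumption: each duration is a frame count, and frame_shift is the
    length of one frame in the same time unit as the label file.
    """
    if not durations:
        return [], []
    # End time of label i is the running total of durations up to i.
    end_times = [offset + total * frame_shift for total in accumulate(durations)]
    # Each label starts where the previous one ended.
    start_times = [offset] + end_times[:-1]
    return start_times, end_times


starts, ends = times_from_durations([1, 3, 2])
print(starts)
print(ends)
```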
silence_frame_indices(regex=None)
Returns silence frame indices.
Similar to silence_label_indices(), but returns indices at the frame level.
Parameters: regex (re, optional) – Compiled regex to find silence labels.
Returns: Silence frame indices.
Return type: 1darray
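The difference between label-level and frame-level indices can be sketched as follows: every label whose context matches the silence regex is expanded into the range of frames it spans. This is an illustration under stated assumptions, not the library's code. The function name silence_frames_sketch, the default pattern, and the shortened labels are hypothetical.

```python
import re


def silence_frames_sketch(labels, regex=None, frame_shift=50000):
    """Return sorted frame indices covered by labels matching the silence regex.

    labels is a sequence of (start_time, end_time, context) tuples.
    """
    if regex is None:
        # Assumed pattern: the current phone is 'sil' (between '-' and '+').
        regex = re.compile(r"-sil\+")
    frames = set()
    for start, end, context in labels:
        if regex.search(context):
            # Expand the label's time span into the frames it covers.
            frames.update(range(start // frame_shift, end // frame_shift))
    return sorted(frames)


# Shortened, hypothetical labels: silence, one phone, silence.
sketch_labels = [
    (0, 100000, "x^x-sil+hh=iy"),
    (100000, 250000, "x^sil-hh+iy=r"),
    (250000, 300000, "hh^iy-sil+x=x"),
]
silence_frames = silence_frames_sketch(sketch_labels)
print(silence_frames)
```

The first label covers frames 0 and 1, the last covers frame 5, and the middle label is skipped because its current phone is not silence.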