IO
IO operations for speech-specific file formats:
- HTS-style full-context label file (a.k.a. HTK alignment)
- HTS-style question file
 
HTS IO
load([path, lines]) | Load an HTS-style label file.
load_question_set(qs_file_name) | Load an HTS-style question file and convert its questions to regexes for binary/continuous feature extraction.
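
To make the idea of "feature extraction regexes" more concrete, here is a minimal sketch of how one HTS-style wildcard pattern could be turned into a compiled regular expression and used as a binary feature. It is not the library's implementation. `wildcard_to_regex` is a hypothetical helper, and the sketch assumes the common question-file convention that `*` matches any run of characters and `?` matches exactly one character.

```python
import re


def wildcard_to_regex(pattern):
    """Hypothetical helper: convert an HTS-style wildcard pattern to a regex.

    Assumes `*` means "any run of characters" and `?` means "one character";
    every other character is matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$")


# A binary feature is 1 when the pattern matches the context label, else 0.
context = "x^x-sil+hh=iy@x_x"
is_center_silence = wildcard_to_regex("*-sil+*")
feature = int(is_center_silence.match(context) is not None)
print(feature)
```

A continuous feature would instead use a regex with a capture group and read a number out of the label; that case is left out of this sketch.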
class nnmnkwii.io.hts.HTSLabelFile(frame_shift_in_micro_sec=50000)
Memory representation for HTS-style context labels (a.k.a. HTK alignment).
Indexing is supported. It returns a tuple of (start_time, end_time, label).
start_times (list) – Start times in microseconds.
end_times (list) – End times in microseconds.
contexts (list) – Contexts. Each value should be either a phone or a full-context annotation.
Examples
Load from file
>>> from nnmnkwii.io import hts
>>> from nnmnkwii.util import example_label_file
>>> labels = hts.load(example_label_file())
>>> print(labels[0])
(0, 50000, 'x^x-sil+hh=iy@x_x/A:0_0_0/B:x-x-x@x-x&x-x#x-x$x-x!x-x;x-x|x/C:1+1+2/D:0_0/E:x+x@x+x&x+x#x+x/F:content_1/G:0_0/H:x=x@1=2|0/I:4=3/J:13+9-2[2]')
Create memory representation of label
>>> labels = hts.HTSLabelFile()
>>> labels.append((0, 3125000, "silB"))
0 3125000 silB
>>> labels.append((3125000, 3525000, "m"))
0 3125000 silB
3125000 3525000 m
>>> labels.append((3525000, 4325000, "i"))
0 3125000 silB
3125000 3525000 m
3525000 4325000 i
Save to file
>>> from tempfile import TemporaryFile
>>> with TemporaryFile("w") as f:
...     f.write(str(labels))
50
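
The text form shown in the examples (one `start_time end_time context` triple per line) can be reproduced and read back with plain Python. The following is a minimal sketch of that round trip, not the library's own reader or writer. `format_labels` and `parse_labels` are hypothetical helper names, and the sketch assumes whitespace-separated fields with integer times.

```python
def format_labels(labels):
    """Hypothetical helper: one 'start end context' line per label."""
    return "\n".join(
        "{} {} {}".format(start, end, context) for start, end, context in labels
    )


def parse_labels(text):
    """Hypothetical helper: read 'start end context' lines back into tuples."""
    labels = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        # Split on whitespace at most twice so the context stays in one piece.
        start, end, context = line.split(None, 2)
        labels.append((int(start), int(end), context))
    return labels


labels = [
    (0, 3125000, "silB"),
    (3125000, 3525000, "m"),
    (3525000, 4325000, "i"),
]
text = format_labels(labels)
restored = parse_labels(text)
print(restored == labels)
```

For these three labels the formatted text is 50 characters long, which agrees with the value returned by `f.write` in the "Save to file" example.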
append(label)
Append a single alignment label.
Parameters: label (tuple) – tuple of (start_time, end_time, context).
Returns: self
Raises:
- ValueError – if start_time >= end_time
- ValueError – if the last end time does not match start_time
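
The two error conditions can be illustrated without the library. The sketch below applies the same two checks to a plain list of tuples. `append_label` is a hypothetical stand-in, not the `HTSLabelFile.append` implementation, and the error messages are made up for illustration.

```python
def append_label(labels, label):
    """Hypothetical stand-in that applies the two documented checks."""
    start_time, end_time, context = label
    if start_time >= end_time:
        raise ValueError(
            "start_time ({}) must be smaller than end_time ({})".format(
                start_time, end_time
            )
        )
    # Labels must be contiguous: each one starts where the previous one ended.
    if labels and labels[-1][1] != start_time:
        raise ValueError(
            "start_time ({}) must equal the last end time ({})".format(
                start_time, labels[-1][1]
            )
        )
    labels.append((start_time, end_time, context))
    return labels


labels = []
append_label(labels, (0, 3125000, "silB"))
append_label(labels, (3125000, 3525000, "m"))

try:
    # Leaves a gap after 3525000, so it is rejected.
    append_label(labels, (4000000, 4325000, "i"))
except ValueError as err:
    print("rejected:", err)
```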
set_durations(durations, frame_shift_in_micro_sec=50000)
Set start/end times from duration features.
Todo: this should be refactored.
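
As a rough illustration of what setting times from durations involves, the sketch below turns per-label durations into contiguous start/end times by cumulative summation. It is written under assumptions that the text above does not confirm: durations are taken to be frame counts, and the first label is taken to start at `offset`. `times_from_durations` is a hypothetical helper, not the library method.

```python
from itertools import accumulate


def times_from_durations(durations, frame_shift_in_micro_sec=50000, offset=0):
    """Hypothetical helper: durations (assumed to be in frames) -> times."""
    # Running total of frames, scaled to time units and shifted by the offset.
    end_times = [
        offset + total * frame_shift_in_micro_sec for total in accumulate(durations)
    ]
    # Each label starts where the previous one ended.
    start_times = [offset] + end_times[:-1] if end_times else []
    return start_times, end_times


start_times, end_times = times_from_durations([2, 3, 1])
for start, end in zip(start_times, end_times):
    print(start, end)
```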
silence_frame_indices(regex=None, frame_shift_in_micro_sec=50000)
Returns silence frame indices.
Similar to silence_label_indices(), but returns frame-level indices.
Parameters: regex (re, optional) – Compiled regex to find silence labels.
Returns: Silence frame indices
Return type: 1darray
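
The difference between label-level and frame-level indices can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: `sketch_silence_frames` is a hypothetical helper, the silence regex is an example chosen for these labels, a label is assumed to cover frames from `start_time // frame_shift` up to but excluding `end_time // frame_shift`, and a list is returned instead of a 1darray.

```python
import re


def sketch_silence_frames(labels, regex, frame_shift_in_micro_sec=50000):
    """Hypothetical helper: frame indices covered by labels matching `regex`."""
    indices = []
    for start_time, end_time, context in labels:
        if regex.match(context):
            first = start_time // frame_shift_in_micro_sec
            last = end_time // frame_shift_in_micro_sec
            indices.extend(range(first, last))
    return indices


labels = [
    (0, 100000, "x^x-sil+hh=iy"),       # silence, frames 0-1
    (100000, 250000, "x^sil-hh+iy=l"),  # phone "hh", not silence
    (250000, 350000, "hh^iy-sil+x=x"),  # silence, frames 5-6
]
silence_regex = re.compile(r".*-sil\+.*")  # example pattern, an assumption
frames = sketch_silence_frames(labels, silence_regex)
print(frames)
```

Frame-level indices like these are convenient for masking or dropping silence frames from a frame-aligned feature matrix.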