pyopenjtalk

The functional interface for text processing and waveform synthesis.

Note

For ease of use, all the functional interfaces use global instances of pyopenjtalk.openjtalk.OpenJTalk and pyopenjtalk.htsengine.HTSEngine internally.

If you need full control (e.g., to use an external dictionary or htsvoice), instantiate and use these classes manually.

High-level API

pyopenjtalk.tts(text, speed=1.0, half_tone=0.0)[source]

Text-to-speech

Parameters
  • text (str) – Input text

  • speed (float) – speech speed rate. Default is 1.0.

  • half_tone (float) – additional half-tone. Default is 0.

Returns

Speech waveform (dtype: np.float64) and the sampling frequency as an int (default: 48000).

Return type

tuple of (np.ndarray, int)
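As a minimal sketch of how the result of tts might be consumed, the following writes the synthesized speech to a 16-bit mono WAV file using only the standard library. It assumes that tts returns the waveform together with the sampling frequency, and that the sample values already lie in the 16-bit integer range even though the dtype is np.float64. The helper names to_pcm16 and save_tts are hypothetical and not part of pyopenjtalk.

```python
import struct
import wave


def to_pcm16(samples):
    """Clip samples to the int16 range and pack them as little-endian 16-bit PCM."""
    clipped = [max(-32768, min(32767, int(s))) for s in samples]
    return struct.pack("<%dh" % len(clipped), *clipped)


def save_tts(text, path, speed=1.0, half_tone=0.0):
    """Synthesize text with pyopenjtalk.tts and write it to a mono WAV file."""
    # Imported lazily so that to_pcm16 can be used without pyopenjtalk installed.
    import pyopenjtalk

    # Assumption: tts returns (waveform, sampling_frequency).
    waveform, sampling_rate = pyopenjtalk.tts(text, speed=speed, half_tone=half_tone)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(to_pcm16(waveform))
    return sampling_rate
```

Calling save_tts("こんにちは", "hello.wav") would then produce a file playable by ordinary audio tools. If the waveform turns out to be normalized to [-1, 1] in your version, scale it before packing.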

pyopenjtalk.g2p(*args, **kwargs)[source]

Grapheme-to-phoneme (G2P) conversion

This is a convenience wrapper around run_frontend.

Parameters
  • text (str) – Unicode Japanese text.

  • kana (bool) – If True, returns the pronunciation in katakana; otherwise returns phonemes. Default is False.

  • join (bool) – If True, concatenates the phonemes or katakana into a single string. Default is True.

Returns

G2P result: a str if join is True, a list if join is False.

Return type

str or list
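The sketch below illustrates how the kana and join parameters combine, collecting three views of the same text in one call. The function g2p_variants and its optional g2p argument are hypothetical. The argument exists only so that a stand-in can be substituted for pyopenjtalk.g2p when experimenting.

```python
def g2p_variants(text, g2p=None):
    """Return the joined phonemes, the phoneme list, and the katakana reading."""
    if g2p is None:
        # Imported lazily; falls back to the real G2P function.
        import pyopenjtalk

        g2p = pyopenjtalk.g2p
    return {
        "phonemes": g2p(text),                    # str, since join defaults to True
        "phoneme_list": g2p(text, join=False),    # list of phonemes
        "katakana": g2p(text, kana=True),         # str, katakana pronunciation
    }
```

For example, g2p_variants("こんにちは") would return a dictionary with one entry per output form.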

pyopenjtalk.extract_fullcontext(text)[source]

Extract full-context labels from text

Parameters

text (str) – Input text

Returns

List of full-context labels

Return type

list
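A common use of full-context labels is recovering the phoneme sequence. The sketch below assumes that each label begins with the usual HTS-style quinphone field p1^p2-p3+p4=p5, in which the current phoneme p3 sits between the first "-" and the following "+". The helper names current_phoneme and phonemes_from_text are hypothetical.

```python
import re

# Assumption: labels start with "p1^p2-p3+p4=p5/A:...", HTS-style.
_CURRENT_PHONEME = re.compile(r"-(.+?)\+")


def current_phoneme(label):
    """Extract the current phoneme (p3) from one full-context label."""
    match = _CURRENT_PHONEME.search(label)
    if match is None:
        raise ValueError("not an HTS-style full-context label: %r" % label)
    return match.group(1)


def phonemes_from_text(text):
    """Run extract_fullcontext and reduce each label to its current phoneme."""
    # Imported lazily so that current_phoneme works without pyopenjtalk installed.
    import pyopenjtalk

    return [current_phoneme(label) for label in pyopenjtalk.extract_fullcontext(text)]
```

Because the search stops at the first match, a later "-" inside another field (such as a negative number) does not affect the result.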

pyopenjtalk.synthesize(labels, speed=1.0, half_tone=0.0)[source]

Run OpenJTalk’s speech synthesis backend

Parameters
  • labels (list) – Full-context labels

  • speed (float) – speech speed rate. Default is 1.0.

  • half_tone (float) – additional half-tone. Default is 0.

Returns

Speech waveform (dtype: np.float64) and the sampling frequency as an int (default: 48000).

Return type

tuple of (np.ndarray, int)
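Running the frontend and the backend separately is useful when the labels need to be inspected or edited before synthesis. The following sketch chains extract_fullcontext and synthesize, assuming synthesize returns the waveform together with the sampling frequency. The names synthesize_text and duration_seconds are hypothetical helpers.

```python
def synthesize_text(text, speed=1.0, half_tone=0.0):
    """Two-stage synthesis: text to labels, then labels to waveform."""
    # Imported lazily so that duration_seconds works without pyopenjtalk installed.
    import pyopenjtalk

    labels = pyopenjtalk.extract_fullcontext(text)
    # Labels could be inspected or modified here before synthesis.
    return pyopenjtalk.synthesize(labels, speed=speed, half_tone=half_tone)


def duration_seconds(num_samples, sampling_rate):
    """Length of a waveform in seconds, given its sample count and sampling rate."""
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    return num_samples / sampling_rate
```

With waveform, sr = synthesize_text("こんにちは", speed=1.2), the value duration_seconds(len(waveform), sr) gives the length of the utterance, which is a quick way to see the effect of different speed values.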

Misc

pyopenjtalk.run_frontend(text, verbose=0)[source]

Run OpenJTalk’s text processing frontend

Parameters
  • text (str) – Unicode Japanese text.

  • verbose (int) – Verbosity. Default is 0.

Returns

Pair of 1) the NJD_print result and 2) the JPCommon_make_label result. The latter contains the full-context labels in HTS-style format.

Return type

tuple
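A short sketch of unpacking the returned pair follows. It assumes the two-element return value described above (other versions of the library may return something different). The function frontend_labels and its optional run_frontend argument are hypothetical. The argument only allows a stand-in to replace pyopenjtalk.run_frontend.

```python
def frontend_labels(text, run_frontend=None):
    """Run the text-processing frontend and keep only the full-context labels."""
    if run_frontend is None:
        # Imported lazily; falls back to the real frontend.
        import pyopenjtalk

        run_frontend = pyopenjtalk.run_frontend
    # Assumption: the result is a pair (NJD_print output, full-context labels).
    njd_result, labels = run_frontend(text)
    return labels
```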