nnmnkwii.preprocessing.meanvar

nnmnkwii.preprocessing.meanvar(dataset, lengths=None, mean_=0.0, var_=0.0, last_sample_count=0, return_last_sample_count=False)

Mean/variance computation given an iterable dataset.

The dataset can contain variable-length samples. In that case, you need to explicitly specify the lengths of all samples.

Parameters:
  • dataset (nnmnkwii.datasets.Dataset) – Dataset
  • lengths (list) – Frame lengths for each dataset sample.
  • mean_ (array or scalar) – Initial value for mean vector.
  • var_ (array or scalar) – Initial value for variance vector.
  • last_sample_count (int) – Last sample count. Default is 0. If you pass non-default mean_ and var_, you must also set last_sample_count. Typically this is the number of time frames seen so far.
  • return_last_sample_count (bool) – Return last_sample_count if True.
Returns:

Mean and variance for each dimension. If return_last_sample_count is True, last_sample_count is returned as well.

Return type:

tuple

Examples

>>> from nnmnkwii.preprocessing import meanvar
>>> from nnmnkwii.util import example_file_data_sources_for_acoustic_model
>>> from nnmnkwii.datasets import FileSourceDataset
>>> X, Y = example_file_data_sources_for_acoustic_model()
>>> X, Y = FileSourceDataset(X), FileSourceDataset(Y)
>>> lengths = [len(y) for y in Y]
>>> data_mean, data_var = meanvar(Y, lengths)
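
To illustrate how lengths, mean_, var_ and last_sample_count interact, the following is a minimal NumPy sketch of a running mean/variance update. It is not nnmnkwii's implementation: the function name running_meanvar is hypothetical, and it assumes zero-padded samples, population variance (ddof=0), and the standard pairwise combination formula for means and variances.

```python
import numpy as np


def running_meanvar(samples, lengths=None, mean=0.0, var=0.0, count=0):
    """Hypothetical helper: per-dimension mean/variance accumulated sample by sample.

    `mean`, `var` and `count` play the roles of mean_, var_ and
    last_sample_count: they carry the state of an earlier pass.
    """
    for idx, x in enumerate(samples):
        x = np.asarray(x, dtype=np.float64)
        if lengths is not None:
            # Keep only the valid frames; padded frames would bias the statistics.
            x = x[: lengths[idx]]
        n = x.shape[0]
        if n == 0:
            continue
        batch_mean = x.mean(axis=0)
        batch_var = x.var(axis=0)  # population variance (ddof=0)
        total = count + n
        delta = batch_mean - mean
        # Combine the sums of squared deviations of the old and new frames.
        m2 = var * count + batch_var * n + delta ** 2 * count * n / total
        mean = mean + delta * n / total
        var = m2 / total
        count = total
    return mean, var, count


# Two utterances with 3 feature dimensions; the first is zero-padded to 8 frames.
rng = np.random.RandomState(0)
a, b = rng.randn(5, 3), rng.randn(8, 3)
padded = [np.vstack([a, np.zeros((3, 3))]), b]
lengths = [5, 8]

# Single pass over the whole dataset.
mean, var, count = running_meanvar(padded, lengths)

# Same statistics computed in two passes, resuming from the saved state.
m1, v1, c1 = running_meanvar(padded[:1], lengths[:1])
mean2, var2, count2 = running_meanvar(padded[1:], lengths[1:], m1, v1, c1)
```

Resuming with the saved mean, variance and frame count gives the same result as a single pass, which is why a non-default mean_ and var_ only make sense together with the matching last_sample_count.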