nnmnkwii.preprocessing.meanstd

nnmnkwii.preprocessing.meanstd(dataset, lengths=None, mean_=0.0, var_=0.0, last_sample_count=0, return_last_sample_count=False)

Mean/std-deviation computation given an iterable dataset

The dataset can contain variable-length samples. In that case, you need to explicitly specify the lengths of all samples.
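To illustrate why the lengths matter, here is a minimal pure-Python sketch, assuming the samples are zero-padded to a common number of frames. The helper `frame_mean` is hypothetical and is not part of nnmnkwii; it only shows that padded frames distort the statistics unless they are excluded.

```python
def frame_mean(samples, lengths=None):
    """Per-dimension mean over frames; `lengths` trims padded frames.

    Hypothetical helper for illustration only, not nnmnkwii's code.
    """
    if lengths is None:
        lengths = [len(s) for s in samples]
    dim = len(samples[0][0])
    totals = [0.0] * dim
    count = 0
    for sample, n in zip(samples, lengths):
        for frame in sample[:n]:  # ignore padding beyond the true length
            for d in range(dim):
                totals[d] += frame[d]
            count += 1
    return [t / count for t in totals]


# Two samples, each zero-padded to 3 frames of 2 dimensions.
padded = [
    [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]],  # true length 2
    [[5.0, 6.0], [0.0, 0.0], [0.0, 0.0]],  # true length 1
]
with_padding = frame_mean(padded)             # padding drags the mean down
without_padding = frame_mean(padded, [2, 1])  # only real frames are counted
```

Without the lengths, the three padding frames are averaged in along with the three real ones.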

Parameters
  • dataset (nnmnkwii.datasets.Dataset) – Dataset

  • lengths (list) – Frame lengths for each dataset sample.

  • mean_ (array or scalar) – Initial value for mean vector.

  • var_ (array or scalar) – Initial value for variance vector.

  • last_sample_count (int) – Last sample count. Default is 0. If you set non-default mean_ and var_, you also need to set last_sample_count. Typically this is the number of time frames seen so far.

  • return_last_sample_count (bool) – Return last_sample_count if True.

Returns

Mean and standard deviation for each dimension. If return_last_sample_count is True, returns last_sample_count as well.

Return type

tuple

Examples

>>> from nnmnkwii.preprocessing import meanstd
>>> from nnmnkwii.util import example_file_data_sources_for_acoustic_model
>>> from nnmnkwii.datasets import FileSourceDataset
>>> X, Y = example_file_data_sources_for_acoustic_model()
>>> X, Y = FileSourceDataset(X), FileSourceDataset(Y)
>>> lengths = [len(y) for y in Y]
>>> data_mean, data_std = meanstd(Y, lengths)
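
The mean_, var_ and last_sample_count parameters suggest that statistics can be carried over from one call to the next. The following is a minimal sketch of that idea for scalar frames, using the standard formula for merging two sets of mean/variance statistics. The helper `update_mean_var` is hypothetical and is not claimed to match nnmnkwii's internal implementation.

```python
import math


def update_mean_var(frames, mean=0.0, var=0.0, count=0):
    """Merge a batch of scalar frames into running (mean, var, count).

    Hypothetical 1-D helper, not nnmnkwii's implementation. `var` is the
    population variance (divides by N).
    """
    n_new = len(frames)
    new_mean = sum(frames) / n_new
    new_var = sum((x - new_mean) ** 2 for x in frames) / n_new

    total = count + n_new
    delta = new_mean - mean
    merged_mean = mean + delta * n_new / total
    # Combine the sums of squared deviations of both parts.
    m2 = var * count + new_var * n_new + delta ** 2 * count * n_new / total
    return merged_mean, m2 / total, total


first, second = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]

# Pass the state from the first chunk as the initial values for the second.
mean, var, count = update_mean_var(first)
mean, var, count = update_mean_var(second, mean, var, count)
std = math.sqrt(var)

# One pass over all frames, for comparison.
full_mean, full_var, full_count = update_mean_var(first + second)
```

Processing the data in two chunks gives the same mean and variance as a single pass, provided the frame count from the first chunk is passed along with its mean and variance.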