Downsampling and interpolation

Functionality for generating a set of indices at which a waveform can be sampled while still being reconstructed accurately.

The default implementation is a greedy one, as defined in GreedyDownsamplingTraining.

To provide an alternate method, just subclass DownsamplingTraining.

class DownsamplingTraining(dataset: mlgw_bns.dataset_generation.Dataset, tol: float = 1e-05)[source]

Selection of the downsampling indices.

Parameters
  • dataset (Dataset) – Dataset used to generate the training waveforms for the downsampling.

  • degree (int) – degree for the interpolation. Defaults to 3.

  • tol (float, optional) – Tolerance for the interpolation error. Defaults to 1e-5.

    Default: 1e-05

classmethod resample(x_ds: numpy.ndarray, new_x: numpy.ndarray, y_ds: numpy.ndarray) numpy.ndarray[source]

Resample a function \(y(x)\) from its values at certain points \(y_{ds} = y(x_{ds})\).

Parameters
  • x_ds (np.ndarray) – Old, sparse \(x\) values.

  • new_x (np.ndarray) – New \(x\) coordinates at which to evaluate the function.

  • y_ds (np.ndarray) – Old, sparse \(y\) values.

Returns

new_y – Function evaluated at the coordinates new_x.

Return type

np.ndarray
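As a rough illustration of what resampling from sparse points means, the sketch below rebuilds a function on a new grid from its values at a few points. It is not the library's implementation: the documented method takes NumPy arrays and the class documents an interpolation degree that defaults to 3, whereas this stand-in uses plain lists and piecewise-linear interpolation to stay dependency-free. The name resample_linear is hypothetical.

```python
from bisect import bisect_right


def resample_linear(x_ds, new_x, y_ds):
    """Piecewise-linear stand-in for resampling y(x) from sparse samples.

    Hypothetical helper, not part of mlgw_bns. Assumes x_ds is strictly
    increasing and every point of new_x lies inside [x_ds[0], x_ds[-1]].
    """
    new_y = []
    for x in new_x:
        # Index of the right-hand neighbour, clamped so that the last
        # grid point falls inside the final interval.
        hi = min(bisect_right(x_ds, x), len(x_ds) - 1)
        lo = hi - 1
        weight = (x - x_ds[lo]) / (x_ds[hi] - x_ds[lo])
        new_y.append(y_ds[lo] + weight * (y_ds[hi] - y_ds[lo]))
    return new_y


x_ds = [0.0, 1.0, 2.0, 4.0]            # old, sparse x values
y_ds = [2.0 * x + 1.0 for x in x_ds]   # samples of y(x) = 2x + 1
new_x = [0.0, 0.5, 1.5, 3.0, 4.0]      # new coordinates
new_y = resample_linear(x_ds, new_x, y_ds)
print(new_y)  # [1.0, 2.0, 4.0, 7.0, 9.0]
```

Because the sampled function is itself linear, the reconstruction here is exact; for a curved function the difference between the true and resampled values is the reconstruction error that the tolerance tol is meant to bound.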

abstract train(training_dataset_size: int) mlgw_bns.data_management.DownsamplingIndices[source]

Calculate the downsampling with a generic algorithm, training on a dataset of a given size.

validate_downsampling(training_dataset_size: int, validating_dataset_size: int) tuple[list[float], list[float]][source]

Check that the downsampling is working by looking at the reconstruction error on a fresh dataset.

Parameters
  • training_dataset_size (int) – How many waveforms to train the downsampling on.

  • validating_dataset_size (int) – How many waveforms to validate on.

Returns

Amplitude and phase validation errors, reported as \(L_\infty\) errors: the maximum absolute value of the difference between the true and reconstructed values.

Return type

tuple[list[float], list[float]]
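For reference, an \(L_\infty\) error between a true curve and its reconstruction can be computed as in the minimal sketch below. The function name linf_error and the sample values are illustrative assumptions, not part of the library.

```python
def linf_error(y_true, y_reconstructed):
    """Largest absolute pointwise difference between two equal-length sequences."""
    return max(abs(a - b) for a, b in zip(y_true, y_reconstructed))


y_true = [0.0, 1.0, 4.0, 9.0]
y_reconstructed = [0.0, 1.5, 4.0, 8.0]
error = linf_error(y_true, y_reconstructed)
print(error)  # 1.0
```

One such number per validation waveform, computed separately for amplitude and phase, would give two lists of floats of the shape described above.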

class GreedyDownsamplingTraining(dataset: mlgw_bns.dataset_generation.Dataset, tol: float = 1e-05)[source]

find_indices(x_train: np.ndarray, ys_train: list[np.ndarray], seeds_number: int = 4) list[int][source]

Greedily downsample y(x) by making sure that the reconstruction error of each of the ys (instances of y(x)) is smaller than tol.

Parameters
  • x_train (np.ndarray) – x array

  • ys_train (list[np.ndarray]) – A list of y arrays.

  • seeds_number (int, optional) – Number of “seed” indices, placed equally spaced along the array. Defaults to 4. Note: this should always be larger than the degree of the interpolation.

    Default: 4

Returns

indices – indices which make the interpolation errors smaller than the tolerance on the training dataset.

Return type

np.ndarray
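The greedy idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: it uses plain lists and piecewise-linear interpolation instead of the documented interpolation degree, it adds at most one index per training curve per pass, and it assumes seeds_number is at least 2. The names interpolate and greedy_indices are hypothetical.

```python
from bisect import bisect_right


def interpolate(x_ds, y_ds, x):
    """Piecewise-linear interpolation at x (assumed to lie inside the x_ds range)."""
    hi = min(bisect_right(x_ds, x), len(x_ds) - 1)
    lo = hi - 1
    weight = (x - x_ds[lo]) / (x_ds[hi] - x_ds[lo])
    return y_ds[lo] + weight * (y_ds[hi] - y_ds[lo])


def greedy_indices(x, ys, tol, seeds_number=4):
    """Grow a set of indices until every curve in ys is reconstructed within tol."""
    n = len(x)
    # Equally spaced seed indices, including both endpoints.
    chosen = {round(k * (n - 1) / (seeds_number - 1)) for k in range(seeds_number)}
    while True:
        indices = sorted(chosen)
        x_ds = [x[i] for i in indices]
        added = False
        for y in ys:
            y_ds = [y[i] for i in indices]
            # Reconstruction error at every grid point not selected yet.
            errors = {
                i: abs(y[i] - interpolate(x_ds, y_ds, x[i]))
                for i in range(n)
                if i not in chosen
            }
            if not errors:
                continue
            worst = max(errors, key=errors.get)
            if errors[worst] > tol:
                chosen.add(worst)
                added = True
        if not added:
            return indices


x = [i / 100 for i in range(101)]
ys = [[xi ** 2 for xi in x], [xi ** 3 for xi in x]]
tol = 1e-3
indices = greedy_indices(x, ys, tol)
print(len(indices), "of", len(x), "points kept")
```

Every pass either adds a point that was not selected before or stops, so the loop ends after at most len(x) passes. The returned indices are shared by all training curves, which is what allows a single downsampling to be reused for every waveform.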

indices_error(ytrue: np.ndarray, ypred: np.ndarray, current_indices: SortedList) tuple[list[int], list[float]][source]

Find new indices to add to the sampling.

Parameters
  • ytrue (np.ndarray) – True values of y.

  • ypred (np.ndarray) – Predicted values of y through interpolation. The algorithm minimizes the difference abs(ytrue - ypred).

  • current_indices (SortedList) – Indices to which the algorithm should add.

  • tol (float) – Tolerance for the reconstruction error — new indices are not added if the reconstruction error is below this value.

Returns

  • new_indices (list[int]) – Indices to insert among the current ones.

  • errors (list[float]) – Errors (abs(ytrue - ypred)) at the points where the algorithm inserted the new indices.
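One plausible reading of this step, offered only as a sketch, is to look at each gap between consecutive current indices and propose the point with the largest error in that gap whenever it exceeds the tolerance. The per-gap strategy, the function name new_indices_per_gap, the explicit tol argument and the use of a plain sorted list in place of a SortedList are all assumptions made for illustration.

```python
def new_indices_per_gap(ytrue, ypred, current_indices, tol):
    """Propose at most one new index per gap between consecutive current indices.

    Hypothetical helper: current_indices must be sorted in increasing order.
    """
    new_indices, errors = [], []
    for left, right in zip(current_indices, current_indices[1:]):
        if right - left < 2:
            continue  # no unselected point between these two indices
        # Point with the largest reconstruction error strictly inside the gap.
        worst = max(range(left + 1, right), key=lambda i: abs(ytrue[i] - ypred[i]))
        error = abs(ytrue[worst] - ypred[worst])
        if error > tol:
            new_indices.append(worst)
            errors.append(error)
    return new_indices, errors


# y = x**3 on x = 0..6, linearly interpolated from the points at indices 0, 3 and 6.
ytrue = [0.0, 1.0, 8.0, 27.0, 64.0, 125.0, 216.0]
ypred = [0.0, 9.0, 18.0, 27.0, 90.0, 153.0, 216.0]
new_indices, errors = new_indices_per_gap(ytrue, ypred, [0, 3, 6], tol=1.0)
print(new_indices, errors)  # [2, 5] [10.0, 28.0]
```

Raising the tolerance to 15.0 in this example would leave only index 5, since the largest error in the first gap (10.0) would then be acceptable.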

train(training_dataset_size: int) mlgw_bns.data_management.DownsamplingIndices[source]

Compute a close-to-optimal set of indices at which to sample waveforms, so that the reconstruction error stays below a certain tolerance.

Parameters

training_dataset_size (int) – Number of waveforms to generate and with which to train.

Returns

Indices for amplitude and phase, respectively.

Return type

tuple[list[int], list[int]]