Downsampling and interpolation

Tools for selecting a set of sample indices from which a waveform can be accurately reconstructed by interpolation.

The default implementation is a greedy one, as defined in GreedyDownsamplingTraining.

To provide an alternative method, subclass DownsamplingTraining and implement its abstract train method.

class DownsamplingTraining(dataset: mlgw_bns.dataset_generation.Dataset, tol: float = 1e-05)[source]

Selection of the downsampling indices.

Parameters

dataset (Dataset) – dataset to which to refer for the generation of training waveforms for the downsampling.
degree (int) – degree for the interpolation. Defaults to 3.
tol (float, optional) – Tolerance for the interpolation error. Defaults to 1e-5.

Resample a function \(y(x)\) from its values at certain points \(y_{ds} = y(x_{ds})\).

Parameters

x_ds (np.ndarray) – Old, sparse \(x\) values.
new_x (np.ndarray) – New \(x\) coordinates at which to evaluate the function.
y_ds (np.ndarray) – Old, sparse \(y\) values.

Returns

new_y – Function evaluated at the coordinates new_x.

Return type

np.ndarray
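
The resampling step described above can be illustrated with a minimal sketch. It assumes a spline of the documented default degree 3, built with SciPy's make_interp_spline; the function name resample_sketch and the choice of SciPy routine are assumptions for illustration, not necessarily what the library does internally.

```python
import numpy as np
from scipy.interpolate import make_interp_spline


def resample_sketch(x_ds, new_x, y_ds, degree=3):
    """Evaluate a spline through the sparse points (x_ds, y_ds) at new_x.

    Hypothetical helper: the spline type and degree are assumptions.
    """
    spline = make_interp_spline(x_ds, y_ds, k=degree)
    return spline(new_x)


# Dense reference function, sampled at 20 equally spaced indices.
x = np.linspace(0.0, 2.0 * np.pi, 1000)
y = np.sin(x)
sparse = np.linspace(0, len(x) - 1, 20).astype(int)

# Reconstruct the function on the full grid from the sparse samples.
new_y = resample_sketch(x[sparse], x, y[sparse])
max_error = np.max(np.abs(new_y - y))
```

Here max_error measures how well the 20 retained points represent the original 1000-point curve.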

abstract train(training_dataset_size: int) → mlgw_bns.data_management.DownsamplingIndices[source]

Calculate the downsampling indices with a generic algorithm, training on a dataset of the given size.

validate_downsampling(training_dataset_size: int, validating_dataset_size: int) → tuple[list[float], list[float]][source]

Check that the downsampling is working by looking at the reconstruction error on a fresh dataset.

Parameters

training_dataset_size (int) – How many waveforms to train the downsampling on.
validating_dataset_size (int) – How many waveforms to validate on.

Returns

Amplitude and phase validation errors; these are reported as \(L_\infty\) errors: the maximum absolute value of the difference.

Return type

tuple[list[float], list[float]]
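
As an illustration of the \(L_\infty\) error reported here, the following sketch computes one error per waveform as the maximum absolute difference between a true curve and its reconstruction. The function name linf_errors and the toy arrays are hypothetical; the real method generates and reconstructs waveforms from the dataset.

```python
import numpy as np


def linf_errors(true_curves, reconstructed_curves):
    """Return one L-infinity error per curve: max |true - reconstructed|."""
    return [
        float(np.max(np.abs(true - rec)))
        for true, rec in zip(true_curves, reconstructed_curves)
    ]


# Toy data standing in for validation waveforms and their reconstructions.
true_curves = [np.array([1.0, 2.0, 3.0]), np.array([0.0, 0.5, 1.0])]
reconstructed = [np.array([1.0, 2.5, 3.0]), np.array([0.1, 0.5, 0.75])]

errors = linf_errors(true_curves, reconstructed)
# errors == [0.5, 0.25]
```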

class GreedyDownsamplingTraining(dataset: mlgw_bns.dataset_generation.Dataset, tol: float = 1e-05)[source]

find_indices(x_train: np.ndarray, ys_train: list[np.ndarray], seeds_number: int = 4) → list[int][source]

Greedily downsample y(x) by making sure that the reconstruction error of each of the ys (instances of y(x)) is smaller than tol.

Parameters

x_train (np.ndarray) – x array
ys_train (list[np.ndarray]) – a list of y arrays
seeds_number (int, optional) – number of “seed” indices, placed equally spaced along the array. Defaults to 4. Note: this should always be larger than the degree of the interpolation.

Returns

indices – indices which make the interpolation errors smaller than the tolerance on the training dataset.

Return type

list[int]
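
The greedy idea can be sketched as follows: start from equally spaced seed indices, then repeatedly add the worst-reconstructed point of any training curve until every curve is reconstructed within the tolerance. This is a simplified illustration, not the library's implementation: the name greedy_indices_sketch is hypothetical, linear interpolation (np.interp) stands in for the higher-degree interpolation, and one index is added per curve per pass.

```python
import numpy as np


def greedy_indices_sketch(x, ys, tol=1e-5, seeds_number=4):
    """Add sample points until every curve in ys is reconstructed within tol.

    Assumption: linear interpolation replaces the spline used in practice.
    """
    # Seeds: equally spaced indices, including both endpoints.
    seeds = np.linspace(0, len(x) - 1, seeds_number, dtype=int)
    indices = sorted(set(seeds.tolist()))
    while True:
        added = False
        for y in ys:
            reconstruction = np.interp(x, x[indices], y[indices])
            errors = np.abs(y - reconstruction)
            worst = int(np.argmax(errors))
            if errors[worst] > tol:
                # The error is zero at indices already chosen,
                # so `worst` is always a new index.
                indices = sorted(set(indices) | {worst})
                added = True
        if not added:
            return indices


x = np.linspace(0.0, 1.0, 501)
ys = [np.sin(2 * np.pi * x), x**2]
indices = greedy_indices_sketch(x, ys, tol=1e-3)
```

The returned indices are shared by all training curves, which is why the loop only stops once a full pass over ys adds nothing.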

indices_error(ytrue: np.ndarray, ypred: np.ndarray, current_indices: SortedList) → tuple[list[int], list[float]][source]

Find new indices to add to the sampling.

Parameters

ytrue (np.ndarray) – True values of y.
ypred (np.ndarray) – Predicted values of y through interpolation. The algorithm minimizes the difference abs(ytrue - ypred).
current_indices (SortedList) – Indices to which the algorithm should add.
tol (float) – Tolerance for the reconstruction error; new indices are not added where the reconstruction error is below this value.

Returns

new_indices (list[int]) – Indices to insert among the current ones.
errors (list[float]) – Errors (abs(ytrue - ypred)) at the points where the algorithm inserted the new indices.
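
One possible selection rule with this interface is sketched below: in each gap between consecutive current indices, propose the point with the largest error, provided that error exceeds the tolerance. The rule itself, the name new_indices_sketch, the explicit tol argument, and the use of a plain sorted list in place of a SortedList are all assumptions made for illustration.

```python
import numpy as np


def new_indices_sketch(ytrue, ypred, current_indices, tol=1e-5):
    """Propose at most one new index per gap between current indices."""
    errors = np.abs(ytrue - ypred)
    new_indices, new_errors = [], []
    for left, right in zip(current_indices[:-1], current_indices[1:]):
        if right - left < 2:
            continue  # no interior points in this gap
        interior = errors[left + 1:right]
        offset = int(np.argmax(interior))
        if interior[offset] > tol:
            new_indices.append(left + 1 + offset)
            new_errors.append(float(interior[offset]))
    return new_indices, new_errors


ytrue = np.array([0.0, 0.2, 0.0, 0.0, 0.5, 0.0])
ypred = np.zeros(6)
new_indices, new_errors = new_indices_sketch(ytrue, ypred, [0, 2, 5], tol=0.1)
# new_indices == [1, 4], new_errors == [0.2, 0.5]
```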

Compute a close-to-optimal set of indices at which to sample waveforms, so that the reconstruction error stays below a certain tolerance.

Parameters

training_dataset_size (int) – Number of waveforms to generate and with which to train.

Returns

Indices for amplitude and phase, respectively.

Return type

tuple[list[int], list[int]]