Principal Component Analysis

Functionality for the PCA-decomposition of arbitrary data.

The classes defined here are meant to be lightweight: they do not store the data, instead deferring its management to the higher-level Model class.

class PrincipalComponentTraining(dataset: mlgw_bns.dataset_generation.Dataset, downsampling_indices: mlgw_bns.data_management.DownsamplingIndices, number_of_components: int)

Training and usage of a Principal Component Analysis model.

Parameters
  • dataset (Dataset) – Used to generate the data to be used for training.

  • downsampling_indices (DownsamplingIndices)

  • number_of_components (int) – Number of components to keep when reducing the dimensionality of the data.

class PrincipalComponentAnalysisModel(number_of_components: int)

fit(data: numpy.ndarray) → mlgw_bns.data_management.PrincipalComponentData

Fit the PCA model to this dataset.

Parameters

data (np.ndarray) – Data to fit. Does not need to have zero mean. Should have shape (number_of_datapoints, number_of_dimensions).

Returns

Data describing the trained PCA model.

Return type

PrincipalComponentData
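
As an illustration of what fitting a PCA model involves, here is a minimal NumPy sketch. It is not the mlgw_bns implementation: the function name `fit_pca_sketch` and the returned dictionary keys are hypothetical, and the fields actually stored in PrincipalComponentData may differ (for example in ordering or normalisation).

```python
import numpy as np


def fit_pca_sketch(data: np.ndarray, number_of_components: int) -> dict:
    """Hypothetical stand-in for a PCA fit; not the mlgw_bns code."""
    # Centre the data: the input does not need to have zero mean.
    mean = data.mean(axis=0)
    centered = data - mean

    # Covariance across dimensions,
    # shape (number_of_dimensions, number_of_dimensions).
    covariance = np.cov(centered, rowvar=False)

    # eigh returns eigenvalues in ascending order, so pick the
    # largest number_of_components in descending order.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:number_of_components]

    return {
        "mean": mean,
        "eigenvalues": eigenvalues[order],
        "eigenvectors": eigenvectors[:, order],
    }


rng = np.random.default_rng(0)
# Shape (number_of_datapoints, number_of_dimensions), deliberately not zero-mean.
data = rng.normal(size=(200, 5)) + 3.0
pca = fit_pca_sketch(data, number_of_components=2)
```

In this sketch the columns of `pca["eigenvectors"]` are orthonormal directions of decreasing variance, and the stored mean is what lets the input be non-centred.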

static reconstruct_data(reduced_data: numpy.ndarray, pca_data: mlgw_bns.data_management.PrincipalComponentData) → numpy.ndarray

Reconstruct the full-dimensional data from its principal-component representation.

Parameters
  • reduced_data (np.ndarray) – With shape (number_of_points, number_of_components).

  • pca_data (PrincipalComponentData) – To use in the reconstruction.

Returns

reconstructed_data – With shape (number_of_points, number_of_dimensions).

Return type

np.ndarray
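
The shape bookkeeping of a reconstruction can be sketched with plain NumPy. The helper `reconstruct_sketch` and its `mean` and `eigenvectors` arguments are assumptions made for illustration; they stand in for whatever PrincipalComponentData holds and are not the library's API.

```python
import numpy as np


def reconstruct_sketch(reduced_data, mean, eigenvectors):
    """Hypothetical inverse of a PCA projection; not the mlgw_bns code."""
    # (number_of_points, number_of_components)
    #   @ (number_of_components, number_of_dimensions)
    # then add back the mean that was removed before projecting.
    return reduced_data @ eigenvectors.T + mean


rng = np.random.default_rng(1)
data = rng.normal(size=(50, 4))
mean = data.mean(axis=0)

# The right singular vectors of the centred data are the principal directions.
_, _, vt = np.linalg.svd(data - mean, full_matrices=False)
eigenvectors = vt.T  # all 4 components kept, shape (4, 4)

reduced = (data - mean) @ eigenvectors
reconstructed = reconstruct_sketch(reduced, mean, eigenvectors)
```

Because every component is kept here, the round trip recovers the input up to floating-point error. Keeping fewer components would instead give an approximation.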

static reduce_data(data: numpy.ndarray, pca_data: mlgw_bns.data_management.PrincipalComponentData) → numpy.ndarray

Reduce a dataset to its principal-component representation.

Parameters
  • data (np.ndarray) – With shape (number_of_points, number_of_dimensions).

  • pca_data (PrincipalComponentData) – To use in the reduction.

Returns

reduced_data – With shape (number_of_points, number_of_components).

Return type

np.ndarray
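
A reduction of this kind amounts to centring the data and projecting it onto the retained principal directions. The following is a minimal sketch under that assumption. `reduce_sketch`, `mean` and `eigenvectors` are hypothetical names, and the library may apply additional steps such as rescaling.

```python
import numpy as np


def reduce_sketch(data, mean, eigenvectors):
    """Hypothetical PCA projection; not the mlgw_bns code."""
    # (number_of_points, number_of_dimensions)
    #   @ (number_of_dimensions, number_of_components)
    return (data - mean) @ eigenvectors


rng = np.random.default_rng(2)
data = rng.normal(size=(100, 6))
mean = data.mean(axis=0)

# Principal directions from the SVD of the centred data;
# keep only the first 3 of 6.
_, _, vt = np.linalg.svd(data - mean, full_matrices=False)
eigenvectors = vt[:3].T  # shape (6, 3)

reduced = reduce_sketch(data, mean, eigenvectors)
```

The result has one row per input point and one column per retained component, with the columns ordered by decreasing variance.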