Hyperparameter Optimization

class HyperparameterOptimization(model: mlgw_bns.model.Model, optimization_seed: int = 42, hyper_validation_fraction: float = 0.1, study: Optional[optuna.study.study.Study] = None)[source]

Manager for the optimization of the hyperparameters corresponding to a certain Model.

The optimization performed is over two variables: the reconstruction accuracy and the training time.

Reconstruction accuracy is quantified by the average square error in the reconstruction of the residuals; specifically, the average is taken over both amplitude and phase residuals, and the value returned by the objective() function is the base-10 logarithm of this average.
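As a rough illustration of the accuracy metric described above, the following sketch computes the base-10 logarithm of the mean square error over amplitude and phase residuals. The function name and the choice to pool amplitude and phase errors with equal weight are assumptions made for illustration; the library may combine them differently.

```python
import math


def log_mean_square_error(true_amp, pred_amp, true_phase, pred_phase):
    """Base-10 logarithm of the mean square error over all residuals.

    Hypothetical helper: amplitude and phase errors are pooled together
    with equal weight, which is an assumption of this sketch.
    """
    true_values = list(true_amp) + list(true_phase)
    predicted_values = list(pred_amp) + list(pred_phase)
    squared_errors = [(t - p) ** 2 for t, p in zip(true_values, predicted_values)]
    return math.log10(sum(squared_errors) / len(squared_errors))
```

A value of -2 then corresponds to a mean square error of 0.01, and lower values mean a more accurate reconstruction.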

Training time accounts for both the time required to train the neural network and the estimated time required to generate the waveforms needed for the training. This can vary, since one of the hyperparameters varied in the training is the number of waveforms in the training dataset.
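The way the two contributions to the training time might add up can be sketched as follows. This is a minimal illustration, not the library's code: the function name is hypothetical, the times are assumed to be in seconds, and the default of 0.1 mirrors the documented default of the waveform_gen_time class attribute.

```python
def effective_time(training_time, n_waveforms, waveform_gen_time=0.1):
    """Network training time plus the estimated waveform generation time.

    Hypothetical helper; units are assumed to be seconds throughout.
    """
    return training_time + n_waveforms * waveform_gen_time
```

With these assumptions, doubling the number of training waveforms raises the estimated cost even if the network itself trains just as quickly.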

Including the number of training waveforms among the hyperparameters is convenient: with more waveforms (a finer sampling of the waveform space), the optimal network might be different.

However, always training networks with the largest number of waveforms we might eventually want to use is expensive. Therefore, we vary the number of training waveforms during the optimization, so that the optuna study can learn which region of parameter space is best to explore, and then extend that knowledge to the region of parameter space with more training waveforms.

The cost term is needed because using more waveforms typically yields a better fit, so optimizing for accuracy alone would always favor the largest training dataset. We therefore perform multi-objective optimization: see, for example, Multiobjective tree-structured parzen estimator for computationally expensive optimization problems by Ozaki et al.

To visualize the Pareto front of the optimization, one can use the plot_pareto() method after an optimization run.
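For readers unfamiliar with the term, the Pareto front is the set of trials that no other trial beats on both objectives at once. The following standalone sketch (a hypothetical helper, independent of optuna) selects that set from a list of (log error, time) pairs, with both values to be minimized.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Each point is a (log_error, time) pair, and both entries are minimized.
    A point is dominated if a different point is at least as good on both
    objectives.
    """
    front = []
    for candidate in points:
        dominated = any(
            other != candidate
            and other[0] <= candidate[0]
            and other[1] <= candidate[1]
            for other in points
        )
        if not dominated:
            front.append(candidate)
    return front
```

Every point on the front represents a different trade-off: moving along it, accuracy can only improve at the price of a longer training time.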

Parameters

model (Model) – Reference model for the optimization.
optimization_seed (int, optional) – Seed for the random number generator used in the optimization. Defaults to 42.
Default: 42
hyper_validation_fraction (float, optional) – Fraction of the data to be used in validation during the optimization.
Default: 0.1
study (optuna.Study, optional) – Pre-made study to use. Defaults to None; if none is provided, the initializer looks for a file with the correct name in the local directory and uses it, creating a new study if no such file is found.
Default: None
Attributes (Class) –
waveform_gen_time (float) – Reference generation time for a single waveform, to be used in the computation of the effective time in the objective(). Defaults to 0.1.
save_every_n_minutes (float) – When running the optimization through optimize(), the interval in minutes between saves of the study. Defaults to 30.

best_hyperparameters(training_number: Optional[int] = None) → mlgw_bns.neural_network.Hyperparameters[source]

Return the best hyperparameters found using fewer than a certain number of training waveforms.

Parameters: training_number (int, optional) – Number of training waveforms; by default None, in which case the hyperparameters for as many waveforms as the current model has available are returned.
Default: None
Return type: Hyperparameters
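One plausible reading of this selection is sketched below: among the trials that fit within the waveform budget, take the one with the lowest error. The record layout (dictionaries with "n_train" and "log_error" keys), the function name, and the inclusive comparison are all assumptions for illustration; the actual method works on the optuna study and may apply a different rule.

```python
def pick_best_trial(trials, training_number):
    """Return the most accurate trial that fits within the waveform budget.

    Hypothetical helper: each trial is a dict with an "n_train" entry
    (training waveforms used) and a "log_error" entry (lower is better).
    """
    eligible = [trial for trial in trials if trial["n_train"] <= training_number]
    if not eligible:
        raise ValueError("no trial fits within the given number of waveforms")
    return min(eligible, key=lambda trial: trial["log_error"])
```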

objective(trial: optuna.Trial) → tuple[float, float][source]

Objective function to be used when optimizing the hyperparameters for the neural network and PCA.

Parameters

trial (optuna.Trial) – This object is required to generate the parameters according to the rules of the optuna optimizer used.

Returns

Base-10 logarithm of the accuracy and training time, respectively.

The accuracy is defined as the average of the square differences between the true and estimated residuals.

The training time includes both the training of the network and, roughly, the generation of the waveforms used for training.

Return type

tuple[float, float]

optimize(timeout_min: float = 5.0) → None[source]

Run the optimization for a given number of minutes; this is a wrapper around optuna.Study.optimize().

Parameters: timeout_min (float, optional) – Number of minutes to run for, by default 5
Default: 5.0

optimize_and_save(timeout_hr: float = 1.0) → None[source]

Run the optimization, saving the study periodically; this is a wrapper around optuna.Study.optimize(). This command can take an arbitrary amount of time; therefore, its timeout is provided as a parameter. Typically, good results can be achieved within a few hours.

The interval between saves is determined by the class attribute save_every_n_minutes.

Parameters: timeout_hr (float, optional) – Number of hours to run for, by default 1.
Default: 1.0
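The interplay between the total timeout and the save interval can be sketched as a loop that runs the optimization in chunks and saves after each one. This is an assumed structure, not the library's code: run_for_minutes and save are hypothetical callables standing in for the optimization step and the saving step.

```python
def run_in_chunks(run_for_minutes, save, timeout_hr=1.0, save_every_n_minutes=30.0):
    """Run for timeout_hr hours in total, saving after each chunk.

    Hypothetical sketch: run_for_minutes(minutes) performs the optimization
    for the given duration, and save() stores the current state.
    """
    remaining_minutes = timeout_hr * 60.0
    while remaining_minutes > 0:
        # The last chunk is shortened so the total never exceeds the timeout.
        chunk = min(save_every_n_minutes, remaining_minutes)
        run_for_minutes(chunk)
        save()
        remaining_minutes -= chunk
```

With this structure, an interruption loses at most one chunk of work, which is the usual reason for saving at a fixed interval.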

plot_pareto() → None[source]: Plot the Pareto front of the bivariate optimization, making use of the function optuna.visualization.plot_pareto_front().

static residuals_difference(residuals_1: mlgw_bns.data_management.Residuals, residuals_2: mlgw_bns.data_management.Residuals) → float[source]

Compare two sets of Residuals.

Parameters

residuals_1 (Residuals) – First set of residuals to be compared.
residuals_2 (Residuals) – Second set of residuals to be compared.

Returns

The average square-difference between the two residual sets.

Return type

float

save_best_trials_to_file(filename: str = 'best_trials') → None[source]

Save the best trials obtained so far in the optimization to the file “filename”.pkl.

The best trials are obtained as self.study.best_trials.

Parameters: filename (str, optional) – Filename to save to, by default “best_trials”
Default: 'best_trials'
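Judging by the .pkl extension, the trials are presumably stored with Python's pickle module; that is an assumption of the sketch below, which shows how such a file could be written and read back. The helper names and the directory argument are hypothetical and not part of the documented interface.

```python
import pickle
from pathlib import Path


def save_best_trials(best_trials, filename="best_trials", directory="."):
    """Pickle best_trials to "<filename>.pkl" and return the resulting path."""
    path = Path(directory) / f"{filename}.pkl"
    with open(path, "wb") as file:
        pickle.dump(best_trials, file)
    return path


def load_best_trials(path):
    """Read back an object previously written by save_best_trials."""
    with open(path, "rb") as file:
        return pickle.load(file)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.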

property study_filename: str: Name of the file to save the study to.

property training_data_number: int: Number of available training waveforms.