Hyperparameter Optimization
- class HyperparameterOptimization(model: mlgw_bns.model.Model, optimization_seed: int = 42, hyper_validation_fraction: float = 0.1, study: Optional[optuna.study.study.Study] = None)
Manager for the optimization of the hyperparameters corresponding to a certain Model.
The optimization is performed over two objectives: the reconstruction accuracy and the training time.
Reconstruction accuracy is quantified by the average square error in the reconstruction of the residuals; specifically, the average is taken over both amplitude and phase residuals, and the value returned by the objective() function is the base-10 logarithm of this average.
Training time accounts for both the time required to train the neural network and the estimated time required to generate the waveforms needed for the training. This can vary, since one of the hyperparameters varied during the optimization is the number of waveforms in the training dataset.
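The two objective values described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function names `log_mean_square_error` and `effective_time` are hypothetical, residuals are represented as plain lists of floats, and the linear scaling of the generation cost with the number of training waveforms is an assumption made for this example.

```python
import math


def log_mean_square_error(amp_true, amp_pred, phase_true, phase_pred):
    """Base-10 log of the mean square error over amplitude and phase residuals.

    Hypothetical helper: all four arguments are flat sequences of floats.
    """
    true_values = list(amp_true) + list(phase_true)
    predicted = list(amp_pred) + list(phase_pred)
    squared_errors = [(t - p) ** 2 for t, p in zip(true_values, predicted)]
    return math.log10(sum(squared_errors) / len(squared_errors))


def effective_time(training_seconds, n_waveforms, waveform_gen_time=0.1):
    """Training time plus an estimate of the waveform generation time.

    Assumption: the generation cost is n_waveforms * waveform_gen_time.
    """
    return training_seconds + n_waveforms * waveform_gen_time


# Toy residuals: every prediction is off by 0.1, so the mean square error is 0.01.
accuracy = log_mean_square_error([1.0, 2.0], [1.1, 1.9], [0.5, 0.5], [0.4, 0.6])
cost = effective_time(training_seconds=12.0, n_waveforms=100)
```

Lower values are better for both quantities: a more negative logarithm means a smaller reconstruction error, and a smaller effective time means a cheaper training run.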
Including the number of training waveforms among the hyperparameters is convenient, since having more waveforms (a finer sampling of the waveform space) means the optimal network might be different.
However, only ever training networks with as many waveforms as we might wish to use in the end gets expensive. Therefore, we vary the number of training waveforms in the optimization, so that the optuna study can learn which region of parameter space is best to explore, and then extend that knowledge to the region of parameter space with more training waveforms.
Including the training-time cost term is necessary because using more waveforms typically yields a better fit, so optimizing for accuracy alone would tend to favor the largest dataset. We therefore perform a multi-objective optimization: see, for example, Multiobjective tree-structured parzen estimator for computationally expensive optimization problems by Ozaki et al.
To visualize the Pareto front of the optimization, one can use the plot_pareto() method after an optimization run.
- Parameters
model (Model) – Reference model for the optimization.
optimization_seed (int, optional) – Seed for the random number generator used in the optimization. Defaults to 42.
Default: 42
hyper_validation_fraction (float, optional) – Fraction of the data to be used for validation during the optimization.
Default: 0.1
study (optuna.Study, optional) – Pre-made study to use. Defaults to None; if not provided, the initializer looks for a file with the correct name in the local directory and uses it, creating a new study if it cannot find one.
Default: None
- Class attributes
waveform_gen_time (float) – Reference generation time for a single waveform, to be used in the computation of the effective time in objective(). Defaults to 0.1.
save_every_n_minutes (float) – When running the optimization through optimize(), how often (in minutes) to save the study. Defaults to 30.
- best_hyperparameters(training_number: Optional[int] = None) → mlgw_bns.neural_network.Hyperparameters
Return the best hyperparameters found using fewer than a certain number of training waveforms.
- Parameters
training_number (int, optional) – Number of training waveforms; by default None, in which case the hyperparameters are returned for as many waveforms as the current model has available.
Default: None
- Return type
Hyperparameters
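The selection described for best_hyperparameters() can be illustrated with a small sketch. It is not the library's code: the `best_trial` function, the dictionary keys `n_train` and `log_error`, the choice of "lowest error" as the meaning of "best", and the inclusive treatment of the boundary are all assumptions made for this example.

```python
def best_trial(trials, training_number=None):
    """Pick the lowest-error trial among those within the waveform budget.

    Each trial is a dict with the hypothetical keys "n_train" and "log_error".
    With training_number=None, every trial is eligible.
    Assumption: a trial using exactly training_number waveforms is eligible.
    """
    eligible = [
        t for t in trials
        if training_number is None or t["n_train"] <= training_number
    ]
    if not eligible:
        raise ValueError("no trial fits within the requested training number")
    return min(eligible, key=lambda t: t["log_error"])


# Hypothetical trial records: more waveforms gave a lower error here.
trials = [
    {"n_train": 100, "log_error": -2.0},
    {"n_train": 500, "log_error": -3.0},
    {"n_train": 2000, "log_error": -3.5},
]
chosen = best_trial(trials, training_number=1000)
unrestricted = best_trial(trials)
```

With a budget of 1000 waveforms the 2000-waveform trial is excluded even though it has the lowest error, which is the reason for restricting the search by training number.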
- objective(trial: optuna.Trial) → tuple[float, float]
Objective function to be used when optimizing the hyperparameters for the neural network and PCA.
- Parameters
trial (optuna.Trial) – This object is required to generate the parameters according to the rules of the optuna optimizer used.
- Returns
Base-10 logarithm of the accuracy and training time, respectively.
The accuracy is defined as the average of the square differences between the true and estimated residuals.
The training time includes both the training of the network and, roughly, the generation of the waveforms used for training.
- Return type
tuple[float, float]
- optimize(timeout_min: float = 5.0) → None
Run the optimization for a given number of minutes; this is a wrapper around optuna.Study.optimize().
- Parameters
timeout_min (float, optional) – Number of minutes to run for, by default 5.
Default: 5.0
- optimize_and_save(timeout_hr: float = 1.0) → None
Run the optimization, saving the study periodically; this is a wrapper around optuna.Study.optimize(). This command can take an arbitrary amount of time, therefore its timeout is provided as a parameter. Typically, good results can be achieved within a few hours.
The interval between saves is determined by the class attribute save_every_n_minutes.
- Parameters
timeout_hr (float, optional) – Number of hours to run for, by default 1.
Default: 1.0
- plot_pareto() → None
Plot the Pareto front of the bivariate optimization, making use of the function optuna.visualization.plot_pareto_front().
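To make the notion of a Pareto front concrete, here is a minimal, library-independent sketch. It assumes both objectives (log error and time) are minimized; the `pareto_front` function and the trial values are hypothetical and do not come from optuna or from this package.

```python
def pareto_front(points):
    """Return the non-dominated points when minimizing both coordinates.

    A point is dominated if another point is no worse in both objectives
    and strictly better in at least one.
    """
    front = []
    for candidate in points:
        dominated = any(
            other[0] <= candidate[0]
            and other[1] <= candidate[1]
            and other != candidate
            for other in points
        )
        if not dominated:
            front.append(candidate)
    return sorted(front)


# Hypothetical (log10 error, time in seconds) pairs, one per trial.
trials = [(-3.0, 50.0), (-2.0, 10.0), (-2.5, 60.0), (-1.0, 5.0), (-2.0, 20.0)]
front = pareto_front(trials)
```

In this toy data, the trial (-2.5, 60.0) is dominated by (-3.0, 50.0), which is both more accurate and faster, and (-2.0, 20.0) is dominated by (-2.0, 10.0); the remaining three trials each represent a different trade-off between accuracy and cost.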
- static residuals_difference(residuals_1: mlgw_bns.data_management.Residuals, residuals_2: mlgw_bns.data_management.Residuals) → float
Compare two sets of Residuals.