Management of a training and validation dataset
Functionality for the generation of a training dataset.
- class WaveformParameters(mass_ratio: float, lambda_1: float, lambda_2: float, chi_1: float, chi_2: float, dataset: Dataset)[source]
Parameters for a single waveform.
- Parameters
mass_ratio (float) – Mass ratio of the system, \(q = m_1 / m_2\), where \(m_1 \geq m_2\), so \(q \geq 1\).
lambda_1 (float) – Tidal polarizability of the larger star. In papers it is typically denoted as \(\Lambda_1\); for a definition see for example section D of this paper.
lambda_2 (float) – Tidal polarizability of the smaller star.
chi_1 (float) – Aligned dimensionless spin component of the larger star. The dimensionless spin is defined as \(\chi_i = S_i / m_i^2\) in \(c = G = 1\) natural units, where \(S_i\) is the \(z\) component of the dimensionful spin vector. The \(z\) axis is defined as the one which is parallel to the orbital angular momentum of the binary.
chi_2 (float) – Aligned spin component of the smaller star.
dataset (Dataset) – Reference dataset, which includes information required for the generation of the waveform, such as the initial frequency or the reference total mass.
- Class Attributes
number_of_parameters (int) – How many intrinsic parameters are modelled. This class variable should equal the number of other floating-point attributes the class has; it is included for convenience.
- almost_equal_to(other: object)[source]
Check for equality with another set of parameters, accounting for imprecise floats.
- property array: numpy.ndarray
Represent the parameters as a numpy array.
- Returns
Array representation of the parameters, specifically \([q, \Lambda_1, \Lambda_2, \chi_1, \chi_2]\).
- Return type
np.ndarray
- property dlambda
Antisymmetrized tidal deformability parameter \(\delta \widetilde\Lambda\), which gives the next-to-largest contribution to the waveform phase. For the precise definition see equation 27 of this paper.
- property eta
Symmetric mass ratio of the binary.
It is defined as \(\eta = \mu / M\), where \(\mu = (1 / m_1 + 1/ m_2)^{-1}\) and \(M = m_1 + m_2\).
It can also be expressed as \(\eta = m_1 m_2 / (m_1 + m_2)^2 = q / (1+q)^2\), where \(q = m_1 / m_2\) is the mass ratio.
It is also sometimes denoted as \(\nu\). It goes from 0 in the test-mass limit (one mass vanishing) to \(1/4\) in the equal-mass limit.
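As a quick check of the relations above, here is a minimal sketch in plain Python. The function names are illustrative and are not part of mlgw_bns; the split of the total mass into the two components assumes only \(q = m_1 / m_2\) and \(M = m_1 + m_2\).

```python
def symmetric_mass_ratio(q: float) -> float:
    """eta = q / (1 + q)^2 for a mass ratio q = m_1 / m_2 >= 1."""
    return q / (1.0 + q) ** 2


def component_masses(q: float, total_mass: float) -> tuple[float, float]:
    """Split a total mass M into (m_1, m_2) with m_1 >= m_2 (hypothetical helper)."""
    m_2 = total_mass / (1.0 + q)
    m_1 = q * m_2
    return m_1, m_2


# Both expressions for eta should agree for any q >= 1.
m_1, m_2 = component_masses(q=1.5, total_mass=2.8)
eta_from_q = symmetric_mass_ratio(1.5)
eta_from_masses = m_1 * m_2 / (m_1 + m_2) ** 2
```

The equal-mass case, `symmetric_mass_ratio(1.0)`, gives the upper bound \(1/4\) mentioned above.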
- property lambdatilde
Symmetrized tidal deformability parameter \(\widetilde\Lambda\), which gives the largest contribution to the waveform phase. For the precise definition see equation 5 of this paper.
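For orientation, the expression commonly used in the literature is \(\widetilde\Lambda = \frac{16}{13} \frac{(m_1 + 12 m_2) m_1^4 \Lambda_1 + (m_2 + 12 m_1) m_2^4 \Lambda_2}{(m_1 + m_2)^5}\). The sketch below evaluates it in plain Python. It assumes this is the definition given by the referenced equation, and the function name is illustrative rather than the library's.

```python
def lambda_tilde(m_1: float, m_2: float, lambda_1: float, lambda_2: float) -> float:
    """Symmetrized tidal deformability (assumed standard definition)."""
    total_mass = m_1 + m_2
    term_1 = (m_1 + 12.0 * m_2) * m_1**4 * lambda_1
    term_2 = (m_2 + 12.0 * m_1) * m_2**4 * lambda_2
    return 16.0 / 13.0 * (term_1 + term_2) / total_mass**5


# For equal masses and equal deformabilities the result reduces to the common value.
equal_mass_value = lambda_tilde(1.4, 1.4, 400.0, 400.0)
```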
- property m_1
Mass of the heavier star in the system, in solar masses.
- property m_2
Mass of the lighter star in the system, in solar masses.
- taylor_f2(frequencies: np.ndarray) dict[str, Union[float, int, np.ndarray]][source]
Parameter dictionary in a format compatible with the custom implementation of TaylorF2 within mlgw_bns.
- Parameters
frequencies (np.ndarray) – The frequencies at which to compute the waveform, to be given in natural units.
- class ParameterSet(parameter_array: numpy.ndarray)[source]
Dataclass which contains an array of parameters for waveform generation.
The meaning of each row of parameters is the same as described in WaveformParameters.array.
- Parameters
parameter_array (np.ndarray) – Array with shape (number_of_parameter_tuples, number_of_parameters), where number_of_parameters == 5 currently.
- classmethod from_parameter_generator(parameter_generator: ParameterGenerator, number_of_parameter_tuples: int)[source]
Make a set of new parameter tuples by randomly generating them with a ParameterGenerator.
- Parameters
parameter_generator (ParameterGenerator) – To generate the tuples.
number_of_parameter_tuples (int) – How many tuples to generate.
- waveform_parameters(dataset: Dataset) list[WaveformParameters][source]
Return a list of WaveformParameters.
- Parameters
dataset (Dataset) – Dataset, required for the initialization of WaveformParameters.
- Return type
list[WaveformParameters]
Examples
We generate a ParameterSet with a single array of parameters:
>>> param_set = ParameterSet(np.array([[1, 2, 3, 4, 5]]))
>>> dataset = Dataset(initial_frequency_hz=20., srate_hz=4096.)
>>> wp_list = param_set.waveform_parameters(dataset)
>>> print(wp_list[0].array)
[1 2 3 4 5]
- class Dataset(initial_frequency_hz: float, srate_hz: float, delta_f_hz: typing.Optional[float] = None, waveform_generator: mlgw_bns.dataset_generation.WaveformGenerator = <mlgw_bns.dataset_generation.BarePostNewtonianGenerator object>, parameter_generator_class: typing.Type[mlgw_bns.dataset_generation.ParameterGenerator] = <class 'mlgw_bns.dataset_generation.UniformParameterGenerator'>, parameter_ranges: mlgw_bns.data_management.ParameterRanges = ParameterRanges(mass_range=(2.0, 4.0), q_range=(1.0, 3.0), lambda1_range=(5.0, 5000.0), lambda2_range=(5.0, 5000.0), chi1_range=(-0.5, 0.5), chi2_range=(-0.5, 0.5)), parameter_generator: typing.Optional[mlgw_bns.dataset_generation.ParameterGenerator] = None, seed: int = 42, multibanding: bool = True, f_pivot_hz: float = 40.0)[source]
Metadata for a dataset.
# TODO: the name of this class is misleading, as it contains all the information needed to generate the dataset but not the data itself (a better name might be DatasetMeta).
The amplitude residuals are defined as \(\log(A _{\text{EOB}} / A_{\text{PN}})\), while the phase residuals are defined as \(\phi _{\text{EOB}} - \phi_{\text{PN}}\).
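The two definitions above, and their inversion, can be sketched in a few lines of plain Python. This is an illustration with made-up numbers, not the library's code; the function names are hypothetical and the inputs are ordinary lists rather than numpy arrays.

```python
import math


def compute_residuals(amp_eob, phase_eob, amp_pn, phase_pn):
    """Amplitude residuals log(A_EOB / A_PN) and phase residuals phi_EOB - phi_PN."""
    amp_res = [math.log(a_eob / a_pn) for a_eob, a_pn in zip(amp_eob, amp_pn)]
    phase_res = [p_eob - p_pn for p_eob, p_pn in zip(phase_eob, phase_pn)]
    return amp_res, phase_res


def recompose(amp_res, phase_res, amp_pn, phase_pn):
    """Invert the definitions: A_EOB = A_PN * exp(res), phi_EOB = phi_PN + res."""
    amp = [a_pn * math.exp(res) for res, a_pn in zip(amp_res, amp_pn)]
    phase = [p_pn + res for res, p_pn in zip(phase_res, phase_pn)]
    return amp, phase


# Toy values, chosen only to exercise the round trip.
amp_pn, phase_pn = [1.0, 0.8, 0.6], [0.0, -3.0, -7.5]
amp_eob, phase_eob = [1.05, 0.82, 0.57], [0.1, -2.8, -7.1]

amp_res, phase_res = compute_residuals(amp_eob, phase_eob, amp_pn, phase_pn)
amp_back, phase_back = recompose(amp_res, phase_res, amp_pn, phase_pn)
```

Working with the logarithm of the amplitude ratio keeps the residual small and dimensionless whenever the two models are close.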
- Parameters
initial_frequency_hz (float) – Initial frequency from which the waveforms in this dataset should be generated by the effective one body model.
srate_hz (float) – Sampling rate in the time domain. The maximum frequency of the generated time-domain waveforms will be half of this value (see Nyquist frequency).
delta_f_hz (Optional[float], optional) – Frequency spacing for the generated waveforms. If it is not given, it defaults to the one computed through Dataset.optimal_df_hz().
Default: None
waveform_generator (WaveformGenerator, optional) – Waveform generator to be used. Defaults to a BarePostNewtonianGenerator instance. The alternative TEOBResumSGenerator uses TEOB for the EOB waveform and a TaylorF2 approximant, with 3.5PN-correct amplitude and 5.5PN-correct phase.
Default: <mlgw_bns.dataset_generation.BarePostNewtonianGenerator object>
parameter_generator_class (Type[ParameterGenerator], optional) – Parameter generator class to be used. Should be a subclass of ParameterGenerator; the argument is the class as opposed to an instance since the parameter generator needs to reference the dataset and therefore must be created after it. Defaults to UniformParameterGenerator.
Default: <class 'mlgw_bns.dataset_generation.UniformParameterGenerator'>
parameter_ranges (ParameterRanges, optional) – Ranges for the parameters to be generated. Defaults to ParameterRanges(), which will use the parameters defined as defaults in that class.
Default: ParameterRanges(mass_range=(2.0, 4.0), q_range=(1.0, 3.0), lambda1_range=(5.0, 5000.0), lambda2_range=(5.0, 5000.0), chi1_range=(-0.5, 0.5), chi2_range=(-0.5, 0.5))
parameter_generator (Optional[ParameterGenerator], optional) – Certain parameter generators should not be regenerated each time; if this is the case, then pass the parameter generator here. Defaults to None.
Default: None
seed (int, optional) – Seed for the random number generator used when generating waveforms for the training. Defaults to 42.
Default: 42
multibanding (bool, optional) – Whether to use multibanding for the default frequency array. If True, the frequency array is computed according to reduced_frequency_array(); if False, the frequency array is the “default FFT” one with spacing delta_f_hz. Defaults to True.
Default: True
f_pivot_hz (float, optional) – Pivot frequency for the multibanding in Hz, only used if multibanding is True. Defaults to 40.
Default: 40.0
Examples
>>> dataset = Dataset(initial_frequency_hz=20., srate_hz=4096.)
>>> print(dataset.delta_f_hz) # should be 1/256 Hz, doctest: +NUMBER
0.001953125
- Class Attributes
total_mass (float) – Total mass of the reference binary, in solar masses (class attribute). Defaults to 2.8; this does not typically need to be changed.
- property frequencies: numpy.ndarray
Frequency array corresponding to this dataset, in natural units.
- property frequencies_hz: numpy.ndarray
Frequency array corresponding to this dataset, in Hz.
- generate_residuals(size: int, downsampling_indices: Optional[DownsamplingIndices] = None, flatten_phase: bool = True) tuple[np.ndarray, ParameterSet, Residuals][source]
Generate a set of waveform residuals.
- Parameters
size (int) – Number of waveforms to generate.
downsampling_indices (Optional[DownsamplingIndices], optional) – If provided, return the waveform only at these indices, which can be different between phase and amplitude. Defaults to None.
Default: None
flatten_phase (bool, optional) – Whether to subtract a linear term from the phase such that it is roughly constant in its first section (through the method Residuals.flatten_phase()). Defaults to True, but it is always set to False if the downsampling indices are not provided.
Default: True
- Returns
frequencies (np.ndarray) – Frequencies at which the waveforms are computed, in natural units. This array should have shape (number_of_sample_points, ).
parameters (ParameterSet)
residuals (Residuals)
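The idea behind the flatten_phase option, removing a linear trend so that the phase is roughly constant over its first section, can be sketched as follows. This is not the implementation of Residuals.flatten_phase(): the least-squares fit, the choice of the first n_first points as the "first section", and the function name are all assumptions made for illustration.

```python
def flatten_phase_sketch(frequencies, phase, n_first=4):
    """Subtract the line fitted to the first n_first points (hypothetical helper)."""
    f_head = frequencies[:n_first]
    p_head = phase[:n_first]
    f_mean = sum(f_head) / len(f_head)
    p_mean = sum(p_head) / len(p_head)
    # Least-squares slope of the phase over the first section.
    numerator = sum((f - f_mean) * (p - p_mean) for f, p in zip(f_head, p_head))
    denominator = sum((f - f_mean) ** 2 for f in f_head)
    slope = numerator / denominator
    # Remove the linear term, keeping the value at the first frequency fixed.
    return [p - slope * (f - frequencies[0]) for f, p in zip(frequencies, phase)]


# A purely linear phase should become constant after flattening.
frequencies = [0.001 * k for k in range(1, 9)]
linear_phase = [3.0 + 2000.0 * f for f in frequencies]
flattened = flatten_phase_sketch(frequencies, linear_phase)
```

A linear term in the frequency-domain phase corresponds to a time shift, so removing it does not change the physical content of the waveform.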
- generate_waveforms_from_params(parameters: mlgw_bns.dataset_generation.ParameterSet, downsampling_indices: Optional[mlgw_bns.data_management.DownsamplingIndices] = None) mlgw_bns.data_management.FDWaveforms[source]
Generate full effective-one-body waveforms at each of the parameters in the given parameter set.
- Parameters
parameters (ParameterSet) – Parameters of the waveforms to generate.
downsampling_indices (DownsamplingIndices, optional) – Indices to downsample the waveforms at, by default None.
Default: None
- Return type
FDWaveforms
- hz_to_natural_units(frequency_hz: Union[float, numpy.ndarray])[source]
Utility function: convert Hz to natural units, using the reference total mass of the dataset.
- Parameters
frequency_hz (Union[float, np.ndarray]) –
- Returns
Frequency in natural units.
- Return type
frequency_nu
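The conversion amounts to multiplying the frequency in Hz by the reference total mass expressed in seconds, \(GM / c^3\), which gives the dimensionless quantity \(Mf\). A minimal sketch in plain Python follows; the numerical value \(G M_\odot / c^3 \approx 4.925 \times 10^{-6}\) s and the standalone function names are assumptions for illustration, and the library's own constants may differ in the last digits.

```python
G_MSUN_OVER_C3_SECONDS = 4.925490947e-6  # assumed value of G * M_sun / c^3, in seconds
REFERENCE_TOTAL_MASS = 2.8  # solar masses, the documented default of Dataset.total_mass


def mass_sum_seconds(total_mass=REFERENCE_TOTAL_MASS):
    """Total mass expressed in seconds, G M / c^3."""
    return total_mass * G_MSUN_OVER_C3_SECONDS


def hz_to_natural_units(frequency_hz, total_mass=REFERENCE_TOTAL_MASS):
    """Dimensionless frequency M f."""
    return frequency_hz * mass_sum_seconds(total_mass)


def natural_units_to_hz(frequency, total_mass=REFERENCE_TOTAL_MASS):
    """Inverse conversion, from M f back to Hz."""
    return frequency / mass_sum_seconds(total_mass)


f_natural = hz_to_natural_units(20.0)
f_round_trip = natural_units_to_hz(f_natural)
```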
- make_parameter_generator(seed: Optional[int] = None) mlgw_bns.dataset_generation.ParameterGenerator[source]
Make a new parameter generator, of the type determined by parameter_generator_class.
- Parameters
seed (int, optional) – Seed for the RNG inside the parameter generator, by default None.
Default: None
- Return type
ParameterGenerator
- property mass_sum_seconds: float
Reference total mass expressed in seconds, \(GM / c^3\).
- Return type
float
- mlgw_bns_prefactor(eta: float, total_mass: Optional[float] = None) float[source]
Prefactor by which to multiply the waveform generated by mlgw_bns.
- Parameters
eta (float) – Symmetric mass ratio of the binary.
total_mass (Optional[float], optional) – Total mass of the binary. Defaults to None, in which case the total_mass attribute of the Dataset will be used.
Default: None
- natural_units_to_hz(frequency: Union[float, numpy.ndarray])[source]
Utility function: convert natural units to Hz, using the reference total mass of the dataset.
- Parameters
frequency (Union[float, np.ndarray]) –
- Returns
Frequency in Hz.
- Return type
frequency_hz
- optimal_df_hz(power_of_two: bool = True, margin_percent: float = 8.0) float[source]
Frequency spacing required for the condition \(\Delta f < 1/T\), where \(T\) is the seglen (length of the signal).
The optimal frequency spacing df is the inverse of the seglen (length of the signal) rounded up to the next power of 2.
The seglen has a closed-form expression in the Newtonian limit, see e.g. Maggiore (2007), eq. 4.21:
\(t = 5/256 (\pi f)^{-8/3} (G M \eta^{3/5} / c^3)^{-5/3}\)
The symmetric mass ratio \(\eta\) changes across our dataset, so we evaluate the expression at its upper limit, \(\eta = 1/4\).
- Parameters
power_of_two (bool, optional) – Whether to return a frequency spacing which is a round power of two. Defaults to True.
Default: True
margin_percent (float, optional) – Percent of margin to be added to the seglen, so that \(\Delta f < 1 / (T + \delta T)\) holds for \(\delta T \leq T (\text{margin} / 100)\). This should not be too low, since varying the waveform parameters can perturb the seglen and make it a bit higher than the Newtonian approximation used in this formula.
Default: 8.0
- Returns
delta_f_hz – Frequency spacing, in Hz.
- Return type
float
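The Newtonian estimate above can be evaluated directly. The sketch below is plain Python and is not the library's implementation: the constant \(G M_\odot / c^3 \approx 4.925 \times 10^{-6}\) s, the function names, and the exact way in which the margin and the power-of-two rounding are combined are assumptions made for illustration, so the value returned by Dataset.optimal_df_hz() may differ.

```python
import math

G_MSUN_OVER_C3_SECONDS = 4.925490947e-6  # assumed value of G * M_sun / c^3, in seconds


def newtonian_seglen(f_hz, total_mass_msun=2.8, eta=0.25):
    """t = 5/256 (pi f)^(-8/3) (G M eta^(3/5) / c^3)^(-5/3), in seconds."""
    chirp_mass_seconds = total_mass_msun * eta**0.6 * G_MSUN_OVER_C3_SECONDS
    return 5.0 / 256.0 * (math.pi * f_hz) ** (-8.0 / 3.0) * chirp_mass_seconds ** (-5.0 / 3.0)


def df_sketch(f_hz, power_of_two=True, margin_percent=8.0):
    """Frequency spacing satisfying delta_f < 1 / (T + margin) (hypothetical helper)."""
    padded_seglen = newtonian_seglen(f_hz) * (1.0 + margin_percent / 100.0)
    if power_of_two:
        # Round the padded seglen up to the next power of two, then invert.
        return 2.0 ** (-math.ceil(math.log2(padded_seglen)))
    return 1.0 / padded_seglen


seglen = newtonian_seglen(20.0)
delta_f = df_sketch(20.0)
```

Rounding the segment length up, rather than to the nearest power of two, is what guarantees that the condition \(\Delta f < 1/T\) still holds after rounding.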
- recompose_residuals(residuals: mlgw_bns.data_management.Residuals, params: mlgw_bns.dataset_generation.ParameterSet, downsampling_indices: Optional[DownsamplingIndices] = None) mlgw_bns.data_management.FDWaveforms[source]
Recompose a set of residuals into true waveforms.
- Parameters
residuals (Residuals) – Residuals to recompose.
params (ParameterSet) – Parameters of the waveforms corresponding to the residuals.
downsampling_indices (DownsamplingIndices, optional) – Indices at which to sample the waveforms. Defaults to None, which means to use the whole sampling.
Default: None
- Returns
Reconstructed waveforms; these may differ from the original ones by a linear phase term (corresponding to a time shift) even if no manipulation has been done, because of how the Residuals are stored.
- Return type
FDWaveforms
- taylor_f2_prefactor(eta: float) float[source]
Prefactor by which to multiply the waveform generated by TaylorF2.
- Parameters
eta (float) – Symmetric mass ratio of the binary.
- class ParameterGenerator(dataset: Dataset, seed: Optional[int] = None, **kwargs)[source]
Generic generator of parameters for new waveforms to be used for training.
- Parameters
dataset (Dataset) – Dataset to which the generated parameters will refer. This parameter is required because the parameters must include things such as the initial frequency, which are properties of the dataset.
seed (Optional[int], optional) – Seed for the random number generator. If it is not given, the Dataset.seed_sequence of the dataset is used.
Default: None
- Class Attributes
number_of_free_parameters (int) – Number of parameters which will vary during the random parameter generation.
- parameter_set_cls
- class UniformParameterGenerator(dataset: mlgw_bns.dataset_generation.Dataset, parameter_ranges: mlgw_bns.data_management.ParameterRanges, seed: Optional[int] = None)[source]
Generator of parameters according to a uniform distribution over their allowed ranges.
# TODO update docs here!
- Parameters
dataset (Dataset) – See the documentation for the initialization of a ParameterGenerator.
parameter_ranges (ParameterRanges) –
seed (Optional[int], optional) – See the documentation for the initialization of a ParameterGenerator.
Default: None
Examples
>>> generator = UniformParameterGenerator(
...     dataset=Dataset(20., 4096.),
...     parameter_ranges=ParameterRanges(q_range=(1., 2.)))
>>> params = next(generator)
>>> print(type(params))
<class 'mlgw_bns.dataset_generation.WaveformParameters'>
>>> print(params.mass_ratio)
1.306
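The sampling strategy, drawing each parameter independently and uniformly within its range from a seeded generator, can be illustrated without the library. The sketch below uses the standard-library random module as a stand-in for whatever random number generator mlgw_bns uses internally; the dictionary layout and the function name are hypothetical, while the ranges are the ParameterRanges defaults shown in the Dataset signature.

```python
import random

# Default ranges as listed in the Dataset signature (mass_range omitted here).
DEFAULT_RANGES = {
    "mass_ratio": (1.0, 3.0),
    "lambda_1": (5.0, 5000.0),
    "lambda_2": (5.0, 5000.0),
    "chi_1": (-0.5, 0.5),
    "chi_2": (-0.5, 0.5),
}


def sample_parameters(rng, ranges=DEFAULT_RANGES):
    """Draw one parameter tuple, each entry uniform within its range."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


# The same seed reproduces the same draw.
draw = sample_parameters(random.Random(42))
repeat = sample_parameters(random.Random(42))
```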
- class WaveformGenerator[source]
Generator of theoretical waveforms according to some pre-existing model.
This is an abstract class: users may extend mlgw_bns by subclassing this and training a network using that new waveform generator.
This can be accomplished by implementing the methods post_newtonian_amplitude(), post_newtonian_phase() and effective_one_body_waveform().
Users may wish to leave the Post-Newtonian model already implemented here and only switch TEOBResumS to another waveform template: the easiest way to accomplish this is to subclass BarePostNewtonianGenerator and only override its effective_one_body_waveform() method.
- abstract effective_one_body_waveform(params: WaveformParameters, frequencies: Optional[np.ndarray] = None) tuple[np.ndarray, np.ndarray, np.ndarray][source]
Waveform computed according to the comparatively slower effective-one-body method.
- Parameters
params (WaveformParameters) – Parameters of the binary system for which to generate the waveform.
frequencies (np.ndarray, optional) – Frequencies at which to compute the waveform, in natural units. Defaults to None, which means the EOB generator will choose the frequencies at which to compute the waveform.
Default: None
- Returns
frequencies (np.ndarray) – Frequencies at which the waveform is given, in natural units: the quantity here is \(Mf\) (with \(G = c = 1\)).
amplitude (np.ndarray) – Amplitude of the plus-polarized waveform. The normalization for the amplitude is the same as discussed in post_newtonian_amplitude().
phase (np.ndarray) – Phase of the plus-polarized waveform, in radians, given as a continuously-varying array (so, not constrained between 0 and \(2\pi\)).
- generate_residuals(params: WaveformParameters, frequencies: Optional[np.ndarray] = None, downsampling_indices: Optional[DownsamplingIndices] = None) tuple[np.ndarray, np.ndarray][source]
Compute the residuals of the effective_one_body_waveform() from the Post-Newtonian one computed with post_newtonian_amplitude() and post_newtonian_phase().
Residuals are defined as discussed in Dataset.
- Parameters
params (WaveformParameters) – Parameters for which to compute the residuals.
frequencies (np.ndarray, optional) – Frequencies at which to compute the residuals, in natural units. If this parameter is given, the downsampling_indices should index this frequency array. Defaults to None, meaning that the frequencies computed by the effective_one_body_waveform() method are used.
Default: None
downsampling_indices (Optional[DownsamplingIndices], optional) – Indices at which to compute the residuals. If not provided (default) the waveform is given at all indices corresponding to the default FFT grid.
Default: None
- Returns
Amplitude residuals and phase residuals.
- Return type
tuple[np.ndarray, np.ndarray]
- abstract post_newtonian_amplitude(params: WaveformParameters, frequencies: numpy.ndarray) numpy.ndarray[source]
Amplitude of the Fourier transform of the waveform computed at arbitrary frequencies. This should be implemented in some fast, closed-form way. The speed of the overall model relies on the evaluation of this function not taking too long.
- Parameters
params (WaveformParameters) – Parameters of the binary system for which to generate the waveform.
frequencies (np.ndarray) – Array of frequencies at which to compute the amplitude. Should be given in mass-rescaled natural units; they will be passed to WaveformParameters.taylor_f2().
- Returns
amplitude – Amplitude of the Fourier transform of the waveform, given with the natural-units convention \(|\widetilde{h}_+(f)| r \eta / M^2\), where we are using \(c= G = 1\) natural units, \(r\) is the distance to the binary, \(\eta\) is the symmetric mass ratio, \(M\) is the total mass of the binary.
- Return type
np.ndarray
- abstract post_newtonian_phase(params: WaveformParameters, frequencies: numpy.ndarray) numpy.ndarray[source]
Phase of the Fourier transform of the waveform computed at arbitrary frequencies. This should be implemented in some fast, closed-form way. The speed of the overall model relies on the evaluation of this not taking too long.
- Parameters
params (WaveformParameters) – Parameters of the binary system for which to generate the waveform.
frequencies (np.ndarray) – Array of frequencies at which to compute the phase. Should be given in mass-rescaled natural units; they will be passed to WaveformParameters.taylor_f2().
- Returns
phase – Phase of the Fourier transform of the waveform, specifically the phase of the plus polarization in radians. At the \((\ell = 2, m = 2)\) multipole, the phase of the cross polarization will simply be \(\pi/2\) plus this one.
- Return type
np.ndarray
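The subclassing pattern described for WaveformGenerator (keep a Post-Newtonian baseline and override only the effective-one-body method) can be sketched with a standalone abstract base class. Everything below is a hypothetical stand-in written for illustration: the class names are not those of mlgw_bns, the "waveforms" are toy power laws rather than physical models, and plain lists replace numpy arrays.

```python
import math
from abc import ABC, abstractmethod


class SketchWaveformGenerator(ABC):
    """Stand-in for the abstract interface; not the mlgw_bns class."""

    @abstractmethod
    def post_newtonian_amplitude(self, params, frequencies): ...

    @abstractmethod
    def post_newtonian_phase(self, params, frequencies): ...

    @abstractmethod
    def effective_one_body_waveform(self, params, frequencies=None): ...

    def generate_residuals(self, params, frequencies=None):
        """Log-amplitude and phase residuals of the EOB waveform from the PN one."""
        freqs, amp_eob, phase_eob = self.effective_one_body_waveform(params, frequencies)
        amp_pn = self.post_newtonian_amplitude(params, freqs)
        phase_pn = self.post_newtonian_phase(params, freqs)
        amp_res = [math.log(a / b) for a, b in zip(amp_eob, amp_pn)]
        phase_res = [a - b for a, b in zip(phase_eob, phase_pn)]
        return amp_res, phase_res


class ToyPostNewtonian(SketchWaveformGenerator):
    """Implements only the fast baseline; still abstract (toy power laws)."""

    def post_newtonian_amplitude(self, params, frequencies):
        return [f ** (-7.0 / 6.0) for f in frequencies]

    def post_newtonian_phase(self, params, frequencies):
        return [f ** (-5.0 / 3.0) for f in frequencies]


class ToyEffectiveOneBody(ToyPostNewtonian):
    """Overrides only the slow model, reusing the inherited baseline."""

    def effective_one_body_waveform(self, params, frequencies=None):
        if frequencies is None:
            frequencies = [0.001 * k for k in range(1, 6)]
        amp = [1.1 * a for a in self.post_newtonian_amplitude(params, frequencies)]
        phase = [p + 0.5 for p in self.post_newtonian_phase(params, frequencies)]
        return frequencies, amp, phase


amp_res, phase_res = ToyEffectiveOneBody().generate_residuals(params=None)
```

Because the intermediate class leaves one abstract method unimplemented, it cannot be instantiated; only the final subclass, which provides all three methods, can generate residuals.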
- class TEOBResumSGenerator(eobrun_callable: Callable[[dict], tuple[np.ndarray, ...]])[source]
Generate waveforms using the TEOBResumS effective-one-body code.
- effective_one_body_waveform(params: WaveformParameters, frequencies: Optional[np.ndarray] = None) tuple[np.ndarray, np.ndarray, np.ndarray][source]
Generate an EOB waveform with TEOB.
Examples
>>> from EOBRun_module import EOBRunPy
>>> tg = TEOBResumSGenerator(EOBRunPy)
>>> p = WaveformParameters(1, 300, 300, .3, -.3, Dataset(20., 4096.))
>>> f, amp, phi = tg.effective_one_body_waveform(p)