gwin.io.hdf module

This module defines functions for reading and writing the samples that the inference samplers generate.

class gwin.io.hdf.InferenceFile(path, mode=None, **kwargs)[source]

Bases: h5py._hl.files.File

A subclass of the h5py.File object that has extra functions for handling reading and writing the samples from the samplers.

Parameters:
  • path (str) – The path to the HDF file.
  • mode ({None, str}) – The mode in which to open the file, e.g. “w” for write and “r” for read.
acl

Returns the saved autocorrelation length (ACL).

Returns: acl – The ACL.
Return type: {int, float}
burn_in_iterations

Returns the number of iterations in the burn-in.

cmd

Returns the (last) saved command line.

If the file was created from a run that resumed from a checkpoint, only the last command line used is returned.

Returns: cmd – The command line that created this InferenceFile.
Return type: string
copy(other, parameters=None, parameter_names=None, posterior_only=False, **kwargs)[source]

Copies data in this file to another file.

The samples and stats to copy may be down selected using the given kwargs. All other data (the “metadata”) are copied exactly.

Parameters:
  • other (str or InferenceFile) – The file to write to. May be either a string giving a filename, or an open hdf file. If the former, the file will be opened with the write attribute (note that if a file already exists with that name, it will be deleted).
  • parameters (list of str, optional) – List of parameters to copy. If None, will copy all parameters.
  • parameter_names (dict, optional) – Rename one or more parameters to the given name. The dictionary should map parameter -> parameter name. If None, will just use the original parameter names.
  • posterior_only (bool, optional) – Write the samples and model stats as flattened arrays, and set other’s posterior_only attribute. For example, if this file has a parameter’s samples written to {samples_group}/{param}/walker{x}, then other will have all of the selected samples from all walkers written to {samples_group}/{param}/.
  • **kwargs – All other keyword arguments are passed to read_samples.
Returns:

The open file handler to other.

Return type:

InferenceFile
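
As a rough illustration of what posterior_only and parameter_names do, the sketch below uses plain dictionaries in place of HDF groups. The function flatten_samples and the nested-dictionary layout are hypothetical stand-ins, not part of gwin; the real method works on InferenceFile objects and passes the selection keywords to read_samples.

```python
def flatten_samples(samples, parameters=None, parameter_names=None):
    """Flatten per-walker samples into one list per parameter.

    `samples` is a hypothetical stand-in for the samples group:
    {param: {"walker0": [...], "walker1": [...], ...}}.
    """
    if parameters is None:
        # Mirror the documented default: copy every parameter.
        parameters = list(samples)
    if parameter_names is None:
        parameter_names = {}
    flattened = {}
    for param in parameters:
        walkers = samples[param]
        # Concatenating walkers in sorted-name order is an assumption
        # of this sketch.
        values = [x for name in sorted(walkers) for x in walkers[name]]
        # Rename the parameter if a new name was supplied.
        flattened[parameter_names.get(param, param)] = values
    return flattened


samples = {
    "mass1": {"walker0": [10.0, 10.5], "walker1": [11.0, 11.5]},
    "mass2": {"walker0": [5.0, 5.5], "walker1": [6.0, 6.5]},
}
result = flatten_samples(samples, parameters=["mass1"],
                         parameter_names={"mass1": "m1"})
print(result)
```

Here only mass1 is selected, its walkers are merged into a single list, and the result is stored under the new name m1.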

copy_metadata(other)[source]

Copies all metadata from this file to the other file.

Metadata is defined as all data that is not in either the samples or stats group.

Parameters: other (InferenceFile) – An open inference file to write the data to.
get_slice(thin_start=None, thin_interval=None, thin_end=None)[source]

Formats a slice using the given arguments that can be used to retrieve a thinned array from an InferenceFile.

Parameters:
  • thin_start ({None, int}) – The starting index to use. If None, will try to retrieve the burn_in_iterations from the given file. If no burn_in_iterations exists, will default to the start of the array.
  • thin_interval ({None, int}) – The interval to use. If None, will try to retrieve the acl from the given file. If no acl attribute exists, will default to 1.
  • thin_end ({None, int}) – The end index to use. If None, will retrieve to the end of the array.
Returns:

The slice needed.

Return type:

slice
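
The fallback rules above can be sketched in plain Python. The function build_thin_slice is a hypothetical stand-in that takes the stored burn_in_iterations and acl as arguments instead of reading them from a file, and rounding a fractional ACL up to an integer is an assumption of this sketch.

```python
import math


def build_thin_slice(thin_start=None, thin_interval=None, thin_end=None,
                     burn_in_iterations=None, acl=None):
    """Build a thinning slice, falling back to stored values when given.

    `burn_in_iterations` and `acl` stand in for values the file may hold;
    None means the file has no such value.
    """
    if thin_start is None:
        # Fall back to the burn-in length, else the start of the array.
        thin_start = burn_in_iterations if burn_in_iterations is not None else 0
    if thin_interval is None:
        # Fall back to the ACL, else keep every sample. Rounding a float
        # ACL up to an integer is an assumption of this sketch.
        thin_interval = math.ceil(acl) if acl is not None else 1
    return slice(thin_start, thin_end, thin_interval)


chain = list(range(20))
thinned = chain[build_thin_slice(burn_in_iterations=10, acl=2.5)]
print(thinned)
```

Explicit arguments always win over the stored values, so a caller can pass thin_start=0 and thin_interval=1 to recover the full, unthinned chain.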

is_burned_in

Returns whether or not the sampler is burned in.

log_evidence

Returns the log of the evidence and its error, if they exist in the file. Raises a KeyError otherwise.

lognl

Returns the log noise likelihood.

model_name

Returns the name of the model that was used.

n_independent_samples

Returns the number of independent samples stored in the file.

name = 'hdf'
niterations

Returns the number of iterations performed.

Returns: niterations – Number of iterations performed.
Return type: int
ntemps

Returns the number of temperatures used.

nwalkers

Returns the number of walkers used.

Returns: nwalkers – Number of walkers used.
Return type: int
posterior_only

Whether the file only contains flattened posterior samples.

read_acceptance_fraction(**kwargs)[source]

Returns the acceptance fraction that was written to the file.

Parameters: **kwargs – All keyword arguments are passed to the sampler’s read_acceptance_fraction function.
Returns: The acceptance fraction.
Return type: numpy.array
read_acls()[source]

Returns all of the individual chains’ acls. See the read_acls function of this file’s sampler for more details.

read_label(parameter, error_on_none=False)[source]

Returns the label for the parameter.

Parameters:
  • parameter (str) – Name of parameter to get a label for. Will first try to retrieve a label from this file’s “label” attributes. If the parameter is not found there, will look for a label from pycbc.waveform.parameters.
  • error_on_none ({False, bool}) – If True, will raise a ValueError if a label cannot be found, or if the label is None. Otherwise, the parameter will just be returned if no label can be found.
Returns:

label – A formatted string for the name of the parameter.

Return type:

str
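
The lookup order described for read_label can be sketched as follows. The function lookup_label and both dictionaries are hypothetical: file_labels stands in for the file’s label attributes and fallback_labels for the labels defined in pycbc.waveform.parameters.

```python
def lookup_label(parameter, file_labels, fallback_labels, error_on_none=False):
    """Resolve a display label for `parameter`.

    Both label tables are plain dicts in this sketch; the real method
    reads them from the file and from pycbc.waveform.parameters.
    """
    # First choice: a label stored with the file.
    label = file_labels.get(parameter)
    if label is None:
        # Second choice: the fallback table.
        label = fallback_labels.get(parameter)
    if label is None:
        if error_on_none:
            raise ValueError("no label found for parameter %s" % parameter)
        # No label anywhere: hand back the raw parameter name.
        return parameter
    return label


file_labels = {"mchirp": r"$\mathcal{M}$"}
fallback_labels = {"q": "$q$"}

print(lookup_label("mchirp", file_labels, fallback_labels))
print(lookup_label("q", file_labels, fallback_labels))
print(lookup_label("spin1z", file_labels, fallback_labels))
```

With error_on_none=True the third call would raise a ValueError instead of returning the bare name.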

read_model_stats(**kwargs)[source]

Reads model stats from self.

Parameters: **kwargs – The keyword args are passed to the sampler’s read_model_stats method.
Returns: stats – Likelihood stats in the file, as a FieldArray. The fields of the array are the names of the stats that are in the model_stats group.
Return type: {FieldArray, None}
read_random_state(group=None)[source]

Reads the state of the random number generator from the file.

Parameters: group (str) – Name of group to read random state from.
Returns: A tuple with 5 elements that can be passed to numpy.random.set_state.
Return type: tuple
read_samples(parameters, samples_group=None, **kwargs)[source]

Reads samples from the file.

Parameters:
  • parameters ((list of) strings) – The parameter(s) to retrieve. A parameter can be the name of any field in samples_group, a virtual field or method of FieldArray (as long as the file contains the necessary fields to derive the virtual field or method), and/or a function of these.
  • samples_group (str) – Group in HDF InferenceFile that parameters belong to.
  • **kwargs – The rest of the keyword args are passed to the sampler’s read_samples method.
Returns:

Samples for the given parameters, as an instance of a FieldArray.

Return type:

FieldArray

resume_points

The iterations at which a run was resumed from checkpoint.

Returns: resume_points – An array of integers giving the points at which the run resumed.
Return type: array or None
Raises: KeyError – If the run never resumed from a checkpoint.
sampler_class

Returns the sampler class that was used.

sampler_group = 'sampler_states'
sampler_name

Returns the name of the sampler that was used.

samples_group = 'samples'
samples_parser

Returns the class to use to read/write samples from/to the file.

sampling_params

Returns the parameters that were used to sample.

Returns: sampling_params – List of the sampling params.
Return type: {list, str}
static_params

Returns a dictionary of the static_params. The keys are the argument names, values are the value they were set to.

stats_group = 'model_stats'
variable_params

Returns the list of variable_params.

Returns: variable_params – List of str that contain variable_params keys.
Return type: {list, str}
write_command_line()[source]

Writes command line to attributes.

The command line is written to the file’s attrs['cmd']. If this attribute already exists in the file (this can happen when resuming from a checkpoint), attrs['cmd'] will be a list storing the current command line and all previous command lines.
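
A minimal sketch of this accumulation, using a plain dict in place of the file’s attrs. The helper names, the example command lines, and the choice to store the newest entry last are all assumptions made for illustration; the source only states that the attribute becomes a list holding the current and all previous command lines.

```python
def record_command_line(attrs, argv):
    """Store the command line in attrs['cmd'], keeping earlier ones.

    `attrs` is a plain dict standing in for the file's attrs.
    """
    current = " ".join(argv)
    if "cmd" not in attrs:
        # First run: store a single string.
        attrs["cmd"] = current
        return
    previous = attrs["cmd"]
    if isinstance(previous, str):
        previous = [previous]
    # Storing the newest entry last is an assumption of this sketch.
    attrs["cmd"] = list(previous) + [current]


def last_command_line(attrs):
    """Return the most recently recorded command line."""
    cmd = attrs["cmd"]
    return cmd if isinstance(cmd, str) else cmd[-1]


attrs = {}
# Hypothetical command lines for an initial run and a resumed run.
record_command_line(attrs, ["run_sampler", "--config", "run.ini"])
record_command_line(attrs, ["run_sampler", "--config", "run.ini", "--resume"])
print(attrs["cmd"])
print(last_command_line(attrs))
```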

write_data(strain_dict=None, stilde_dict=None, psd_dict=None, low_frequency_cutoff_dict=None, group=None)[source]

Writes the strain/stilde/psd.

Parameters:
  • strain_dict ({None, dict}) – A dictionary of strains. If None, no strain will be written.
  • stilde_dict ({None, dict}) – A dictionary of stilde. If None, no stilde will be written.
  • psd_dict ({None, dict}) – A dictionary of psds. If None, no psds will be written.
  • low_frequency_cutoff_dict ({None, dict}) – A dictionary of low frequency cutoffs used for each detector in psd_dict; must be provided if psd_dict is not None.
  • group ({None, str}) – The group to write the strain to. If None, will write to the top level.
write_injections(injection_file, ifo)[source]

Writes injection parameters for an IFO to file.

Parameters:
  • injection_file (str) – Path to HDF injection file.
  • ifo (str) – IFO name.
write_psd(psds, low_frequency_cutoff, group=None)[source]

Writes PSD for each IFO to file.

Parameters:
  • psds ({dict, FrequencySeries}) – A dict of FrequencySeries where the key is the IFO.
  • low_frequency_cutoff ({dict, float}) – A dict of the low-frequency cutoff where the key is the IFO. The minimum value will be stored as an attr in the File.
  • group ({None, str}) – The group to write the PSDs to. If None, will write to the top level.
write_random_state(group=None, state=None)[source]

Writes the state of the random number generator to the file.

Parameters:
  • group (str) – Name of group to write random state to.
  • state (tuple, optional) – Specify the random state to write. If None, will use numpy.random.get_state().
write_resume_point()[source]

Keeps a list of the number of iterations that were in a file when a run was resumed from a checkpoint.
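
One way to picture this bookkeeping, together with the resume_points property, is the sketch below. It uses a plain dict in place of the file’s attrs; the helper names and the attribute key "resume_points" are assumptions made for illustration.

```python
def record_resume_point(attrs, niterations):
    """Append the current iteration count to attrs['resume_points']."""
    # Start from an empty list if the run has never resumed before.
    points = list(attrs.get("resume_points", []))
    points.append(niterations)
    attrs["resume_points"] = points


def get_resume_points(attrs):
    """Return the stored resume points; KeyError if the run never resumed."""
    return attrs["resume_points"]


attrs = {}
# The run is resumed twice, with 1000 and then 2500 iterations on disk.
record_resume_point(attrs, 1000)
record_resume_point(attrs, 2500)
print(get_resume_points(attrs))
```

A file that was never resumed has no such entry, which matches the KeyError documented for resume_points.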

write_stilde(stilde_dict, group=None)[source]

Writes stilde for each IFO to file.

Parameters:
  • stilde_dict ({dict, FrequencySeries}) – A dict of FrequencySeries where the key is the IFO.
  • group ({None, str}) – The group to write the stilde to. If None, will write to the top level.
write_strain(strain_dict, group=None)[source]

Writes strain for each IFO to file.

Parameters:
  • strain_dict ({dict, FrequencySeries}) – A dict of FrequencySeries where the key is the IFO.
  • group ({None, str}) – The group to write the strain to. If None, will write to the top level.
gwin.io.hdf.check_integrity(filename)[source]

Checks the integrity of an InferenceFile.

Checks done are:

  • can the file open?
  • do all of the datasets in the samples group have the same shape?
  • can the first and last sample in all of the datasets in the samples group be read?

If any of these checks fail, an IOError is raised.

Parameters:

filename (str) – Name of an InferenceFile to check.

Raises:
  • ValueError – If the given file does not exist.
  • KeyError – If the samples group does not exist.
  • IOError – If any of the checks fail.
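
The shape and readability checks can be sketched without an HDF file by using nested lists in place of datasets. Everything here (check_samples, shape_of, and the dictionary layout) is a hypothetical stand-in; the real function opens the named file and inspects its samples group.

```python
def shape_of(data):
    """Return the shape of a rectangular nested list as a tuple."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)


def check_samples(groups, samples_group="samples"):
    """Run the shape and first/last-sample checks on a stand-in file."""
    if samples_group not in groups:
        raise KeyError("samples group not found")
    datasets = groups[samples_group]
    # Every dataset in the samples group must have the same shape.
    shapes = {name: shape_of(data) for name, data in datasets.items()}
    if len(set(shapes.values())) > 1:
        raise IOError("datasets in the samples group differ in shape")
    # The first and last sample of every dataset must be readable.
    for name, data in datasets.items():
        try:
            data[0], data[-1]
        except IndexError:
            raise IOError("cannot read first/last sample of %s" % name)


good = {"samples": {"mass1": [[1.0, 2.0], [3.0, 4.0]],
                    "mass2": [[5.0, 6.0], [7.0, 8.0]]}}
check_samples(good)  # passes silently

bad = {"samples": {"mass1": [[1.0, 2.0]],
                   "mass2": [[5.0, 6.0], [7.0, 8.0]]}}
try:
    check_samples(bad)
    mismatch_detected = False
except IOError:
    mismatch_detected = True
print(mismatch_detected)
```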