gwin.sampler.emcee module¶

This module provides classes and functions for using the emcee sampler package for parameter estimation.

class gwin.sampler.emcee.EmceeEnsembleSampler(model, nwalkers, pool=None, model_call=None)[source]¶

Bases: gwin.sampler.base.BaseMCMCSampler

This class is used to construct an MCMC sampler from the emcee package’s EnsembleSampler.

Parameters:

model (model) – A model from `gwin.models`.
nwalkers (int) – Number of walkers to use in the sampler.
pool (function with map, optional) – A provider of a map function that allows a function call to be run over multiple sets of arguments and possibly maps them to cores/nodes/etc.
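As an illustration of what "a provider of a map function" can look like, the sketch below defines a minimal serial pool. `SerialPool` and `log_posterior` are hypothetical names used only for this example and are not part of gwin; `multiprocessing.Pool` from the standard library exposes a `map` method of the same form.

```python
class SerialPool:
    """Hypothetical stand-in for a pool: any object exposing ``map``."""

    def map(self, func, iterable):
        # Evaluate func on each set of arguments, one after another.
        return list(map(func, iterable))


def log_posterior(position):
    # Toy stand-in for a model call: an unnormalised 1-D Gaussian.
    return -0.5 * position ** 2


pool = SerialPool()
values = pool.map(log_posterior, [0.0, 1.0, 2.0])
print(values)
```

A parallel pool would distribute the same calls over cores or nodes; the sampler only needs the `map` interface.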

chain¶: Get all past samples as an nwalkers x niterations x ndim array.

clear_chain()[source]¶: Clears the chain and blobs from memory.

classmethod from_cli(opts, model, pool=None, model_call=None)[source]¶

Create an instance of this sampler from the given command-line options.

Parameters:	opts (ArgumentParser options) – The options to parse. model (LikelihoodEvaluator) – The model to use with the sampler.
Returns:	An emcee sampler initialized based on the given arguments.
Return type:	EmceeEnsembleSampler

lnpost¶: Get the natural logarithm of the posterior (likelihood + prior) as an nwalkers x niterations array.

name = 'emcee'¶

run(niterations, **kwargs)[source]¶

Advance the ensemble for a number of samples.

Parameters:	niterations (int) – Number of samples to get from sampler.
Returns:

p (numpy.array) – An array of current walker positions with shape (nwalkers, ndim).
lnpost (numpy.array) – The log posterior probabilities for the walkers at positions p, with shape (nwalkers,).
rstate – The current state of the random number generator.

set_p0(samples_file=None, prior=None)[source]¶

Sets the initial position of the walkers.

Parameters:	samples_file (InferenceFile, optional) – If provided, use the last iteration in the given file for the starting positions. prior (JointDistribution, optional) – Use the given prior to set the initial positions rather than `model`’s prior.
Returns:	p0 – An nwalkers x ndim array of the initial positions that were set.
Return type:	array
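To picture the nwalkers x ndim array of starting positions, here is a minimal sketch that draws each walker from independent uniform priors. The function `draw_p0`, the parameter names, and the bounds are assumptions made for illustration; the actual method draws from the model's (or the supplied) prior distribution.

```python
import random


def draw_p0(bounds, nwalkers, seed=None):
    """Draw one starting position per walker from uniform priors.

    bounds maps a parameter name to its (low, high) range.
    Returns a nested list with shape nwalkers x ndim.
    """
    rng = random.Random(seed)
    names = sorted(bounds)  # fix a parameter order for the columns
    return [
        [rng.uniform(*bounds[name]) for name in names]
        for _ in range(nwalkers)
    ]


# Hypothetical parameters and ranges.
bounds = {"distance": (100.0, 1000.0), "mass1": (10.0, 80.0), "mass2": (10.0, 80.0)}
p0 = draw_p0(bounds, nwalkers=8, seed=0)
print(len(p0), len(p0[0]))
```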

set_state_from_file(fp)[source]¶: Sets the state of the sampler back to the instance saved in a file.

write_results(fp, start_iteration=None, max_iterations=None, **metadata)[source]¶

Writes metadata, samples, model stats, and acceptance fraction to the given file. See the write function for each of those for details.

Parameters:

fp (InferenceFile) – A file handler to an open inference file.
start_iteration (int, optional) – Write results to the file’s datasets starting at the given iteration. Default is to append after the last iteration in the file.
max_iterations (int, optional) – Set the maximum size that the arrays in the hdf file may be resized to. Only applies if the samples have not previously been written to file. The default (None) is to use the maximum size allowed by h5py.
**metadata – All other keyword arguments are passed to write_metadata.
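The effect of `start_iteration` and `max_iterations` can be sketched with a plain list standing in for a resizable dataset. `write_iterations` is a hypothetical helper, not the gwin implementation, and it models `max_iterations` as a simple cap on the dataset length, whereas in an HDF file the maximum size is fixed when the dataset is created.

```python
def write_iterations(dataset, new_samples, start_iteration=None,
                     max_iterations=None):
    """Write new_samples into dataset, one entry per iteration."""
    dataset = list(dataset)  # leave the caller's data untouched
    if start_iteration is None:
        # Default: append after the last iteration already stored.
        start_iteration = len(dataset)
    end = start_iteration + len(new_samples)
    if max_iterations is not None and end > max_iterations:
        raise ValueError("write would exceed max_iterations")
    if end > len(dataset):
        # Grow the dataset so the new samples fit.
        dataset.extend([None] * (end - len(dataset)))
    dataset[start_iteration:end] = new_samples
    return dataset


appended = write_iterations([1, 2, 3], [4, 5])
overwritten = write_iterations([1, 2, 3], [9, 9], start_iteration=1)
capped = write_iterations([1, 2, 3], [4, 5], max_iterations=5)
```

Omitting `start_iteration` appends, while passing it overwrites from that iteration onward.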

write_state(fp)[source]¶: Saves the state of the sampler in a file.

class gwin.sampler.emcee.EmceePTSampler(model, ntemps, nwalkers, pool=None, model_call=None)[source]¶

Bases: gwin.sampler.base.BaseMCMCSampler

This class is used to construct a parallel-tempered MCMC sampler from the emcee package’s PTSampler.

Parameters:

model (model) – A model from `gwin.models`.
ntemps (int) – Number of temperatures to use in the sampler.
nwalkers (int) – Number of walkers to use in the sampler.
pool (function with map, optional) – A provider of a map function that allows a function call to be run over multiple sets of arguments and possibly maps them to cores/nodes/etc.

classmethod calculate_logevidence(fp, thin_start=None, thin_end=None, thin_interval=None)[source]¶

Calculates the log evidence from the given file using emcee’s thermodynamic integration.

Parameters:

fp (InferenceFile) – An open file handler to read the stats from.
thin_start (int) – Index of the sample to begin returning stats. Default is to read stats after burn in. To start from the beginning set thin_start to 0.
thin_interval (int) – Interval to accept every i-th sample. Default is to use the fp.acl. If fp.acl is not set, then use all stats (set thin_interval to 1).
thin_end (int) – Index of the last sample to read. If not given then fp.niterations is used.

Returns:

lnZ (float) – The estimate of log of the evidence.
dlnZ (float) – The error on the estimate.
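Thermodynamic integration estimates the log evidence as the integral of the mean log likelihood over inverse temperature, ln Z = ∫ <ln L>_β dβ for β from 0 to 1. The sketch below approximates that integral with the trapezoid rule and takes the difference from a coarser temperature ladder as a rough error. It is an illustration under these assumptions (including a ladder that spans β = 0 to β = 1), not emcee's exact routine, and the function names are hypothetical.

```python
def trapezoid(xs, ys):
    """Trapezoid-rule integral of ys over xs (xs in increasing order)."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


def thermodynamic_log_evidence(betas, mean_loglikes):
    """Return (lnZ, dlnZ) from the mean log likelihood at each beta = 1/T."""
    pairs = sorted(zip(betas, mean_loglikes))
    xs = [beta for beta, _ in pairs]
    ys = [loglike for _, loglike in pairs]
    lnz = trapezoid(xs, ys)

    # Repeat on a coarser ladder (every other temperature, endpoints kept)
    # and use the difference as a rough error estimate.
    keep = list(range(0, len(xs), 2))
    if keep[-1] != len(xs) - 1:
        keep.append(len(xs) - 1)
    lnz_coarse = trapezoid([xs[i] for i in keep], [ys[i] for i in keep])
    return lnz, abs(lnz - lnz_coarse)


# Toy input: mean log likelihood equal to beta squared.
betas = [1.0, 0.5, 0.25, 0.0]
mean_loglikes = [beta ** 2 for beta in betas]
lnZ, dlnZ = thermodynamic_log_evidence(betas, mean_loglikes)
```

For this toy input the exact integral is 1/3, so the gap between the estimate and the exact value shows how a sparse ladder limits accuracy.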

chain¶: Get all past samples as an ntemps x nwalkers x niterations x ndim array.

clear_chain()[source]¶: Clears the chain and blobs from memory.

classmethod compute_acfs(fp, start_index=None, end_index=None, per_walker=False, walkers=None, parameters=None, temps=None)[source]¶

Computes the autocorrelation function of the model params in the given file.

By default, parameter values are averaged over all walkers at each iteration. The ACF is then calculated over the averaged chain for each temperature. An ACF per-walker will be returned instead if per_walker=True.

Parameters:

fp (InferenceFile) – An open file handler to read the samples from.
start_index ({None, int}) – The start index to compute the ACF from. If None, will try to use the number of burn-in iterations in the file; otherwise, will start at the first sample.
end_index ({None, int}) – The end index to compute the ACF to. If None, will go to the end of the current iteration.
per_walker (optional, bool) – Return the ACF for each walker separately. Default is False.
walkers (optional, int or array) – Calculate the ACF using only the given walkers. If None (the default) all walkers will be used.
parameters (optional, str or array) – Calculate the ACF for only the given parameters. If None (the default) will calculate the ACF for all of the model params.
temps (optional, (list of) int or 'all') – The temperature index (or list of indices) to retrieve. If None (the default), the ACF will only be computed for the coldest (= 0) temperature chain. To compute an ACF for all temperatures pass 'all', or a list of all of the temperatures.

Returns:	A `FieldArray` of the ACF vs iteration for each parameter. If `per_walker` is True, the FieldArray will have shape `ntemps x nwalkers x niterations`. Otherwise, the returned array will have shape `ntemps x niterations`.
Return type:	FieldArray
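The default behaviour (average over walkers, then compute the ACF of the averaged chain) can be sketched for a single parameter at a single temperature as follows. This uses a direct sum with the ACF normalised to 1 at zero lag; the estimator and normalisation used by the library may differ, and the function names are hypothetical.

```python
def walker_average(chain):
    """Average chain[walker][iteration] over walkers at each iteration."""
    nwalkers = len(chain)
    niterations = len(chain[0])
    return [
        sum(chain[w][i] for w in range(nwalkers)) / nwalkers
        for i in range(niterations)
    ]


def autocorrelation(series):
    """Autocorrelation of series at every lag, normalised so acf[0] == 1."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    variance = sum(c * c for c in centred)
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / variance
        for lag in range(n)
    ]


# Two walkers, four iterations of one parameter.
chain = [[0.0, 1.0, 0.0, 1.0], [2.0, 3.0, 2.0, 3.0]]
averaged = walker_average(chain)
acf = autocorrelation(averaged)
```

With `per_walker=True` the same ACF would instead be computed on each walker's series separately.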

classmethod compute_acls(fp, start_index=None, end_index=None)[source]¶

Computes the autocorrelation length for all model params and temperatures in the given file.

Parameter values are averaged over all walkers at each iteration and temperature. The ACL is then calculated over the averaged chain. If the returned ACL is inf, will default to the number of current iterations.

Parameters:

fp (InferenceFile) – An open file handler to read the samples from.
start_index ({None, int}) – The start index to compute the ACL from. If None, will try to use the number of burn-in iterations in the file; otherwise, will start at the first sample.
end_index ({None, int}) – The end index to compute the ACL to. If None, will go to the end of the current iteration.
Returns:	A dictionary of ntemps-long arrays of the ACLs of each parameter.
Return type:	dict
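The shape of the returned dictionary and the fallback for infinite ACLs can be pictured with plain Python containers. The helper name, the parameter names, and the values below are made up for illustration.

```python
import math


def replace_infinite_acls(acls, niterations):
    """Replace any infinite ACL with the number of iterations so far.

    acls maps each parameter name to an ntemps-long list of ACLs.
    """
    return {
        param: [niterations if math.isinf(acl) else acl for acl in per_temp]
        for param, per_temp in acls.items()
    }


# Hypothetical ACLs for two parameters at three temperatures.
raw_acls = {
    "mass1": [12.0, 8.0, float("inf")],
    "mass2": [15.0, 9.0, 4.0],
}
acls = replace_infinite_acls(raw_acls, niterations=500)
```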

classmethod from_cli(opts, model, pool=None, model_call=None)[source]¶

Create an instance of this sampler from the given command-line options.

Parameters:	opts (ArgumentParser options) – The options to parse. model (LikelihoodEvaluator) – The model to use with the sampler.
Returns:	An emcee sampler initialized based on the given arguments.
Return type:	EmceePTSampler

lnpost¶: Get the natural logarithm of the likelihood + the prior as an ntemps x nwalkers x niterations array.

model_stats¶: Returns the log likelihood ratio and log prior as a FieldArray. The returned array has shape ntemps x nwalkers x niterations.

name = 'emcee_pt'¶

ntemps¶

static read_acceptance_fraction(fp, temps=None, walkers=None)[source]¶

Reads the acceptance fraction from the given file.

Parameters:

fp (InferenceFile) – An open file handler to read the samples from.
temps ({None, (list of) int}) – The temperature index (or a list of indices) to retrieve. If None, acceptance fractions from all temperatures and all walkers will be retrieved.
walkers ({None, (list of) int}) – The walker index (or a list of indices) to retrieve. If None, acceptance fractions from all walkers will be obtained.
Returns:	Array of acceptance fractions with shape (requested temps, requested walkers).
Return type:	array

classmethod read_samples(fp, parameters, thin_start=None, thin_interval=None, thin_end=None, iteration=None, temps=0, walkers=None, flatten=True, samples_group=None, array_class=None)[source]¶

Reads samples for the given parameter(s).

Parameters:

fp (InferenceFile) – An open file handler to read the samples from.
parameters ((list of) strings) – The parameter(s) to retrieve. A parameter can be the name of any field in `fp[fp.samples_group]`, a virtual field or method of `FieldArray` (as long as the file contains the necessary fields to derive the virtual field or method), and/or a function of these.
thin_start (int) – Index of the sample to begin returning samples. Default is to read samples after burn in. To start from the beginning set thin_start to 0.
thin_interval (int) – Interval to accept every i-th sample. Default is to use the `fp.acl`. If `fp.acl` is not set, then use all samples (set thin_interval to 1).
thin_end (int) – Index of the last sample to read. If not given then `fp.niterations` is used.
iteration (int) – Get a single iteration. If provided, will override the `thin_{start/interval/end}` arguments.
walkers ({None, (list of) int}) – The walker index (or a list of indices) to retrieve. If None, samples from all walkers will be obtained.
temps ({None, (list of) int, 'all'}) – The temperature index (or list of indices) to retrieve. If None, only samples from the coldest (= 0) temperature chain will be retrieved. To retrieve all temperatures pass 'all', or a list of all of the temperatures.
flatten ({True, bool}) – If True, the returned array will be one dimensional, with all desired samples from all desired walkers concatenated together. If False, the returned array will have dimension requested temps x requested walkers x requested iterations.
samples_group ({None, str}) – The group in `fp` from which to retrieve the parameter fields. If None, searches in `fp.samples_group`.
array_class ({None, array class}) – The type of array to return. The class must have a `from_kwargs` class method and a `parse_parameters` method. If None, will return a FieldArray.

Returns:	Samples for the given parameters, as an instance of the given `array_class` (`FieldArray` if `array_class` is None).
Return type:	array_class
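The thinning and `flatten` options can be sketched on a nested list indexed as temps x walkers x iterations. `select_samples` is a hypothetical helper: it takes explicit thinning values rather than the burn-in and `fp.acl` defaults, and it treats `thin_end` as an exclusive Python slice bound, which is an assumption.

```python
def select_samples(samples, thin_start=0, thin_interval=1, thin_end=None,
                   flatten=True):
    """Thin samples[temp][walker][iteration] along the iteration axis."""
    thinned = [
        [walker[thin_start:thin_end:thin_interval] for walker in temp]
        for temp in samples
    ]
    if not flatten:
        # Keep the temps x walkers x iterations structure.
        return thinned
    # Concatenate every retained sample into one flat list.
    return [value for temp in thinned for walker in temp for value in walker]


# One temperature, two walkers, six iterations.
samples = [[[0, 1, 2, 3, 4, 5], [10, 11, 12, 13, 14, 15]]]
flat = select_samples(samples, thin_start=2, thin_interval=2)
nested = select_samples(samples, thin_start=2, thin_interval=2, flatten=False)
```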

run(niterations, **kwargs)[source]¶

Advance the ensemble for a number of samples.

Parameters:	niterations (int) – Number of samples to get from sampler.
Returns:

p (numpy.array) – An array of current walker positions with shape (ntemps, nwalkers, ndim).
lnpost (numpy.array) – The log posterior probabilities for the walkers at positions p, with shape (ntemps, nwalkers).
rstate – The current state of the random number generator.

set_p0(samples_file=None, prior=None)[source]¶

Sets the initial position of the walkers.

Parameters:	samples_file (InferenceFile, optional) – If provided, use the last iteration in the given file for the starting positions. prior (JointDistribution, optional) – Use the given prior to set the initial positions rather than `model`’s prior.
Returns:	p0 – An ntemps x nwalkers x ndim array of the initial positions that were set.
Return type:	array

write_acceptance_fraction(fp)[source]¶

Write acceptance_fraction data to file. Results are written to fp[acceptance_fraction/temp{k}] where k is the temperature.

Parameters:	fp (InferenceFile) – A file handler to an open inference file.

write_metadata(fp, **kwargs)[source]¶

Writes metadata about this sampler to the given file. Metadata is written to the file’s attrs.

Parameters:

fp (InferenceFile) – A file handler to an open inference file.
**kwargs – All keyword arguments are saved as separate arguments in the file attrs. If any keyword argument is a dictionary, the keyword will point to the list of keys in the file's `attrs`. Each key is then stored as a separate attr with its corresponding value.
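The handling of dictionary-valued keyword arguments can be sketched with a plain dict standing in for the file's `attrs`. `store_metadata` and the example keywords are hypothetical and only mirror the behaviour described above.

```python
def store_metadata(attrs, **kwargs):
    """Save keyword arguments into attrs, expanding dictionary values."""
    for key, value in kwargs.items():
        if isinstance(value, dict):
            # The keyword points to the list of keys...
            attrs[key] = list(value.keys())
            # ...and each key is stored as its own attr.
            for subkey, subvalue in value.items():
                attrs[subkey] = subvalue
        else:
            attrs[key] = value
    return attrs


attrs = store_metadata(
    {},
    nwalkers=100,
    static_args={"f_lower": 20.0, "approximant": "IMRPhenomPv2"},
)
```

Reading the metadata back then takes two steps: look up the keyword to get the key names, then look up each name.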

write_results(fp, start_iteration=None, max_iterations=None, **metadata)[source]¶

Writes metadata, samples, model stats, and acceptance fraction to the given file. See the write function for each of those for details.

Parameters:

fp (InferenceFile) – A file handler to an open inference file.
start_iteration (int, optional) – Write results to the file’s datasets starting at the given iteration. Default is to append after the last iteration in the file.
max_iterations (int, optional) – Set the maximum size that the arrays in the hdf file may be resized to. Only applies if the samples have not previously been written to file. The default (None) is to use the maximum size allowed by h5py.
**metadata – All other keyword arguments are passed to write_metadata.

static write_samples_group(fp, samples_group, parameters, samples, start_iteration=None, max_iterations=None)[source]¶

Writes samples to the given file.

Results are written to:

fp[samples_group/{vararg}],

where {vararg} is the name of a variable arg. The samples are written as an ntemps x nwalkers x niterations array.

Parameters:

fp (InferenceFile) – A file handler to an open inference file.
samples_group (str) – Name of samples group to write.
parameters (list) – The parameters to write to the file.
samples (FieldArray) – The samples to write. Should be a FieldArray with fields containing the samples to write and shape nwalkers x niterations.
start_iteration (int, optional) – Write results to the file’s datasets starting at the given iteration. Default is to append after the last iteration in the file.
max_iterations (int, optional) – Set the maximum size that the arrays in the hdf file may be resized to. Only applies if the samples have not previously been written to file. The default (None) is to use the maximum size allowed by h5py.
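The dataset layout described above, one dataset per parameter under `samples_group`, can be pictured with a dict keyed by dataset path. `store_samples`, the group name, and the parameter names are assumptions for illustration, and the nested lists follow the ntemps x nwalkers x niterations layout stated in the description.

```python
def store_samples(store, samples_group, parameters, samples):
    """Write samples[name] to store under '<samples_group>/<name>'."""
    for name in parameters:
        store["{}/{}".format(samples_group, name)] = samples[name]
    return store


# One temperature, two walkers, two iterations per parameter.
samples = {
    "mass1": [[[30.1, 30.4], [29.8, 30.0]]],
    "mass2": [[[25.2, 25.1], [24.9, 25.0]]],
}
store = store_samples({}, "samples", ["mass1", "mass2"], samples)
```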