gwin.sampler.emcee module¶
This module provides classes and functions for using the emcee sampler package for parameter estimation.
-
class
gwin.sampler.emcee.
EmceeEnsembleSampler
(model, nwalkers, pool=None, model_call=None)[source]¶ Bases:
gwin.sampler.base.BaseMCMCSampler
This class is used to construct an MCMC sampler from the emcee package’s EnsembleSampler.
Parameters: - model (model) – A model from gwin.models.
- nwalkers (int) – Number of walkers to use in sampler.
- pool (function with map, Optional) – A provider of a map function that allows a function call to be run over multiple sets of arguments and possibly maps them to cores/nodes/etc.
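The pool argument only needs to expose a map method. A minimal sketch of such a provider follows, assuming a hypothetical SerialPool class and a toy log_density function (neither is part of gwin). It evaluates everything in the current process; multiprocessing.Pool from the standard library offers the same map interface across cores.

```python
class SerialPool:
    """Hypothetical stand-in for a pool: applies a function serially."""

    def map(self, func, iterable):
        # Same calling convention as multiprocessing.Pool.map:
        # one function, many sets of arguments, results in input order.
        return [func(args) for args in iterable]


def log_density(position):
    # Toy function standing in for a model evaluation at one walker position.
    return -0.5 * sum(x * x for x in position)


pool = SerialPool()
values = pool.map(log_density, [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)])
```

Any object with this interface could be swapped in without changing the calling code, which is the point of accepting a map provider rather than a specific pool type.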
-
chain
¶ Get all past samples as an nwalkers x niterations x ndim array.
-
classmethod
from_cli
(opts, model, pool=None, model_call=None)[source]¶ Create an instance of this sampler from the given command-line options.
Parameters: - opts (ArgumentParser options) – The options to parse.
- model (LikelihoodEvaluator) – The model to use with the sampler.
Returns: An emcee sampler initialized based on the given arguments.
Return type: EmceeEnsembleSampler
-
lnpost
¶ Get the natural logarithm of the posterior as an nwalkers x niterations array.
-
name
= 'emcee'¶
-
run
(niterations, **kwargs)[source]¶ Advance the ensemble for a number of samples.
Parameters: niterations (int) – Number of samples to get from sampler.
Returns: - p (numpy.array) – An array of current walker positions with shape (nwalkers, ndim).
- lnpost (numpy.array) – The list of log posterior probabilities for the walkers at positions p, with shape (nwalkers,).
- rstate – The current state of the random number generator.
-
set_p0
(samples_file=None, prior=None)[source]¶ Sets the initial position of the walkers.
Parameters: - samples_file (InferenceFile, optional) – If provided, use the last iteration in the given file for the starting positions.
- prior (JointDistribution, optional) – Use the given prior to set the initial positions rather than model’s prior.
Returns: p0 – An nwalkers x ndim array of the initial positions that were set.
Return type: array
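As a rough illustration of setting initial positions from a prior, the sketch below draws an nwalkers x ndim set of starting points. The function name draw_p0 and the bounds argument are hypothetical; a list of (low, high) pairs stands in for a uniform prior and is not gwin's JointDistribution.

```python
import random


def draw_p0(bounds, nwalkers, seed=None):
    """Draw an nwalkers x ndim nested list of starting positions.

    `bounds` is a hypothetical list of (low, high) pairs, one per
    parameter, standing in for a uniform prior distribution.
    """
    rng = random.Random(seed)
    return [[rng.uniform(low, high) for (low, high) in bounds]
            for _ in range(nwalkers)]


p0 = draw_p0([(0.0, 1.0), (-5.0, 5.0)], nwalkers=4, seed=0)
```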
-
set_state_from_file
(fp)[source]¶ Sets the state of the sampler back to the instance saved in a file.
-
write_results
(fp, start_iteration=None, max_iterations=None, **metadata)[source]¶ Writes metadata, samples, model stats, and acceptance fraction to the given file. See the write function for each of those for details.
Parameters: - fp (InferenceFile) – A file handler to an open inference file.
- start_iteration (int, optional) – Write results to the file’s datasets starting at the given iteration. Default is to append after the last iteration in the file.
- max_iterations (int, optional) – Set the maximum size that the arrays in the hdf file may be resized to. Only applies if the samples have not previously been written to file. The default (None) is to use the maximum size allowed by h5py.
- **metadata – All other keyword arguments are passed to write_metadata.
-
class
gwin.sampler.emcee.
EmceePTSampler
(model, ntemps, nwalkers, pool=None, model_call=None)[source]¶ Bases:
gwin.sampler.base.BaseMCMCSampler
This class is used to construct a parallel-tempered MCMC sampler from the emcee package’s PTSampler.
Parameters: - model (model) – A model from gwin.models.
- ntemps (int) – Number of temperatures to use in the sampler.
- nwalkers (int) – Number of walkers to use in sampler.
- pool (function with map, Optional) – A provider of a map function that allows a function call to be run over multiple sets of arguments and possibly maps them to cores/nodes/etc.
-
classmethod
calculate_logevidence
(fp, thin_start=None, thin_end=None, thin_interval=None)[source]¶ Calculates the log evidence from the given file using emcee’s thermodynamic integration.
Parameters: - fp (InferenceFile) – An open file handler to read the stats from.
- thin_start (int) – Index of the sample to begin returning stats. Default is to read stats after burn in. To start from the beginning set thin_start to 0.
- thin_interval (int) – Interval to accept every i-th sample. Default is to use fp.acl. If fp.acl is not set, then use all stats (set thin_interval to 1).
- thin_end (int) – Index of the last sample to read. If not given then fp.niterations is used.
Returns: - lnZ (float) – The estimate of log of the evidence.
- dlnZ (float) – The error on the estimate.
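Thermodynamic integration estimates the log evidence as the integral, over inverse temperature beta from 0 to 1, of the average log likelihood at each beta. The sketch below is a generic trapezoid-rule version and should not be read as emcee's exact implementation. The function name, the input layout, and the error heuristic (repeating the integral on every other temperature) are assumptions made for illustration.

```python
def thermodynamic_integration(betas, mean_loglikes):
    """Estimate ln Z as the integral over beta of the mean log likelihood.

    betas: inverse temperatures, sorted in increasing order.
    mean_loglikes: average log likelihood of the chain at each beta.
    """
    betas = list(betas)
    mean_loglikes = list(mean_loglikes)

    def trapezoid(xs, ys):
        return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
                   for i in range(len(xs) - 1))

    lnz = trapezoid(betas, mean_loglikes)
    # Error heuristic (an assumption of this sketch): redo the integral
    # using every other temperature and take the difference.
    coarse_b = betas[::2]
    coarse_l = mean_loglikes[::2]
    if coarse_b[-1] != betas[-1]:
        # Keep the final grid point so both integrals cover the same range.
        coarse_b.append(betas[-1])
        coarse_l.append(mean_loglikes[-1])
    dlnz = abs(lnz - trapezoid(coarse_b, coarse_l))
    return lnz, dlnz


betas = [0.0, 0.25, 0.5, 0.75, 1.0]
lnz, dlnz = thermodynamic_integration(betas, [-10.0] * len(betas))
```

With a constant mean log likelihood the integral is just that constant, which makes this input a convenient sanity check.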
-
chain
¶ Get all past samples as an ntemps x nwalkers x niterations x ndim array.
-
classmethod
compute_acfs
(fp, start_index=None, end_index=None, per_walker=False, walkers=None, parameters=None, temps=None)[source]¶ Computes the autocorrelation function of the model params in the given file.
By default, parameter values are averaged over all walkers at each iteration. The ACF is then calculated over the averaged chain for each temperature. An ACF per-walker will be returned instead if per_walker=True.
Parameters: - fp (InferenceFile) – An open file handler to read the samples from.
- start_index ({None, int}) – The start index to compute the acl from. If None, will try to use the number of burn-in iterations in the file; otherwise, will start at the first sample.
- end_index ({None, int}) – The end index to compute the acl to. If None, will go to the end of the current iteration.
- per_walker (optional, bool) – Return the ACF for each walker separately. Default is False.
- walkers (optional, int or array) – Calculate the ACF using only the given walkers. If None (the default) all walkers will be used.
- parameters (optional, str or array) – Calculate the ACF for only the given parameters. If None (the default) will calculate the ACF for all of the model params.
- temps (optional, (list of) int or 'all') – The temperature index (or list of indices) to retrieve. If None (the default), the ACF will only be computed for the coldest (= 0) temperature chain. To compute an ACF for all temperatures pass ‘all’, or a list of all of the temperatures.
Returns: A FieldArray of the ACF vs iteration for each parameter. If per_walker is True, the FieldArray will have shape ntemps x nwalkers x niterations. Otherwise, the returned array will have shape ntemps x niterations.
Return type: FieldArray
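The default behaviour described above (average over walkers, then compute the ACF of the averaged chain) can be sketched for a single parameter at a single temperature. This uses a plain direct-sum estimator normalised to 1 at zero lag, which is an assumption of the sketch and not necessarily the estimator gwin uses; the function name is hypothetical.

```python
def walker_averaged_acf(chain):
    """chain: nwalkers x niterations nested list for one parameter."""
    nwalkers = len(chain)
    niterations = len(chain[0])
    # Average the parameter over all walkers at each iteration.
    avg = [sum(walker[i] for walker in chain) / nwalkers
           for i in range(niterations)]
    mean = sum(avg) / niterations
    dev = [x - mean for x in avg]
    var = sum(d * d for d in dev)
    if var == 0.0:
        raise ValueError("averaged chain is constant; ACF is undefined")
    # Autocovariance at each lag, normalised by the zero-lag value.
    return [sum(dev[i] * dev[i + lag] for i in range(niterations - lag)) / var
            for lag in range(niterations)]


chain = [[0.0, 1.0, 0.0, 1.0], [2.0, 3.0, 2.0, 3.0]]
acf = walker_averaged_acf(chain)
```

Calling the same function on each walker's row separately would give the per-walker variant.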
-
classmethod
compute_acls
(fp, start_index=None, end_index=None)[source]¶ Computes the autocorrelation length for all model params and temperatures in the given file.
Parameter values are averaged over all walkers at each iteration and temperature. The ACL is then calculated over the averaged chain. If the returned ACL is inf, it defaults to the number of current iterations.
Parameters: - fp (InferenceFile) – An open file handler to read the samples from.
- start_index ({None, int}) – The start index to compute the acl from. If None, will try to use the number of burn-in iterations in the file; otherwise, will start at the first sample.
- end_index ({None, int}) – The end index to compute the acl to. If None, will go to the end of the current iteration.
Returns: A dictionary of ntemps-long arrays of the ACLs of each parameter.
Return type: dict
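The fallback for an infinite ACL and the shape of the returned dictionary can be pictured as follows. The parameter names mass and spin, the helper finalize_acl, and the numbers are all made up for illustration.

```python
import math


def finalize_acl(acl, niterations):
    """Replace an infinite ACL with the number of iterations so far."""
    return niterations if math.isinf(acl) else acl


# One ntemps-long list of ACLs per parameter (two temperatures here).
raw_acls = {"mass": [12.0, float("inf")], "spin": [7.0, 9.0]}
niterations = 500
acls = {param: [finalize_acl(a, niterations) for a in per_temp]
        for param, per_temp in raw_acls.items()}
```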
-
classmethod
from_cli
(opts, model, pool=None, model_call=None)[source]¶ Create an instance of this sampler from the given command-line options.
Parameters: - opts (ArgumentParser options) – The options to parse.
- model (LikelihoodEvaluator) – The model to use with the sampler.
Returns: An emcee sampler initialized based on the given arguments.
Return type: EmceePTSampler
-
lnpost
¶ Get the natural logarithm of the likelihood + the prior as an ntemps x nwalkers x niterations array.
-
model_stats
¶ Returns the log likelihood ratio and log prior as a FieldArray. The returned array has shape ntemps x nwalkers x niterations.
-
name
= 'emcee_pt'¶
-
ntemps
¶
-
static
read_acceptance_fraction
(fp, temps=None, walkers=None)[source]¶ Reads the acceptance fraction from the given file.
Parameters: - fp (InferenceFile) – An open file handler to read the samples from.
- temps ({None, (list of) int}) – The temperature index (or a list of indices) to retrieve. If None, acceptance fractions from all temperatures and all walkers will be retrieved.
- walkers ({None, (list of) int}) – The walker index (or a list of indices) to retrieve. If None, samples from all walkers will be obtained.
Returns: Array of acceptance fractions with shape (requested temps, requested walkers).
Return type: array
-
classmethod
read_samples
(fp, parameters, thin_start=None, thin_interval=None, thin_end=None, iteration=None, temps=0, walkers=None, flatten=True, samples_group=None, array_class=None)[source]¶ Reads samples for the given parameter(s).
Parameters: - fp (InferenceFile) – An open file handler to read the samples from.
- parameters ((list of) strings) – The parameter(s) to retrieve. A parameter can be the name of any field in fp[fp.samples_group], a virtual field or method of FieldArray (as long as the file contains the necessary fields to derive the virtual field or method), and/or a function of these.
- thin_start (int) – Index of the sample to begin returning samples. Default is to read samples after burn in. To start from the beginning set thin_start to 0.
- thin_interval (int) – Interval to accept every i-th sample. Default is to use fp.acl. If fp.acl is not set, then use all samples (set thin_interval to 1).
- thin_end (int) – Index of the last sample to read. If not given then fp.niterations is used.
- iteration (int) – Get a single iteration. If provided, will override the thin_{start/interval/end} arguments.
- walkers ({None, (list of) int}) – The walker index (or a list of indices) to retrieve. If None, samples from all walkers will be obtained.
- temps ({None, (list of) int, 'all'}) – The temperature index (or list of indices) to retrieve. If None, only samples from the coldest (= 0) temperature chain will be retrieved. To retrieve all temperatures pass ‘all’, or a list of all of the temperatures.
- flatten ({True, bool}) – The returned array will be one dimensional, with all desired samples from all desired walkers concatenated together. If False, the returned array will have dimension requested temps x requested walkers x requested iterations.
- samples_group ({None, str}) – The group in fp from which to retrieve the parameter fields. If None, searches in fp.samples_group.
- array_class ({None, array class}) – The type of array to return. The class must have a from_kwargs class method and a parse_parameters method. If None, will return a FieldArray.
Returns: Samples for the given parameters, as an instance of the given array_class (FieldArray if array_class is None).
Return type: array_class
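The effect of the thinning arguments and of flatten can be sketched on a small nwalkers x niterations list for one parameter at one temperature. This is an illustration only: the function name is hypothetical, thin_end is treated here as an exclusive slice bound (an assumption), and the defaults are plain slice defaults, not the burn-in and ACL based defaults described above.

```python
def thin_and_flatten(samples, thin_start=0, thin_interval=1, thin_end=None,
                     flatten=True):
    """samples: nwalkers x niterations nested list for one parameter."""
    # Keep every thin_interval-th iteration between thin_start and thin_end.
    thinned = [walker[thin_start:thin_end:thin_interval]
               for walker in samples]
    if not flatten:
        return thinned
    # Concatenate the kept samples from all walkers into one flat list.
    return [x for walker in thinned for x in walker]


samples = [[0, 1, 2, 3, 4, 5], [10, 11, 12, 13, 14, 15]]
flat = thin_and_flatten(samples, thin_start=2, thin_interval=2)
nested = thin_and_flatten(samples, thin_start=2, thin_interval=2,
                          flatten=False)
```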
-
run
(niterations, **kwargs)[source]¶ Advance the ensemble for a number of samples.
Parameters: niterations (int) – Number of samples to get from sampler.
Returns: - p (numpy.array) – An array of current walker positions with shape (ntemps, nwalkers, ndim).
- lnpost (numpy.array) – The list of log posterior probabilities for the walkers at positions p, with shape (ntemps, nwalkers).
- rstate – The current state of the random number generator.
-
set_p0
(samples_file=None, prior=None)[source]¶ Sets the initial position of the walkers.
Parameters: - samples_file (InferenceFile, optional) – If provided, use the last iteration in the given file for the starting positions.
- prior (JointDistribution, optional) – Use the given prior to set the initial positions rather than model’s prior.
Returns: p0 – An ntemps x nwalkers x ndim array of the initial positions that were set.
Return type: array
-
write_acceptance_fraction
(fp)[source]¶ Write acceptance_fraction data to file. Results are written to fp[acceptance_fraction/temp{k}], where k is the temperature.
Parameters: fp (InferenceFile) – A file handler to an open inference file.
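The per-temperature layout can be pictured with a plain dict standing in for the open file. The function below is a hypothetical sketch, not gwin's method, and it assumes k is the zero-based temperature index.

```python
def sketch_write_acceptance_fraction(store, acceptance_fraction):
    """Store one entry per temperature.

    store: plain dict standing in for an open inference file.
    acceptance_fraction: ntemps x nwalkers nested list.
    """
    for k, per_walker in enumerate(acceptance_fraction):
        # One dataset per temperature index k.
        store["acceptance_fraction/temp{}".format(k)] = list(per_walker)
    return store


store = sketch_write_acceptance_fraction({}, [[0.31, 0.28], [0.45, 0.47]])
```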
-
write_metadata
(fp, **kwargs)[source]¶ Writes metadata about this sampler to the given file. Metadata is written to the file’s attrs.
Parameters: - fp (InferenceFile) – A file handler to an open inference file.
- **kwargs – All keyword arguments are saved as separate arguments in the file attrs. If any keyword argument is a dictionary, the keyword will point to the list of keys in the file’s attrs. Each key is then stored as a separate attr with its corresponding value.
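The handling of dictionary-valued keyword arguments can be sketched with a plain dict standing in for the file's attrs. The function name and the example keywords (nwalkers, static_args, f_lower, approximant) are hypothetical.

```python
def sketch_write_metadata(attrs, **kwargs):
    """Save keyword arguments into attrs, expanding dictionary values."""
    for key, value in kwargs.items():
        if isinstance(value, dict):
            # The keyword itself points to the list of the dictionary's keys.
            attrs[key] = list(value.keys())
            # Each key is then stored as its own attr with its value.
            for subkey, subvalue in value.items():
                attrs[subkey] = subvalue
        else:
            attrs[key] = value
    return attrs


attrs = sketch_write_metadata(
    {}, nwalkers=100,
    static_args={"f_lower": 20.0, "approximant": "IMRPhenomD"})
```

Storing the key list under the original keyword lets a reader reassemble the dictionary later by looking up each listed key.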
-
write_results
(fp, start_iteration=None, max_iterations=None, **metadata)[source]¶ Writes metadata, samples, model stats, and acceptance fraction to the given file. See the write function for each of those for details.
Parameters: - fp (InferenceFile) – A file handler to an open inference file.
- start_iteration (int, optional) – Write results to the file’s datasets starting at the given iteration. Default is to append after the last iteration in the file.
- max_iterations (int, optional) – Set the maximum size that the arrays in the hdf file may be resized to. Only applies if the samples have not previously been written to file. The default (None) is to use the maximum size allowed by h5py.
- **metadata – All other keyword arguments are passed to write_metadata.
-
static
write_samples_group
(fp, samples_group, parameters, samples, start_iteration=None, max_iterations=None)[source]¶ Writes samples to the given file.
Results are written to fp[samples_group/{vararg}], where {vararg} is the name of a variable arg. The samples are written as an ntemps x nwalkers x niterations array.
Parameters: - fp (InferenceFile) – A file handler to an open inference file.
- samples_group (str) – Name of samples group to write.
- parameters (list) – The parameters to write to the file.
- samples (FieldArray) – The samples to write. Should be a FieldArray with fields containing the samples to write and shape nwalkers x niterations.
- start_iteration (int, optional) – Write results to the file’s datasets starting at the given iteration. Default is to append after the last iteration in the file.
- max_iterations (int, optional) – Set the maximum size that the arrays in the hdf file may be resized to. Only applies if the samples have not previously been written to file. The default (None) is to use the maximum size allowed by h5py.
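The roles of start_iteration and max_iterations can be sketched with nested lists standing in for a resizable dataset. This is a simplified illustration with a hypothetical function name: it checks the size cap on every write, whereas the text above says the cap only applies when the samples are first written.

```python
def write_iterations(dataset, new, start_iteration=None, max_iterations=None):
    """dataset, new: nwalkers-long lists of per-walker sample lists."""
    if start_iteration is None:
        # Default: append after the last iteration already stored.
        start_iteration = len(dataset[0])
    end = start_iteration + len(new[0])
    if max_iterations is not None and end > max_iterations:
        raise IndexError("write would exceed max_iterations")
    for stored, incoming in zip(dataset, new):
        # Grow the row if needed, then overwrite the target range in place.
        stored.extend([None] * (end - len(stored)))
        stored[start_iteration:end] = incoming
    return dataset


dataset = [[1, 2], [3, 4]]
write_iterations(dataset, [[5], [6]])                      # append
write_iterations(dataset, [[9], [9]], start_iteration=0)   # overwrite
```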