Contents:
My module for various data analysis tasks.
REQUIREMENTS: numpy, Plotting and analysis tools (for errxy())
2008-07-25 16:20 IJC: Created.
2010-02-18 14:06 IJC: Added medianfilter()
2010-08-03 15:38 IJC: Merged versions.
2011-04-14 09:48 IJMC: Added a few fundamental constants.
Compute the Allan variance on a set of regularly-sampled data (1D).
If the time between samples is dt and there are N total samples, the returned variance spectrum will have frequency indices from 1/dt to (N-1)/dt.
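For orientation, a minimal sketch of one standard way to compute such a spectrum (illustrative only; the routine's exact binning scheme is not shown here): average the data in non-overlapping blocks of n samples, then take half the mean squared difference of successive block means.

import numpy as np

def allan_variance_sketch(data, dt=1.0):
    # Illustrative non-overlapping Allan variance, not the module's actual code.
    data = np.asarray(data, dtype=float)
    N = data.size
    sizes = np.arange(1, N // 2 + 1)
    avar = np.zeros(sizes.size)
    for i, n in enumerate(sizes):
        nblocks = N // n
        means = data[:nblocks * n].reshape(nblocks, n).mean(axis=1)
        avar[i] = 0.5 * np.mean(np.diff(means)**2)   # variance at tau = n*dt
    return sizes * dt, avar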
Median the array over the given axis. If the axis is None, median over all dimensions of the array.
Think of this as normal Numpy median, but preserving dimensionality.
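In modern NumPy the same dimensionality-preserving behavior is available via the keepdims flag, which makes a handy cross-check (np.median here merely illustrates the idea):

import numpy as np
a = np.arange(12).reshape(3, 4)
np.median(a, axis=0)                  # shape (4,): dimensionality reduced
np.median(a, axis=0, keepdims=True)   # shape (1, 4): dimensionality preserved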
Downsample a 2D image.
Takes a 1D vector or 2D array and reduces its resolution by an integer factor “ndown”. This is done by binning the array – i.e., integrating over square blocks of pixels of width “ndown”.
If keyword “axis” is None, bin over all axes. Otherwise, bin over the single specified axis.
Note that ‘ndown’ can also be a sequence: e.g., [2, 1]
EXAMPLE: [img_ds] = binarray(img, ndown)
Convert Chebyshev coefficients to ‘normal’ polyval coefficients.
INPUT: chebyt coefficients
OUTPUT: poly coefficients (e.g., for use w/polyval)
SEE ALSO: poly2cheby(), gpolyval(); scipy.special.chebyt
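NumPy's polynomial package now provides an equivalent conversion, which makes a quick sanity check (assuming low-order-first coefficient ordering; note that the legacy np.polyval expects the reversed order, p[::-1]):

import numpy as np
from numpy.polynomial import chebyshev as C
from numpy.polynomial import polynomial as P
c = [1.0, 2.0, 3.0]       # Chebyshev-series coefficients
p = C.cheb2poly(c)        # ordinary power-series coefficients
x = np.linspace(-1, 1, 5)
np.allclose(C.chebval(x, c), P.polyval(x, p))   # True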
Return the confidence level of a 2D histogram or array that encloses the specified fraction of the total sum.
INPUTS:
OPTIONS:
SEE_ALSO: dumbconf() for 1D distributions
Generate combined time series spectra using planet and star models, planet and star RV profiles.
D = dopspec(sspec, pspec, sRV, pRV, disp, sphase=[], pphase=[])
INPUTS:
sRV, pRV: star, planet radial velocities in m/s
disp: constant logarithmic dispersion of the wavelength scale; that is, the spectra must have [lambda_i / lambda_(i-1)] = disp = constant > 1
OPTIONAL INPUTS:
wlscale: return relative wavelength scale for new data
Positive velocities are directed AWAY from the observer.
Compute the sum of two gaussian distributions at the points x.
p is a six- or seven-component sequence:
p[0] – Area of gaussian A
p[1] – one-sigma dispersion of gaussian A
p[2] – central offset (mean location) of gaussian A
p[3] – Area of gaussian B
p[4] – one-sigma dispersion of gaussian B
p[5] – central offset (mean location) of gaussian B
p[6] – optional constant, vertical offset
NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1
SEE ALSO: gaussian()
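A minimal sketch of evaluating such a model under the parameter layout above (illustrative; written to match the gaussian() formula quoted later in this file):

import numpy as np

def double_gaussian_sketch(p, x):
    # p: [areaA, sigmaA, muA, areaB, sigmaB, muB, (optional offset)]
    x = np.asarray(x, dtype=float)
    gA = p[0] / (p[1] * np.sqrt(2 * np.pi)) * np.exp(-(x - p[2])**2 / (2 * p[1]**2))
    gB = p[3] / (p[4] * np.sqrt(2 * np.pi)) * np.exp(-(x - p[5])**2 / (2 * p[4]**2))
    offset = p[6] if len(p) > 6 else 0.0
    return gA + gB + offset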
Compute the sum of two gaussian distributions at the points x. The distributions have central moments mu1 and mu2.
Useful for fitting to partially blended spectral data.
p is a four- or five-component sequence:
p[0] – Area of gaussian A
p[1] – one-sigma dispersion of gaussian A
p[2] – Area of gaussian B
p[3] – one-sigma dispersion of gaussian B
p[4] – optional constant, vertical offset
mu1 – central offset (mean location) of gaussian A
mu2 – central offset (mean location) of gaussian B
NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1
SEE ALSO: doubleGaussian(), gaussian()
Determine two-sided and one-sided confidence limits, using sorting.
INPUTS:
OPTIONAL INPUTS:
type=’central’ – ‘upper’, ‘lower’, or ‘central’ confidence limits
mid=’mean’ – compute middle with mean or median
SEE_ALSO: confmap() for 2D distributions
EXAMPLES:
from numpy import random
from analysis import dumbconf
x = random.randn(10000)
dumbconf(x, 0.683) # ---> 1.0 (one-sigma)
dumbconf(3*x, 0.954) # ---> 6.0 (two-sigma)
dumbconf(x+2, 0.997, type='lower') # ---> -0.74
dumbconf(x+2, 0.997, type='upper') # ---> 4.7
Some typical values for a Normal (Gaussian) distribution:
type        confidence level
one-sigma   0.6826895
2 sigma     0.9544997
3 sigma     0.9973002
4 sigma     0.9999366
5 sigma     0.9999994
Calculate (Keplerian, orbital) eccentric anomaly.
One optional input must be given.
INPUT: ecc – scalar. orbital eccentricity.
OPTIONAL_INPUTS:
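For context, the task is to solve Kepler's equation M = E - e*sin(E) for the eccentric anomaly E. A minimal Newton-Raphson sketch (one common method; not necessarily the one used here):

import numpy as np

def kepler_newton_sketch(manom, ecc, tol=1e-12, maxiter=100):
    # manom: mean anomaly [radians]; returns eccentric anomaly E [radians].
    manom = np.asarray(manom, dtype=float)
    E = manom.copy()                       # E = M is a good starting guess
    for _ in range(maxiter):
        dE = (E - ecc * np.sin(E) - manom) / (1.0 - ecc * np.cos(E))
        E -= dE
        if np.all(np.abs(dE) < tol):
            break
    return E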
Compute the deviation between the values y and the gaussian defined by p, x:
p is a three- or four-component array, list, or tuple.
Returns: y - p3 - p0/(p1*sqrt(2pi)) * exp(-(x-p2)**2 / (2*p1**2))
If an error array e (typically one-sigma) is supplied, the returned value is divided by e.
SEE ALSO: gaussian()
Return the error associated with a 2D gaussian fit, using gaussian2d.
w is an array of weights, typically 1./sigma**2
Calculates the dropoff in measurement uncertainty with increasing number of samples (a random and uncorrelated set of data should drop off as 1/sqrt(N)).
E(0), the first returned element, is the uncertainty in a single measurement (i.e., the standard deviation).
EXAMPLE: Compute the errors on an array of 3 column vectors.
from pylab import randn, plot, xlim, legend, sqrt
data = randn(1000,3)
e = error_dropoff(data)
plot(e[1], 1./sqrt(e[1]), '-', e[1], e[0], 'x')
xlim([0,30])
legend(['Theoretical [1/sqrt(N)]', 'Computed errors'])
Apply a hard-edged low-pass filter to an input vector.
INPUTS:
vec – sequence – 1D vector, assumed to be evenly sampled
retfilter – bool – if True, return the tuple (filtered_vec, filter)
OUTPUT: Low-pass-filtered version of vec
NOTE: Assumes the input is real-valued.
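A minimal sketch of a hard-edged (“brick-wall”) low-pass for a real-valued, evenly sampled vector; the cutoff convention (fraction of Nyquist) is an assumption here:

import numpy as np

def lopass_sketch(vec, cutoff_frac=0.1, retfilter=False):
    # cutoff_frac: cutoff frequency as a fraction of Nyquist (assumed convention).
    vec = np.asarray(vec, dtype=float)
    freqs = np.fft.rfftfreq(vec.size)                   # 0 .. 0.5 cycles/sample
    filt = (freqs <= cutoff_frac * 0.5).astype(float)   # hard edge
    filtered = np.fft.irfft(np.fft.rfft(vec) * filt, n=vec.size)
    return (filtered, filt) if retfilter else filtered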
Fix non-finite values in a numpy array, and replace them with repval.
INPUT:
arr – numpy array with values to be replaced
repval – value to replace non-finite elements with
OPTIONAL INPUT:
retarr – if False, change values in arr directly (more efficient); if True, return a fixed copy of the input array, which is left unchanged
Example: fixval(arr, -1)
Minimize a function using the downhill simplex algorithm – now with KEYWORDS.
Parameters:
Returns: (xopt, {fopt, iter, funcalls, warnflag})
Other Parameters:
- xtol : float
- Relative error in xopt acceptable for convergence.
- ftol : number
- Relative error in func(xopt) acceptable for convergence.
- maxiter : int
- Maximum number of iterations to perform.
- maxfun : number
- Maximum number of function evaluations to make [200*len(x0)]
- full_output : bool
- Set to True if fval and warnflag outputs are desired.
- disp : bool
- Set to True to print convergence messages.
- retall : bool
- Set to True to return list of solutions at each iteration.
- zdelt : number
- Initial step size used for components of x0 that equal zero.
- nonzdelt : number
- Fractional initial step size used for nonzero components of x0.
- holdfixed : sequence
- Indices of x0 to hold fixed (e.g., [1, 2, 4])
TBD: gprior : tuple or sequence of tuples. Set a gaussian prior on the indicated parameter, such that chisq += ((x0[p] - val)/unc_val)**2, where the parameters are defined by the tuple gprior=(param, val, unc_val)
Notes: Uses a Nelder-Mead simplex algorithm to find the minimum of a function of one or more variables.
Allows me to wrap fmin() within pool.map() for multiprocessing.
EXAMPLE:
NOTES: This must be a separate, stand-alone function in order to be ‘pickleable’, which is required by pool.map().
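The pattern looks roughly like this (a hypothetical sketch; note that the objective must also be a module-level function, since pool.map() pickles its arguments too):

import multiprocessing
from scipy.optimize import fmin

def parabola(x):
    # Objective defined at module level so it is picklable (no lambdas).
    return (x[0] - 3.0)**2

def fmin_helper_sketch(guess):
    # Stand-alone wrapper; this is what pool.map() calls.
    return fmin(parabola, guess, disp=False)

if __name__ == '__main__':
    pool = multiprocessing.Pool(2)
    results = pool.map(fmin_helper_sketch, [[0.0], [10.0]])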
Minimize a function using modified Powell’s method – now with KEYWORDS.
Parameters:
Returns: (xopt, {fopt, xi, direc, iter, funcalls, warnflag}, {allvecs})
Other Parameters:
- xtol : float
- Line-search error tolerance.
- ftol : float
- Relative error in func(xopt) acceptable for convergence.
- maxiter : int
- Maximum number of iterations to perform.
- maxfun : int
- Maximum number of function evaluations to make.
- full_output : bool
- If True, fopt, xi, direc, iter, funcalls, and warnflag are returned.
- disp : bool
- If True, print convergence messages.
- retall : bool
- If True, return a list of the solution at each iteration.
Notes: Uses a modification of Powell’s method to find the minimum of a function of N variables.
Compute a gaussian distribution at the points x.
p is a three- or four-component array, list, or tuple:
y = [p3 +] p0/(p1*sqrt(2pi)) * exp(-(x-p2)**2 / (2*p1**2))
p[0] – Area of the gaussian
p[1] – one-sigma dispersion
p[2] – central offset (mean location)
p[3] – optional constant, vertical offset
NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1
SEE ALSO: egaussian()
Compute a 2D gaussian distribution at the points x, y.
INPUTS:
SEE_ALSO: gaussian() (1D)
Compute a 2D elliptical gaussian distribution at the points x, y.
p is a 5-, 6-, or 7-component sequence, defined as:
p[0] – Amplitude (Area of the function)
p[1] – x-dispersion
p[2] – y-dispersion
p[3] – x-central offset
p[4] – y-central offset
p[5] – optional rotation angle (radians)
p[6] – optional constant, vertical offset
X, Y are gridded data from numpy.meshgrid()
x’ = (x - p[3]) cos p[5] - (y - p[4]) sin p[5]
y’ = (x - p[3]) sin p[5] + (y - p[4]) cos p[5]
U = (x’ / p[1])**2 + (y’ / p[2])**2
z = p[6] + p[0]/(2*pi*p[1]*p[2]) * exp(-U / 2)
SEE ALSO: gaussian2d(), lorentzian2d()
Compute an N-dimensional gaussian distribution at the position x.
mu is the length-N 1D vector of mean positions
cov is the NxN covariance matrix of the multinormal distribution.
SEE ALSO: gaussian(), gaussian2d()
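A minimal sketch of the density being described (the standard multivariate normal PDF):

import numpy as np

def gaussian_nd_sketch(x, mu, cov):
    # (2*pi)**(-N/2) * |cov|**(-1/2) * exp(-0.5 * (x-mu)^T cov^-1 (x-mu))
    x, mu = np.asarray(x, float), np.asarray(mu, float)
    diff = x - mu
    norm = np.sqrt((2 * np.pi)**mu.size * np.linalg.det(cov))
    return np.exp(-0.5 * diff.dot(np.linalg.solve(cov, diff))) / norm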
Run a Markov Chain Monte Carlo (Metropolis-Hastings algorithm) on an arbitrary function.
INPUTS:
OPTIONAL_INPUTS:
OUTPUTS:
REFERENCES: Numerical Recipes, 3rd Edition (Section 15.8); Wikipedia
NOTES: If you need an efficient MCMC algorithm, you should be using http://danfm.ca/emcee/
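The core of any Metropolis-Hastings sampler is only a few lines; a minimal random-walk sketch (illustrative, with a made-up signature; the real routine's arguments differ):

import numpy as np

def metropolis_sketch(logpost, x0, stepsize, nsteps=10000):
    # logpost: function returning the log posterior density at a parameter vector.
    x = np.array(x0, dtype=float)
    lp = logpost(x)
    chain = np.empty((nsteps, x.size))
    naccept = 0
    for i in range(nsteps):
        prop = x + stepsize * np.random.randn(x.size)     # Gaussian proposal
        lp_prop = logpost(prop)
        if np.log(np.random.rand()) < lp_prop - lp:       # accept w.p. min(1, ratio)
            x, lp = prop, lp_prop
            naccept += 1
        chain[i] = x
    return chain, naccept / float(nsteps)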
Return start and end indices for consecutive sequences of integer-valued indices.
Example:
import analysis as an
vec = range(5) + range(10,14) + range(22,39)
starts, ends = an.getblocks(vec)
print zip(starts, ends)
Get data for a specified planet.
INPUTS: (str) – planet name, e.g. “HD 189733 b”
OPTIONAL INPUTS:
EXAMPLE:
p = getobj('55cnce')
p.period ---> 2.81705
The attributes of the returned object are many and varied, and can be listed using the ‘dir’ command on the returned object.
This looks up data from the local datafile, which could be out of date.
SEE ALSO: rv()
Get phase of an orbiting planet.
If planet.transit==True, phase is based on the transit time ephemeris. If planet.transit==False, phase is based on the RV ephemeris as computed by the function rveph().
Perform gradient-based minimization of a user-specified function.
INPUTS:
RETURNS: (params, metric, n_iter)
NOTES: The program attempts to be slightly clever: if the metric decreases by <ftol on one iteration, the code iterates one more time. If the termination criterion is once again met, minimization ends; if not, minimization continues as before. For quicker, smarter routines that do much the same thing, you may want to check out the functions in the scipy.optimize package.
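In outline this is plain gradient descent with a finite-difference gradient; a minimal sketch under that assumption (the "iterate one more time" nicety described above is omitted, and parameter names here are mine):

import numpy as np

def graddescent_sketch(func, params, scale=1e-6, rate=0.01, ftol=1e-8, maxiter=2000):
    # scale: finite-difference step; rate: learning rate (both assumed names).
    params = np.array(params, dtype=float)
    metric = func(params)
    for n_iter in range(1, maxiter + 1):
        grad = np.array([(func(params + scale * e) - metric) / scale
                         for e in np.eye(params.size)])
        params -= rate * grad
        new_metric = func(params)
        delta, metric = metric - new_metric, new_metric
        if abs(delta) < ftol:
            break
    return params, metric, n_iter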
Generic polynomial evaluator.
x (1D array) – pixel values at which to evaluate C
RETP=False – Return coefficients as well as evaluated poly.
SEE ALSO: poly2cheby()
Show two images; title will only be drawn for the top one.
Load ATRAN atmospheric transmission data file.
t = loadatran(filename, wl=True)
ASCII array where the second column is wavelength and the third is the atmospheric transmission.
(This can also be a list of filenames!)
OPTIONAL INPUT:
If wl==True: returns a 2D array with columns [lambda, transmission].
If wl==False: returns a 1D Numpy array of the transmission.
Load a set of reduced spectra into a single data file.
datalist = ['file1.fits', 'file2.fits']
datapath = '~/data/'
data = loaddata(datalist, path=datapath, band=1)
The input can also be the name of an IRAF-style file list.
Compute a 2D Lorentzian distribution at the points x, y.
p is a 5-, 6-, or 7-component sequence:
- z = (x-p3) ** 2 / p1 ** 2 + (y-p4) ** 2 / p2 ** 2 [ + (x-p3) * (y-p4) * p5 ]
- lorentz = p0 / (1.0 + z) [ + p6]
p[0] – Amplitude (Area of the function)
p[1] – x-dispersion
p[2] – y-dispersion
p[3] – x-central offset
p[4] – y-central offset
p[5] – optional ellipticity parameter
p[6] – optional constant, vertical offset
SEE ALSO: gaussian2d()
Do weighted least-squares fitting.
INPUTS:
RETURNS: the tuple of (coef, coeferrs, {cov_matrix})
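The linear algebra underneath is the standard weighted normal equations; a minimal dense sketch (A: design matrix, b: data, w: weights, typically 1/sigma**2):

import numpy as np

def wlsq_sketch(A, b, w):
    # Solve (A^T W A) coef = A^T W b with W = diag(w).
    A, b, w = np.asarray(A, float), np.asarray(b, float), np.asarray(w, float)
    AtW = A.T * w                       # equivalent to A.T.dot(np.diag(w))
    cov = np.linalg.inv(AtW.dot(A))     # covariance matrix of the coefficients
    coef = cov.dot(AtW.dot(b))
    coeferrs = np.sqrt(np.diag(cov))    # one-sigma coefficient uncertainties
    return coef, coeferrs, cov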
Do weighted least-squares fitting on sparse matrices.
INPUTS:
retcov : bool – if True, also return the covariance matrix
RETURNS: the tuple of (coef, coeferrs, {cov_matrix})
SEE_ALSO:
REQUIREMENTS: SciPy’s sparse module.
Return the mean of an array after removing outliers.
INPUTS: x – (array) data set to find mean of
OPTIONAL INPUT:
nsigma – (float) number of standard deviations for clipping
niter – number of iterations
finite – if True, remove all non-finite elements (e.g. Inf, NaN)
axis – (int) axis along which to compute the mean
EXAMPLE:
from numpy import *
from numpy.random import randn
from analysis import meanr
x = concatenate((randn(200),[1000]))
print mean(x), meanr(x, nsigma=3)
x = concatenate((x,[nan,inf]))
print mean(x), meanr(x, nsigma=3)
For now, we assume FILTERSIZE is odd, and that DATA is square!
filt = medianfilter(data, filtersize)
Note that filtersize can be a scalar (e.g., 5) to equally median-filter along both axes, or a 2-vector (e.g., [5, 1]) to apply a rectangular median-filter.
This is about the slowest way to do it, but it was easy to write.
Return the median of an array after removing outliers.
INPUTS: x – (array) data set to find median of
OPTIONAL INPUT:
nsigma – (float) number of standard deviations for clipping
niter – number of iterations
finite – if True, remove all non-finite elements (e.g. Inf, NaN)
axis – (int) axis along which to compute the median
EXAMPLE:
from numpy import *
from numpy.random import randn
from analysis import medianr
x = concatenate((randn(200),[1000]))
print median(x), medianr(x, nsigma=3)
x = concatenate((x,[nan,inf]))
print median(x), medianr(x, nsigma=3)
Convert Julian Date to Modified Julian Date, or vice versa.
If date >= 2400000.5, subtract 2400000.5 (JD to MJD).
If date < 2400000.5, add 2400000.5 (MJD to JD).
Compute the sum of N gaussian distributions at the points x. The distributions have central moments defined by the vector mu.
Useful for fitting to partially blended spectral data when you have good measurements of positions (i.e., from 2D tracing).
p is a sequence of length (2N+1). If N=2:
p[0] – Area of gaussian 1
p[1] – one-sigma dispersion of gaussian 1
p[2] – Area of gaussian 2
p[3] – one-sigma dispersion of gaussian 2
... etc.
mu1 – central offset (mean location) of gaussian A
mu2 – central offset (mean location) of gaussian B
NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1
SEE ALSO: doubleGaussian(), gaussian()
Pads input matrix to size specified.
out = pad(in, npix)
out = pad(in, npix_rows, npix_cols); # alternate usage
Written by J. Green @ JPL; converted to Python by I. Crossfield
Helper function for prayerbead(). Not for general use.
Very handy planet object.
Best initialized using getobj().
REQUIREMENTS: Database file exoplanets.csv from http://exoplanets.org/
Compute total transit duration (in days) for a transiting planet.
Using Eq. 14 of J. Winn’s chapter in S. Seager’s book “Exoplanets.”
SEE ALSO: get_t23()
Compute full transit duration (in days) for a transiting planet.
Using Eq. 15 of J. Winn’s chapter in S. Seager’s book “Exoplanets.”
SEE ALSO: get_t14()
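For reference, the circular-orbit forms of Winn's Eqs. 14 and 15 are short enough to sketch (the published equations also carry an eccentricity correction factor, omitted here; parameter names are mine):

import numpy as np

def transit_durations_sketch(P, a_over_rstar, k, inc):
    # P: period [days]; a_over_rstar: a/R*; k: Rp/R*; inc: inclination [radians]
    b = a_over_rstar * np.cos(inc)      # impact parameter (circular orbit)
    sini = np.sin(inc)
    t14 = P/np.pi * np.arcsin(np.sqrt((1 + k)**2 - b**2) / (a_over_rstar * sini))
    t23 = P/np.pi * np.arcsin(np.sqrt((1 - k)**2 - b**2) / (a_over_rstar * sini))
    return t14, t23                     # total and full durations [days]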
Compute equilibrium temperature.
INPUTS:
EXAMPLE:
import analysis
planet = analysis.getobj('HD 189733 b')
planet.get_teq(0.0, 0.25) # zero albedo, full recirculation
REFERENCE:
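The usual relation behind such a method is Teq = Teff * sqrt(R*/a) * [f*(1-A)]**0.25, where A is the Bond albedo and f is a recirculation factor (1/4 for full redistribution, 2/3 for instant re-radiation). A sketch assuming that parameterization, consistent with the example call above:

import numpy as np

def teq_sketch(teff, rstar_over_a, albedo, f):
    # teff: stellar effective temperature [K]; rstar_over_a: R*/a.
    return teff * np.sqrt(rstar_over_a) * (f * (1.0 - albedo))**0.25

# e.g., HD 189733 b with zero albedo and full recirculation:
# teq_sketch(5040., 1./8.8, 0.0, 0.25) gives roughly 1200 K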
Get phase of an orbiting planet.
Refer to the function analysis.getorbitalphase() for full documentation.
SEE ALSO: analysis.getorbitalphase(), analysis.mjd()
Compute radial velocity on a planet object for given Julian Date.
EXAMPLE:
import analysis
p = analysis.getobj('HD 189733 b')
jd = 2400000.
print p.rv(jd)
Refer to the function analysis.rv() for full documentation.
SEE ALSO: analysis.rv(), analysis.mjd()
Compute the most recently elapsed RV ephemeris of a given planet at a given JD. The RV ephemeris is defined by the planet having radial velocity equal to zero.
Refer to analysis.rv() for full documentation.
SEE ALSO: analysis.getobj(), analysis.phase()
Write planet object info into a delimited line of text.
INPUTS:
planets : planet object or list thereof
filename : str
delimiter : str
Convert straight monomial coefficients to Chebyshev coefficients.
INPUT: poly coefficients (e.g., for use w/polyval)
OUTPUT: chebyt coefficients
SEE ALSO: gpolyval(); scipy.special.chebyt
Matplotlib’s polyfit with weights and sigma-clipping rejection.
DESCRIPTION: Do a best-fit polynomial of order N of y to x. Points whose fit residuals exceed s standard deviations are rejected and the fit is recalculated. The return value is a vector of polynomial coefficients [pk ... p1 p0].
OPTIONS:
fev: number of function evaluations to call before stopping
‘diag’nostic flag: return the tuple (p, chisq, n_iter)
REQUIREMENTS: CARSMath
NOTES: Iterates so long as n_newrejections>0 AND n_iter<fev.
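The clipping loop itself is simple to sketch with plain numpy (unweighted here; the real routine supports weights via CARSMath):

import numpy as np

def polyfitr_sketch(x, y, N=2, s=3.0, fev=100):
    # Fit, reject residuals beyond s standard deviations, refit until stable.
    x, y = np.asarray(x, float), np.asarray(y, float)
    good = np.ones(x.size, dtype=bool)
    for n_iter in range(fev):
        p = np.polyfit(x[good], y[good], N)
        resid = y - np.polyval(p, x)
        newgood = good & (np.abs(resid) <= s * resid[good].std())
        if newgood.sum() == good.sum():     # no new rejections
            break
        good = newgood
    return p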
Generic function to perform Prayer-Bead (residual permutation) analysis.
INPUTS:
OR:
OPTIONAL INPUTS:
EXAMPLE: TBW
REQUIREMENTS: kapteyn, Planetary phase curve routines, numpy
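In outline, residual permutation works as below (schematic; fitfunc and modelfunc are placeholder names, not the routine's real arguments):

import numpy as np

def prayerbead_sketch(fitfunc, modelfunc, guess, x, y):
    # fitfunc(guess, x, y) -> best-fit params; modelfunc(params, x) -> model values.
    bestparams = fitfunc(guess, x, y)
    resid = y - modelfunc(bestparams, x)
    allparams = []
    for shift in range(resid.size):
        # Cyclically permute the residuals, add them back to the model, refit:
        yshift = modelfunc(bestparams, x) + np.roll(resid, shift)
        allparams.append(fitfunc(bestparams, x, yshift))
    return np.array(allparams)   # spread of each column ~ parameter uncertainty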
Strip outliers from a dataset, iterating until converged.
INPUT: data – 1D numpy array. data from which to remove outliers.
OPTIONAL INPUTS:
EXAMPLE:
from numpy import linspace
from numpy.random import randn
from pylab import hist
from analysis import removeoutliers
data = randn(1000)
hbins = linspace(-5,5,50)
d2 = removeoutliers(data, 1.5, niter=1)
hist(data, hbins)
hist(d2, hbins)
Return 2-tuples that are the indices of separate sections, as indicated by breaks in a continuous and always-increasing time series.
INPUTS:
EXAMPLE:
import numpy as np
import transit
# Simulate a time series with 30-minute sampling:
t1 = np.arange(0, 3.7, 0.5/24)
t2 = np.arange(5, 70, 0.5/24)
t3 = np.arange(70.2, 85, 0.5/24)
days = np.concatenate((t1, t2, t3))
ret = transit.returnSections(days, dtmax=0.1)
# If each segment was correctly identified, these print 'True':
print (t1==days[ret[0][0]:ret[0][1]+1]).all()
print (t2==days[ret[1][0]:ret[1][1]+1]).all()
print (t3==days[ret[2][0]:ret[2][1]+1]).all()
Compute unprojected astrocentric RV of a planet for a given JD in m/s.
INPUTS:
EXAMPLE:
jd = 2454693  # date: 2008/08/14
p = getobj('55 Cnc e')  # planet: 55 Cnc e
vp = rv(p, jd)
# returns vp ~ 1.47e5 [m/s]
The result will need to be multiplied by the sine of the inclination angle (i.e., “sin i”). Positive radial velocities are directed _AWAY_ from the observer.
To compute the barycentric radial velocity of the host star, scale the returned values by the mass ratio -m/(m+M).
Compute the most recently elapsed RV ephemeris of a given planet at a given JD. The RV ephemeris is defined by the planet having radial velocity equal to zero.
EXAMPLE:
from analysis import getobj, rveph
jd = 2454693  # date: 2008/08/14
p = getobj('55cnce')  # planet: 55 Cnc e
t = rveph(p, jd)
# returns t ~
SEE ALSO: getobj(), phase()
Compute radial velocity of a star which has an orbiting planet.
INPUTS:
EXAMPLE:
Positive radial velocities are directed _AWAY_ from the observer.
SEE_ALSO:
Run generic_mcmc() and scale the input stepsize to match the desired input acceptance rate.
INPUTS: mostly the same as for generic_mcmc(), but also with:
REQUIREMENTS: pylab (for pylab.interp())
for the specified data array/vector along the specified axis.
‘nsigma’ is used to reject outliers.
Output will be a scalar (axis is None) or numpy array, as appropriate.
Spline fit with sigma-clipping rejection, analogous to polyfitr().
Points whose fit residuals exceed s standard deviations are rejected and the fit is recalculated. The return value is a spline object.
Iterates so long as n_newrejections>0 AND n_iter<fev.
OPTIONAL INPUTS:
err: a set of errors for the data
fev: number of function evaluations to call before stopping
‘diag’nostic flag: return the tuple (p, chisq, n_iter)
clip: ‘both’ – remove outliers +/- ‘s’ sigma from fit
Compute the standard deviation in a sliding window.
INPUTS:
For now, we assume FILTERSIZE is odd, and that DATA is square!
filt = stdfilt2d(data, filtersize)
This is about the slowest way to do it, but it was easy to write.
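For the 1D case the brute-force idea reduces to this sketch (edge windows are simply truncated; the real routine's edge handling may differ):

import numpy as np

def stdfilt_sketch(data, filtersize):
    # Standard deviation in a sliding window of odd width 'filtersize'.
    data = np.asarray(data, dtype=float)
    half = filtersize // 2
    out = np.zeros(data.size)
    for i in range(data.size):
        lo, hi = max(0, i - half), min(data.size, i + half + 1)
        out[i] = data[lo:hi].std()
    return out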
Return the standard deviation of an array after removing outliers.
INPUTS: x – (array) data set to find std of
OPTIONAL INPUT:
nsigma – (float) number of standard deviations for clipping
niter – number of iterations
finite – if True, remove all non-finite elements (e.g. Inf, NaN)
axis – (int) axis along which to compute the standard deviation
EXAMPLE:
from numpy import *
from numpy.random import randn
from analysis import stdr
x = concatenate((randn(200),[1000]))
print std(x), stdr(x, nsigma=3)
x = concatenate((x,[nan,inf]))
print std(x), stdr(x, nsigma=3)
Compute the standard deviation in the residuals of a data series after average-binning by specified amounts.
INPUTS:
dataindex – 1D numpy array
REQUIREMENTS: Plotting and analysis tools, numpy
EXAMPLE:
import numpy as np
import analysis as an
import pylab as py
npts = 10000
t = np.arange(npts)
data = np.random.normal(size=npts)
binfactors = np.arange(1, npts/2.+1)
bindown_result = an.stdres(data, binfactors, dataindex=t, oversamp=1)
py.figure()
py.subplot(211)
py.plot(t, data, 'k')
py.xlabel('Time')
py.ylabel('Data value')
py.minorticks_on()
py.title('Bin-down test: Gaussian Noise')
py.subplot(212)
py.loglog(binfactors, bindown_result, '-b', linewidth=2)
py.loglog(binfactors, data.std()/np.sqrt(binfactors), '--r')
py.xlabel('Binning factor')
py.ylabel('RMS of binned data')
py.legend(['Binned RMS', '1/sqrt(N)'])
Test various methods of computing the eccentric anomaly.
ecc = scalar; manom = 1D NumPy array
from numpy import pi, linspace
import analysis as an
ecc = 0.15
p_orb = 3.3
mean_anom = 2*pi*linspace(0, p_orb, 10000)/p_orb
an.test_eccentric_anomaly(ecc, mean_anom, tol=1e-10)
Use Singular Value Decomposition to determine the Total Least Squares linear fit to the data (e.g., http://en.wikipedia.org/wiki/Total_least_squares).
data1 – x array
data2 – y array
print – print some information about what fraction of the variance is accounted for
ignore_nans – remove NaN values from BOTH arrays before computing
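The SVD recipe itself is short; a minimal sketch for the straight-line case (slope/intercept naming is mine, and the variance bookkeeping is only schematic):

import numpy as np

def tls_sketch(data1, data2):
    # Total-least-squares line through (data1, data2) via SVD.
    x, y = np.asarray(data1, float), np.asarray(data2, float)
    xm, ym = x.mean(), y.mean()
    M = np.column_stack((x - xm, y - ym))
    _, s, Vt = np.linalg.svd(M, full_matrices=False)
    a, b = Vt[-1]                         # normal vector of the best-fit line
    slope = -a / b
    intercept = ym - slope * xm
    frac_var = 1.0 - s[-1]**2 / (s**2).sum()   # fraction of variance accounted for
    return slope, intercept, frac_var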
Generate a line of text for Travis Barman’s planet table.
INPUT: a planet object from getobj().
Calculate (Keplerian, orbital) true anomaly.
One optional input must be given.
INPUT: ecc – scalar. orbital eccentricity.
OPTIONAL_INPUTS:
Perform a weighted mean along the specified axis.
INPUTS:
SEE ALSO:
Compute the (weighted) mean in a sliding window.
INPUTS:
Perform a weighted standard deviation along the specified axis. If axis=None, then the weighted standard deviation of the entire array is computed.
Note that this computes the _sample_ standard deviation; Numpy/Scipy compute the _population_ standard deviation by default, which is smaller by a factor of sqrt((N-1)/N). This effect is small for large datasets.
SEE ALSO: wmean()
Taken from http://en.wikipedia.org/wiki/Weighted_standard_deviation
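Following that reference, a minimal sketch of the weighted sample standard deviation (axis=None case only):

import numpy as np

def wstd_sketch(x, w):
    # Reduces to the ordinary sample std (ddof=1) when all weights are equal.
    x, w = np.asarray(x, float), np.asarray(w, float)
    wmean = (w * x).sum() / w.sum()
    var = (w * (x - wmean)**2).sum() * w.sum() / (w.sum()**2 - (w**2).sum())
    return np.sqrt(var)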