Contents:
My module for various data analysis tasks.
REQUIREMENTS: numpy, Plotting and analysis tools (for errxy())
2008-07-25 16:20 IJC: Created.
2010-02-18 14:06 IJC: Added medianfilter()
2010-08-03 15:38 IJC: Merged versions.
2011-04-14 09:48 IJMC: Added a few fundamental constants.
Compute the Allan variance on a set of regularly-sampled data (1D).
If the time between samples is dt and there are N total samples, the returned variance spectrum will have frequency indices from 1/dt to (N-1)/dt.
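For orientation, a minimal sketch of one standard way to compute such a spectrum (illustrative only; the routine's exact binning scheme is not shown here): average the data in non-overlapping blocks of n samples, then take half the mean squared difference of successive block means.

import numpy as np

def allan_variance_sketch(data, dt=1.0):
    # Illustrative non-overlapping Allan variance, not the module's actual code.
    data = np.asarray(data, dtype=float)
    N = data.size
    sizes = np.arange(1, N // 2 + 1)
    avar = np.zeros(sizes.size)
    for i, n in enumerate(sizes):
        nblocks = N // n
        means = data[:nblocks * n].reshape(nblocks, n).mean(axis=1)
        avar[i] = 0.5 * np.mean(np.diff(means)**2)   # variance at tau = n*dt
    return sizes * dt, avar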
Median the array over the given axis. If the axis is None, median over all dimensions of the array.
Think of this as normal Numpy median, but preserving dimensionality.
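In modern NumPy the same dimensionality-preserving behavior is available via the keepdims flag, which makes a handy cross-check (np.median here merely illustrates the idea):

import numpy as np
a = np.arange(12).reshape(3, 4)
np.median(a, axis=0)                  # shape (4,): dimensionality reduced
np.median(a, axis=0, keepdims=True)   # shape (1, 4): dimensionality preserved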
Downsample a 2D image.
Takes a 1D vector or 2D array and reduces its resolution by an integer factor “ndown”. This is done by binning the array – i.e., integrating over square blocks of pixels of width “ndown”.
If keyword “axis” is None, bin over all axes. Otherwise, bin over the single specified axis.
Note that ‘ndown’ can also be a sequence: e.g., [2, 1]
EXAMPLE: [img_ds] = binarray(img, ndown)
Convert Chebyshev coefficients to ‘normal’ polyval coefficients.
INPUT: chebyt coefficients
OUTPUT: poly coefficients (e.g., for use w/polyval)
SEE ALSO: poly2cheby(), gpolyval(); scipy.special.chebyt
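NumPy's polynomial package now provides an equivalent conversion, which makes a quick sanity check (assuming low-order-first coefficient ordering; note that the legacy np.polyval expects the reversed order, p[::-1]):

import numpy as np
from numpy.polynomial import chebyshev as C
from numpy.polynomial import polynomial as P
c = [1.0, 2.0, 3.0]       # Chebyshev-series coefficients
p = C.cheb2poly(c)        # ordinary power-series coefficients
x = np.linspace(-1, 1, 5)
np.allclose(C.chebval(x, c), P.polyval(x, p))   # True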
Return the confidence level of a 2D histogram or array that encloses the specified fraction of the total sum.
INPUTS:
OPTIONS:
SEE_ALSO: dumbconf() for 1D distributions
Generate combined time series spectra using planet and star models, planet and star RV profiles.
D = dopspec(sspec, pspec, sRV, pRV, disp, sphase=[], pphase=[])
INPUTS:
sRV, pRV: star, planet radial velocities in m/s
disp: constant logarithmic dispersion of the wavelength scale; that is, the spectra must have [lambda_i / lambda_(i-1)] = disp = constant > 1
OPTIONAL INPUTS:
wlscale: return relative wavelength scale for new data
Positive velocities are directed AWAY from the observer.
Compute the sum of two gaussian distributions at the points x.
p is a six- or seven-component sequence:
p[0] – Area of gaussian A
p[1] – one-sigma dispersion of gaussian A
p[2] – central offset (mean location) of gaussian A
p[3] – Area of gaussian B
p[4] – one-sigma dispersion of gaussian B
p[5] – central offset (mean location) of gaussian B
p[6] – optional constant, vertical offset
NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1
SEE ALSO: gaussian()
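A minimal sketch of evaluating such a model under the parameter layout above (illustrative; written to match the gaussian() formula quoted later in this file):

import numpy as np

def double_gaussian_sketch(p, x):
    # p: [areaA, sigmaA, muA, areaB, sigmaB, muB, (optional offset)]
    x = np.asarray(x, dtype=float)
    gA = p[0] / (p[1] * np.sqrt(2 * np.pi)) * np.exp(-(x - p[2])**2 / (2 * p[1]**2))
    gB = p[3] / (p[4] * np.sqrt(2 * np.pi)) * np.exp(-(x - p[5])**2 / (2 * p[4]**2))
    offset = p[6] if len(p) > 6 else 0.0
    return gA + gB + offset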
Compute the sum of two gaussian distributions at the points x. The distributions have central moments mu1 and mu2.
Useful for fitting to partially blended spectral data.
p is a four- or five-component sequence:
p[0] – Area of gaussian A
p[1] – one-sigma dispersion of gaussian A
p[2] – Area of gaussian B
p[3] – one-sigma dispersion of gaussian B
p[4] – optional constant, vertical offset
mu1 – central offset (mean location) of gaussian A
mu2 – central offset (mean location) of gaussian B
NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1
SEE ALSO: doubleGaussian(), gaussian()
Determine two-sided and one-sided confidence limits, using sorting.
INPUTS:
OPTIONAL INPUTS:
type=’central’ – ‘upper’, ‘lower’, or ‘central’ confidence limits
mid=’mean’ – compute middle with mean or median
SEE_ALSO: confmap() for 2D distributions
EXAMPLES:
from numpy import random
from analysis import dumbconf
x = random.randn(10000)
dumbconf(x, 0.683) # ---> 1.0 (one-sigma)
dumbconf(3*x, 0.954) # ---> 6.0 (two-sigma)
dumbconf(x+2, 0.997, type='lower') # ---> -0.74
dumbconf(x+2, 0.997, type='upper') # ---> 4.7
Some typical values for a Normal (Gaussian) distribution:
type        confidence level
one-sigma   0.6826895
2 sigma     0.9544997
3 sigma     0.9973002
4 sigma     0.9999366
5 sigma     0.9999994
Calculate (Keplerian, orbital) eccentric anomaly.
One optional input must be given.
INPUT: ecc – scalar. orbital eccentricity.
OPTIONAL_INPUTS:
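For context, the task is to solve Kepler's equation M = E - e*sin(E) for the eccentric anomaly E. A minimal Newton-Raphson sketch (one common method; not necessarily the one used here):

import numpy as np

def kepler_newton_sketch(manom, ecc, tol=1e-12, maxiter=100):
    # manom: mean anomaly [radians]; returns eccentric anomaly E [radians].
    manom = np.asarray(manom, dtype=float)
    E = manom.copy()                       # E = M is a good starting guess
    for _ in range(maxiter):
        dE = (E - ecc * np.sin(E) - manom) / (1.0 - ecc * np.cos(E))
        E -= dE
        if np.all(np.abs(dE) < tol):
            break
    return E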
Compute the deviation between the values y and the gaussian defined by p, x:
p is a three- or four-component array, list, or tuple.
Returns: y - p3 - p0/(p1*sqrt(2pi)) * exp(-(x-p2)**2 / (2*p1**2))
If an error array e (typically one-sigma) is supplied, the returned value is divided by e.
SEE ALSO: gaussian()
Return the error associated with a 2D gaussian fit, using gaussian2d.
w is an array of weights, typically 1./sigma**2
Calculates the dropoff in measurement uncertainty with increasing number of samples (a random and uncorrelated set of data should drop off as 1/sqrt(N)).
E(0), the first returned element, is the uncertainty in a single measurement (i.e., the standard deviation).
EXAMPLE: Compute the errors on an array of 3 column vectors.
from pylab import randn, plot, xlim, legend, sqrt
data = randn(1000,3)
e = error_dropoff(data)
plot(e[1], 1./sqrt(e[1]), '-', e[1], e[0], 'x')
xlim([0,30])
legend(['Theoretical [1/sqrt(N)]', 'Computed errors'])
Apply a hard-edged low-pass filter to an input vector.
INPUTS:
vec – sequence – 1D vector, assumed to be evenly sampled
retfilter – bool – if True, return the tuple (filtered_vec, filter)
OUTPUT: Low-pass-filtered version of vec
NOTE: Assumes the input is real-valued.
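A minimal sketch of a hard-edged (“brick-wall”) low-pass for a real-valued, evenly sampled vector; the cutoff convention (fraction of Nyquist) is an assumption here:

import numpy as np

def lopass_sketch(vec, cutoff_frac=0.1, retfilter=False):
    # cutoff_frac: cutoff frequency as a fraction of Nyquist (assumed convention).
    vec = np.asarray(vec, dtype=float)
    freqs = np.fft.rfftfreq(vec.size)                   # 0 .. 0.5 cycles/sample
    filt = (freqs <= cutoff_frac * 0.5).astype(float)   # hard edge
    filtered = np.fft.irfft(np.fft.rfft(vec) * filt, n=vec.size)
    return (filtered, filt) if retfilter else filtered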
Fix non-finite values in a numpy array, and replace them with repval.
INPUT:
arr – numpy array with values to be replaced
repval – value to replace non-finite elements with
OPTIONAL INPUT:
retarr – if False, change values in arr directly (more efficient); if True, return a fixed copy of the input array, which is left unchanged
Example: fixval(arr, -1)
Minimize a function using the downhill simplex algorithm – now with KEYWORDS.
Parameters:
Returns: (xopt, {fopt, iter, funcalls, warnflag})
Other Parameters:
- xtol : float
- Relative error in xopt acceptable for convergence.
- ftol : number
- Relative error in func(xopt) acceptable for convergence.
- maxiter : int
- Maximum number of iterations to perform.
- maxfun : number
- Maximum number of function evaluations to make [200*len(x0)]
- full_output : bool
- Set to True if fval and warnflag outputs are desired.
- disp : bool
- Set to True to print convergence messages.
- retall : bool
- Set to True to return list of solutions at each iteration.
- zdelt : number
- Initial step size used for components of x0 that equal zero.
- nonzdelt : number
- Fractional initial step size used for nonzero components of x0.
- holdfixed : sequence
- Indices of x0 to hold fixed (e.g., [1, 2, 4])
TBD: gprior : tuple or sequence of tuples. Set a gaussian prior on the indicated parameter, such that chisq += ((x0[p] - val)/unc_val)**2, where the parameters are defined by the tuple gprior=(param, val, unc_val)
Notes: Uses a Nelder-Mead simplex algorithm to find the minimum of a function of one or more variables.
Allows me to wrap fmin() within pool.map() for multiprocessing.
EXAMPLE:
NOTES: This must be a separate, stand-alone function in order to be ‘pickleable’, which is required by pool.map().
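The pattern looks roughly like this (a hypothetical sketch; note that the objective must also be a module-level function, since pool.map() pickles its arguments too):

import multiprocessing
from scipy.optimize import fmin

def parabola(x):
    # Objective defined at module level so it is picklable (no lambdas).
    return (x[0] - 3.0)**2

def fmin_helper_sketch(guess):
    # Stand-alone wrapper; this is what pool.map() calls.
    return fmin(parabola, guess, disp=False)

if __name__ == '__main__':
    pool = multiprocessing.Pool(2)
    results = pool.map(fmin_helper_sketch, [[0.0], [10.0]])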
Minimize a function using modified Powell’s method – now with KEYWORDS.
Parameters:
Returns: (xopt, {fopt, xi, direc, iter, funcalls, warnflag}, {allvecs})
Other Parameters:
- xtol : float
- Line-search error tolerance.
- ftol : float
- Relative error in func(xopt) acceptable for convergence.
- maxiter : int
- Maximum number of iterations to perform.
- maxfun : int
- Maximum number of function evaluations to make.
- full_output : bool
- If True, fopt, xi, direc, iter, funcalls, and warnflag are returned.
- disp : bool
- If True, print convergence messages.
- retall : bool
- If True, return a list of the solution at each iteration.
Notes: Uses a modification of Powell’s method to find the minimum of a function of N variables.
Compute a gaussian distribution at the points x.
p is a three- or four-component array, list, or tuple:
y = [p3 +] p0/(p1*sqrt(2pi)) * exp(-(x-p2)**2 / (2*p1**2))
p[0] – Area of the gaussian
p[1] – one-sigma dispersion
p[2] – central offset (mean location)
p[3] – optional constant, vertical offset
NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1
SEE ALSO: egaussian()
Compute a 2D gaussian distribution at the points x, y.
INPUTS:
SEE_ALSO: gaussian() (1D)
Compute a 2D elliptical gaussian distribution at the points x, y.
p is a 5-, 6-, or 7-component sequence, defined as:
p[0] – Amplitude (Area of the function)
p[1] – x-dispersion
p[2] – y-dispersion
p[3] – x-central offset
p[4] – y-central offset
p[5] – optional rotation angle (radians)
p[6] – optional constant, vertical offset
X, Y are gridded data from numpy.meshgrid()
x’ = (x - p[3]) cos p[5] - (y - p[4]) sin p[5]
y’ = (x - p[3]) sin p[5] + (y - p[4]) cos p[5]
U = (x’ / p[1])**2 + (y’ / p[2])**2
z = p[6] + p[0]/(2*pi*p[1]*p[2]) * exp(-U / 2)
SEE ALSO: gaussian2d(), lorentzian2d()
Compute an N-dimensional gaussian distribution at the position x.
mu is the length-N 1D vector of mean positions
cov is the NxN covariance matrix of the multinormal distribution.
SEE ALSO: gaussian(), gaussian2d()
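A minimal sketch of the density being described (the standard multivariate normal PDF):

import numpy as np

def gaussian_nd_sketch(x, mu, cov):
    # (2*pi)**(-N/2) * |cov|**(-1/2) * exp(-0.5 * (x-mu)^T cov^-1 (x-mu))
    x, mu = np.asarray(x, float), np.asarray(mu, float)
    diff = x - mu
    norm = np.sqrt((2 * np.pi)**mu.size * np.linalg.det(cov))
    return np.exp(-0.5 * diff.dot(np.linalg.solve(cov, diff))) / norm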
Run a Markov Chain Monte Carlo (Metropolis-Hastings algorithm) on an arbitrary function.
INPUTS:
OPTIONAL_INPUTS:
OUTPUTS:
REFERENCES: Numerical Recipes, 3rd Edition (Section 15.8); Wikipedia
NOTES: If you need an efficient MCMC algorithm, you should be using http://danfm.ca/emcee/
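The core of any Metropolis-Hastings sampler is only a few lines; a minimal random-walk sketch (illustrative, with a made-up signature; the real routine's arguments differ):

import numpy as np

def metropolis_sketch(logpost, x0, stepsize, nsteps=10000):
    # logpost: function returning the log posterior density at a parameter vector.
    x = np.array(x0, dtype=float)
    lp = logpost(x)
    chain = np.empty((nsteps, x.size))
    naccept = 0
    for i in range(nsteps):
        prop = x + stepsize * np.random.randn(x.size)     # Gaussian proposal
        lp_prop = logpost(prop)
        if np.log(np.random.rand()) < lp_prop - lp:       # accept w.p. min(1, ratio)
            x, lp = prop, lp_prop
            naccept += 1
        chain[i] = x
    return chain, naccept / float(nsteps)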
Return start and end indices for consecutive sequences of integer-valued indices.
Example:
import analysis as an
vec = range(5) + range(10,14) + range(22,39)
starts, ends = an.getblocks(vec)
print zip(starts, ends)
Get data for a specified planet.
INPUTS: (str) – planet name, e.g. “HD 189733 b”
OPTIONAL INPUTS:
EXAMPLE:
p = getobj('55cnce')
p.period ---> 2.81705
The attributes of the returned object are many and varied, and can be listed using the ‘dir’ command on the returned object.
This looks up data from the local datafile, which could be out of date.
SEE ALSO: rv()
Get phase of an orbiting planet.
If planet.transit==True, phase is based on the transit time ephemeris. If planet.transit==False, phase is based on the RV ephemeris as computed by the function rveph().
Perform gradient-based minimization of a user-specified function.
INPUTS:
RETURNS: (params, metric, n_iter)
NOTES: The program attempts to be slightly clever: if the metric decreases by <ftol on one iteration, the code iterates one more time. If the termination criterion is once again met, minimization ends; if not, minimization continues as before. For quicker, smarter routines that do much the same thing, you may want to check out the functions in the scipy.optimize package.
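In outline this is plain gradient descent with a finite-difference gradient; a minimal sketch under that assumption (the "iterate one more time" nicety described above is omitted, and parameter names here are mine):

import numpy as np

def graddescent_sketch(func, params, scale=1e-6, rate=0.01, ftol=1e-8, maxiter=2000):
    # scale: finite-difference step; rate: learning rate (both assumed names).
    params = np.array(params, dtype=float)
    metric = func(params)
    for n_iter in range(1, maxiter + 1):
        grad = np.array([(func(params + scale * e) - metric) / scale
                         for e in np.eye(params.size)])
        params -= rate * grad
        new_metric = func(params)
        delta, metric = metric - new_metric, new_metric
        if abs(delta) < ftol:
            break
    return params, metric, n_iter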
Generic polynomial evaluator.
x (1D array) – pixel values at which to evaluate C
RETP=False – Return coefficients as well as evaluated poly.
SEE ALSO: poly2cheby()
Show two images; title will only be drawn for the top one.
Load ATRAN atmospheric transmission data file.
t = loadatran(filename, wl=True)
ASCII array where the second column is wavelength and the third is the atmospheric transmission.
(This can also be a list of filenames!)
OPTIONAL INPUT:
If wl==True: returns a 2D array with columns [lambda, transmission].
If wl==False: returns a 1D Numpy array of the transmission.
Load a set of reduced spectra into a single data file.
datalist = ['file1.fits', 'file2.fits']
datapath = '~/data/'
data = loaddata(datalist, path=datapath, band=1)
The input can also be the name of an IRAF-style file list.
Compute a 2D Lorentzian distribution at the points x, y.
p is a 5-, 6-, or 7-component sequence:
- z = (x-p3) ** 2 / p1 ** 2 + (y-p4) ** 2 / p2 ** 2 [ + (x-p3) * (y-p4) * p5 ]
- lorentz = p0 / (1.0 + z) [ + p6]
p[0] – Amplitude (Area of the function)
p[1] – x-dispersion
p[2] – y-dispersion
p[3] – x-central offset
p[4] – y-central offset
p[5] – optional ellipticity parameter
p[6] – optional constant, vertical offset
SEE ALSO: gaussian2d()
Do weighted least-squares fitting.
INPUTS:
RETURNS: the tuple of (coef, coeferrs, {cov_matrix})
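The linear algebra underneath is the standard weighted normal equations; a minimal dense sketch (A: design matrix, b: data, w: weights, typically 1/sigma**2):

import numpy as np

def wlsq_sketch(A, b, w):
    # Solve (A^T W A) coef = A^T W b with W = diag(w).
    A, b, w = np.asarray(A, float), np.asarray(b, float), np.asarray(w, float)
    AtW = A.T * w                       # equivalent to A.T.dot(np.diag(w))
    cov = np.linalg.inv(AtW.dot(A))     # covariance matrix of the coefficients
    coef = cov.dot(AtW.dot(b))
    coeferrs = np.sqrt(np.diag(cov))    # one-sigma coefficient uncertainties
    return coef, coeferrs, cov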
Do weighted least-squares fitting on sparse matrices.
INPUTS:
retcov : bool – if True, also return the covariance matrix
RETURNS: the tuple of (coef, coeferrs, {cov_matrix})
SEE_ALSO:
REQUIREMENTS: SciPy’s sparse module.
Return the mean of an array after removing outliers.
INPUTS: x – (array) data set to find mean of
OPTIONAL INPUT:
nsigma – (float) number of standard deviations for clipping
niter – number of iterations
finite – if True, remove all non-finite elements (e.g. Inf, NaN)
axis – (int) axis along which to compute the mean
EXAMPLE:
from numpy import *
from numpy.random import randn
from analysis import meanr
x = concatenate((randn(200),[1000]))
print mean(x), meanr(x, nsigma=3)
x = concatenate((x,[nan,inf]))
print mean(x), meanr(x, nsigma=3)
For now, we assume FILTERSIZE is odd, and that DATA is square!
filt = medianfilter(data, filtersize)
Note that filtersize can be a scalar (e.g., 5) to equally median-filter along both axes, or a 2-vector (e.g., [5, 1]) to apply a rectangular median-filter.
This is about the slowest way to do it, but it was easy to write.
Return the median of an array after removing outliers.
INPUTS: x – (array) data set to find median of
OPTIONAL INPUT:
nsigma – (float) number of standard deviations for clipping
niter – number of iterations
finite – if True, remove all non-finite elements (e.g. Inf, NaN)
axis – (int) axis along which to compute the median
EXAMPLE:
from numpy import *
from numpy.random import randn
from analysis import medianr
x = concatenate((randn(200),[1000]))
print median(x), medianr(x, nsigma=3)
x = concatenate((x,[nan,inf]))
print median(x), medianr(x, nsigma=3)
Convert Julian Date to Modified Julian Date, or vice versa.
If date >= 2400000.5, subtract 2400000.5 (JD to MJD).
If date < 2400000.5, add 2400000.5 (MJD to JD).
Compute the sum of N gaussian distributions at the points x. The distributions have central moments defined by the vector mu.
Useful for fitting to partially blended spectral data when you have good measurements of positions (i.e., from 2D tracing).
p is a sequence of length (2N+1). If N=2:
p[0] – Area of gaussian 1
p[1] – one-sigma dispersion of gaussian 1
p[2] – Area of gaussian 2
p[3] – one-sigma dispersion of gaussian 2
... etc.
mu1 – central offset (mean location) of gaussian A
mu2 – central offset (mean location) of gaussian B
NOTE: FWHM = 2*sqrt(2*ln(2)) * p1 ~ 2.3548*p1
SEE ALSO: doubleGaussian(), gaussian()
Pads input matrix to size specified.
out = pad(in, npix)
out = pad(in, npix_rows, npix_cols); # alternate usage
Written by J. Green @ JPL; converted to Python by I. Crossfield
Helper function for prayerbead(). Not for general use.
Very handy planet object.
Best initialized using getobj().
REQUIREMENTS: Database file exoplanets.csv from http://exoplanets.org/
Compute total transit duration (in days) for a transiting planet.
Using Eq. 14 of J. Winn’s chapter in S. Seager’s book “Exoplanets.”
SEE ALSO: get_t23()
Compute full transit duration (in days) for a transiting planet.
Using Eq. 15 of J. Winn’s chapter in S. Seager’s book “Exoplanets.”
SEE ALSO: get_t14()
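For reference, the circular-orbit forms of Winn's Eqs. 14 and 15 are short enough to sketch (the published equations also carry an eccentricity correction factor, omitted here; parameter names are mine):

import numpy as np

def transit_durations_sketch(P, a_over_rstar, k, inc):
    # P: period [days]; a_over_rstar: a/R*; k: Rp/R*; inc: inclination [radians]
    b = a_over_rstar * np.cos(inc)      # impact parameter (circular orbit)
    sini = np.sin(inc)
    t14 = P/np.pi * np.arcsin(np.sqrt((1 + k)**2 - b**2) / (a_over_rstar * sini))
    t23 = P/np.pi * np.arcsin(np.sqrt((1 - k)**2 - b**2) / (a_over_rstar * sini))
    return t14, t23                     # total and full durations [days]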
Compute equilibrium temperature.
INPUTS:
EXAMPLE:
import analysis
planet = analysis.getobj('HD 189733 b')
planet.get_teq(0.0, 0.25) # zero albedo, full recirculation
REFERENCE:
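The usual relation behind such a method is Teq = Teff * sqrt(R*/a) * [f*(1-A)]**0.25, where A is the Bond albedo and f is a recirculation factor (1/4 for full redistribution, 2/3 for instant re-radiation). A sketch assuming that parameterization, consistent with the example call above:

import numpy as np

def teq_sketch(teff, rstar_over_a, albedo, f):
    # teff: stellar effective temperature [K]; rstar_over_a: R*/a.
    return teff * np.sqrt(rstar_over_a) * (f * (1.0 - albedo))**0.25

# e.g., HD 189733 b with zero albedo and full recirculation:
# teq_sketch(5040., 1./8.8, 0.0, 0.25) gives roughly 1200 K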
Get phase of an orbiting planet.
Refer to the function analysis.getorbitalphase() for full documentation.
SEE ALSO: analysis.getorbitalphase(), analysis.mjd()
Compute radial velocity on a planet object for given Julian Date.
EXAMPLE:
import analysis
p = analysis.getobj('HD 189733 b')
jd = 2400000.
print p.rv(jd)
Refer to the function analysis.rv() for full documentation.
SEE ALSO: analysis.rv(), analysis.mjd()
Compute the most recently elapsed RV ephemeris of a given planet at a given JD. The RV ephemeris is defined by the planet having radial velocity equal to zero.
Refer to analysis.rv() for full documentation.
SEE ALSO: analysis.getobj(), analysis.phase()
Write planet object info into a delimited line of text.
INPUTS:
planets : planet object or list thereof
filename : str
delimiter : str
Convert straight monomial coefficients to Chebyshev coefficients.
INPUT: poly coefficients (e.g., for use w/polyval)
OUTPUT: chebyt coefficients
SEE ALSO: gpolyval(); scipy.special.chebyt
Matplotlib’s polyfit with weights and sigma-clipping rejection.
DESCRIPTION: Do a best-fit polynomial of order N of y to x. Points whose fit residuals exceed s standard deviations are rejected and the fit is recalculated. The return value is a vector of polynomial coefficients [pk ... p1 p0].
OPTIONS:
fev: number of function evaluations to call before stopping
‘diag’nostic flag: return the tuple (p, chisq, n_iter)
REQUIREMENTS: CARSMath
NOTES: Iterates so long as n_newrejections>0 AND n_iter<fev.
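The clipping loop itself is simple to sketch with plain numpy (unweighted here; the real routine supports weights via CARSMath):

import numpy as np

def polyfitr_sketch(x, y, N=2, s=3.0, fev=100):
    # Fit, reject residuals beyond s standard deviations, refit until stable.
    x, y = np.asarray(x, float), np.asarray(y, float)
    good = np.ones(x.size, dtype=bool)
    for n_iter in range(fev):
        p = np.polyfit(x[good], y[good], N)
        resid = y - np.polyval(p, x)
        newgood = good & (np.abs(resid) <= s * resid[good].std())
        if newgood.sum() == good.sum():     # no new rejections
            break
        good = newgood
    return p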
Generic function to perform Prayer-Bead (residual permutation) analysis.
INPUTS:
OR:
OPTIONAL INPUTS:
EXAMPLE: TBW
REQUIREMENTS: kapteyn, Planetary phase curve routines, numpy
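In outline, residual permutation works as below (schematic; fitfunc and modelfunc are placeholder names, not the routine's real arguments):

import numpy as np

def prayerbead_sketch(fitfunc, modelfunc, guess, x, y):
    # fitfunc(guess, x, y) -> best-fit params; modelfunc(params, x) -> model values.
    bestparams = fitfunc(guess, x, y)
    resid = y - modelfunc(bestparams, x)
    allparams = []
    for shift in range(resid.size):
        # Cyclically permute the residuals, add them back to the model, refit:
        yshift = modelfunc(bestparams, x) + np.roll(resid, shift)
        allparams.append(fitfunc(bestparams, x, yshift))
    return np.array(allparams)   # spread of each column ~ parameter uncertainty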
Strip outliers from a dataset, iterating until converged.
INPUT: data – 1D numpy array. data from which to remove outliers.
OPTIONAL INPUTS:
EXAMPLE:
from numpy import linspace
from numpy.random import randn
from pylab import hist
from analysis import removeoutliers
data = randn(1000)
hbins = linspace(-5,5,50)
d2 = removeoutliers(data, 1.5, niter=1)
hist(data, hbins)
hist(d2, hbins)
Return 2-tuples that are the indices of separate sections, as indicated by breaks in a continuous and always-increasing time series.
INPUTS:
EXAMPLE:
import numpy as np
import transit
# Simulate a time series with 30-minute sampling:
t1 = np.arange(0, 3.7, 0.5/24)
t2 = np.arange(5, 70, 0.5/24)
t3 = np.arange(70.2, 85, 0.5/24)
days = np.concatenate((t1, t2, t3))
ret = transit.returnSections(days, dtmax=0.1)
# If each segment was correctly identified, these print 'True':
print (t1==days[ret[0][0]:ret[0][1]+1]).all()
print (t2==days[ret[1][0]:ret[1][1]+1]).all()
print (t3==days[ret[2][0]:ret[2][1]+1]).all()
Compute unprojected astrocentric RV of a planet for a given JD in m/s.
INPUTS:
EXAMPLE:
jd = 2454693  # date: 2008/08/14
p = getobj('55 Cnc e')  # planet: 55 Cnc e
vp = rv(p, jd)
# returns vp ~ 1.47e5 [m/s]
The result will need to be multiplied by the sine of the inclination angle (i.e., “sin i”). Positive radial velocities are directed _AWAY_ from the observer.
To compute the barycentric radial velocity of the host star, scale the returned values by the mass ratio -m/(m+M).
Compute the most recently elapsed RV ephemeris of a given planet at a given JD. The RV ephemeris is defined by the planet having radial velocity equal to zero.
EXAMPLE:
from analysis import getobj, rveph
jd = 2454693  # date: 2008/08/14
p = getobj('55cnce')  # planet: 55 Cnc e
t = rveph(p, jd)
# returns t ~
SEE ALSO: getobj(), phase()
Compute radial velocity of a star which has an orbiting planet.
INPUTS:
EXAMPLE:
Positive radial velocities are directed _AWAY_ from the observer.
SEE_ALSO:
Run generic_mcmc() and scale the input stepsize to match the desired input acceptance rate.
INPUTS: mostly the same as for generic_mcmc(), but also with:
REQUIREMENTS: pylab (for pylab.interp())
for the specified data array/vector along the specified axis.
‘nsigma’ is used to reject outliers.
Output will be a scalar (axis is None) or numpy array, as appropriate.
Spline fit with sigma-clipping rejection, analogous to polyfitr().
Points whose fit residuals exceed s standard deviations are rejected and the fit is recalculated. The return value is a spline object.
Iterates so long as n_newrejections>0 AND n_iter<fev.
OPTIONAL INPUTS:
err: a set of errors for the data
fev: number of function evaluations to call before stopping
‘diag’nostic flag: return the tuple (p, chisq, n_iter)
clip: ‘both’ – remove outliers +/- ‘s’ sigma from fit
Compute the standard deviation in a sliding window.
INPUTS:
For now, we assume FILTERSIZE is odd, and that DATA is square!
filt = stdfilt2d(data, filtersize)
This is about the slowest way to do it, but it was easy to write.
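For the 1D case the brute-force idea reduces to this sketch (edge windows are simply truncated; the real routine's edge handling may differ):

import numpy as np

def stdfilt_sketch(data, filtersize):
    # Standard deviation in a sliding window of odd width 'filtersize'.
    data = np.asarray(data, dtype=float)
    half = filtersize // 2
    out = np.zeros(data.size)
    for i in range(data.size):
        lo, hi = max(0, i - half), min(data.size, i + half + 1)
        out[i] = data[lo:hi].std()
    return out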
Return the standard deviation of an array after removing outliers.
INPUTS: x – (array) data set to find std of
OPTIONAL INPUT:
nsigma – (float) number of standard deviations for clipping
niter – number of iterations
finite – if True, remove all non-finite elements (e.g. Inf, NaN)
axis – (int) axis along which to compute the standard deviation
EXAMPLE:
from numpy import *
from numpy.random import randn
from analysis import stdr
x = concatenate((randn(200),[1000]))
print std(x), stdr(x, nsigma=3)
x = concatenate((x,[nan,inf]))
print std(x), stdr(x, nsigma=3)
Compute the standard deviation in the residuals of a data series after average-binning by specified amounts.
INPUTS:
dataindex – 1D numpy array
REQUIREMENTS: Plotting and analysis tools, numpy
EXAMPLE:
import numpy as np
import analysis as an
import pylab as py
npts = 10000
t = np.arange(npts)
data = np.random.normal(size=npts)
binfactors = np.arange(1, npts/2.+1)
bindown_result = an.stdres(data, binfactors, dataindex=t, oversamp=1)
py.figure()
py.subplot(211)
py.plot(t, data, 'k')
py.xlabel('Time')
py.ylabel('Data value')
py.minorticks_on()
py.title('Bin-down test: Gaussian Noise')
py.subplot(212)
py.loglog(binfactors, bindown_result, '-b', linewidth=2)
py.loglog(binfactors, data.std()/np.sqrt(binfactors), '--r')
py.xlabel('Binning factor')
py.ylabel('RMS of binned data')
py.legend(['Binned RMS', '1/sqrt(N)'])
Test various methods of computing the eccentric anomaly.
ecc = scalar; manom = 1D NumPy array
from numpy import pi, linspace
import analysis as an
ecc = 0.15
p_orb = 3.3
mean_anom = 2*pi*linspace(0, p_orb, 10000)/p_orb
an.test_eccentric_anomaly(ecc, mean_anom, tol=1e-10)
Use Singular Value Decomposition to determine the Total Least Squares linear fit to the data (e.g., http://en.wikipedia.org/wiki/Total_least_squares).
data1 – x array
data2 – y array
print – print some information about what fraction of the variance is accounted for
ignore_nans – remove NaN values from BOTH arrays before computing
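The SVD recipe itself is short; a minimal sketch for the straight-line case (slope/intercept naming is mine, and the variance bookkeeping is only schematic):

import numpy as np

def tls_sketch(data1, data2):
    # Total-least-squares line through (data1, data2) via SVD.
    x, y = np.asarray(data1, float), np.asarray(data2, float)
    xm, ym = x.mean(), y.mean()
    M = np.column_stack((x - xm, y - ym))
    _, s, Vt = np.linalg.svd(M, full_matrices=False)
    a, b = Vt[-1]                         # normal vector of the best-fit line
    slope = -a / b
    intercept = ym - slope * xm
    frac_var = 1.0 - s[-1]**2 / (s**2).sum()   # fraction of variance accounted for
    return slope, intercept, frac_var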
Generate a line of text for Travis Barman’s planet table.
INPUT: a planet object from getobj().
Calculate (Keplerian, orbital) true anomaly.
One optional input must be given.
INPUT: ecc – scalar. orbital eccentricity.
OPTIONAL_INPUTS:
Perform a weighted mean along the specified axis.
INPUTS:
SEE ALSO:
Compute the (weighted) mean in a sliding window.
INPUTS:
Perform a weighted standard deviation along the specified axis. If axis=None, then the weighted standard deviation of the entire array is computed.
Note that this computes the _sample_ standard deviation; Numpy/Scipy compute the _population_ standard deviation by default, which is smaller by a factor of sqrt((N-1)/N). This effect is small for large datasets.
SEE ALSO: wmean()
Taken from http://en.wikipedia.org/wiki/Weighted_standard_deviation
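Following that reference, a minimal sketch of the weighted sample standard deviation (axis=None case only):

import numpy as np

def wstd_sketch(x, w):
    # Reduces to the ordinary sample std (ddof=1) when all weights are equal.
    x, w = np.asarray(x, float), np.asarray(w, float)
    wmean = (w * x).sum() / w.sum()
    var = (w * (x - wmean)**2).sum() * w.sum() / (w.sum()**2 - (w**2).sum())
    return np.sqrt(var)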