vec_math
Class Statistic

java.lang.Object
  extended by vec_math.Statistic

public class Statistic
extends Object

Simple statistics package.


Nested Class Summary
static class Statistic.Column
          Reads the column of the given ascii file and prints out statistic info.
static class Statistic.Gauss
          Test the statistic with a randomly generated, normally distributed statistic.
static class Statistic.QQ
          Q-Q plot against the normal distribution, taking mu and the standard deviation from the column.
static class Statistic.ResidualComparator
          Sorts values according to their absolute residual from the given average.
(package private) static class Statistic.ValidArray
          Container class to allow return of an array with only the first Statistic.ValidArray.validlength elements being valid.
 
Field Summary
private static double ACCUR
           
private  Class<?> base
          If this statistic was on any other class than double, note this here.
private static double EPSILON
           
private  Vector2D medquart
          The median and the sigma from quartiles.
private  double[] minmax
          The minimum and maximum of the data set.
private  Map<Integer,Double> power
          The sum of the squared difference of the values minus average.
private  double sum
          After the first average retrieval, this holds the sum of the values.
private  Statistic.ValidArray val
          The relevant data.
 
Constructor Summary
Statistic()
          Constructs an empty statistic
Statistic(double[] x)
          Constructs a ready-to-use statistic object out of a primitive double array.
Statistic(int vi, DataCard[] val)
          Constructs a ready-to-use statistic object out of a DataCard array.
Statistic(int vi, List<DataCard> val)
          Constructs a ready-to-use statistic object out of a DataCard list.
Statistic(List<? extends Number> x)
          Constructs a ready-to-use statistic object out of a list of Number objects.
Statistic(List<? extends VectorG> val, int vi)
          Constructs a ready-to-use statistic object out of a VectorG list.
Statistic(Number[] x)
          Constructs a ready-to-use statistic object out of an array of Number objects.
Statistic(VectorG[] val, int vi)
          Constructs a ready-to-use statistic object out of a VectorG array.
 
Method Summary
private  Statistic.ValidArray bounds(double av, double lo, double hi)
          For efficiency reasons we return an array with the size of the input length, but copy only those values that are within the stated bounds.
 int ccdClip(double rn, double gain, double siglo, double sighi, int keep, boolean useav, int nmax)
          Iterative algorithm for rejecting ADUs in frames with a given read noise and gain factor, assuming the noise is typical CCD noise, i.e. a read-out noise plus Poisson noise from the ADU intensity.
private  double centralPowerSum(int n)
          Calculates the centralized power sum once the average has been calculated in the firstPass.
 double[] clip(double sigfac)
          Clips the data to return only values that are not more than the stated factor of standard deviations off the mean.
 double[] clip(double siglo, double sighi)
          Clips the data to return only values that are not more than the stated factor of standard deviations off the mean, with two different bounds for low and high sigma.
private  Statistic.ValidArray clipIt(double siglo, double sighi)
          For efficiency reasons we return an array with the size of the input length, but copy only those values that are within the stated bounds.
private  double firstPass()
          Calculates the first pass values, which are the sum of the data, its minimum and maximum.
static double fTestProb(Statistic s0, Statistic s1)
          The f-test measures the significance of differences in the variances of two sets.
static Statistic gaussian(int n, double av, double sigma)
          Creates a statistic whose values are Gaussian distributed about the given mean with the specified standard deviation.
private static double[] getAsPrimitiveArray(Number[] z)
          Converts an array of Doubles into an array of its primitive type.
 double getAverage()
          Returns the average of all valid data points.
static double getAverage(double[] val)
          Convenience.
 double getCentralMoment(int n)
          Returns the central power sum.
 double[] getData()
          Returns the valid data in the statistic.
 double getKurt()
          Returns the normalized kurtosis, which is zero for a normal distribution.
 double getMax()
          Returns the biggest of the valid values in this statistic.
 double getMedian()
          Returns the median of this data array, but copies it prior to sorting, so the original data is not destroyed.
static double getMedian(double[] a)
          Returns the median of an array.
 double getMin()
          Returns the smallest of the valid values in this statistic.
static Vector2D getMinMax(double[] val)
          Convenience.
 int getN()
          Returns the number of valid data points.
 double getNormalizedCentralMoment(int k)
          Returns the normalized sample moments.
 Class<?> getOriginalBase()
          Get the original class data.
 int getOutliers(double sigfac)
          Returns the number of outliers, that is the number of points that are farther from the mean than the stated multiple of standard deviations.
 int[] getOutliers(double[] sigfac)
          Returns the number of outliers for each factor, that is the number of points that are farther from the mean than the stated multiple of standard deviations.
 double getQuartileSigma()
          Returns the sigma estimated by quartiles of this data array, but copies it prior to sorting, so the original data is not destroyed.
static double getQuartileSigma(double[] a)
          Returns the sigma estimated by quartiles of an array.
 double getSkew()
          Returns the skew.
 double getSpan()
          Returns the difference between x_max and x_min.
 double getStandardDeviation()
          Returns the square root of the unbiased estimate of the variance, normally termed sample standard deviation.
 double getVariance()
          Returns the unbiased estimator of the variance which means it includes Bessel's correction factor of N/(N-1).
static double kolmogorovSmirnovDistance(double[] d, Function cdf)
          Returns the Kolmogorov-Smirnov distance against a known cumulative distribution function.
static double kolmogorovSmirnovNormal(double[] d, double av, double sigma)
          Returns the Kolmogorov-Smirnov distance against the normal distribution with the given average and standard deviation.
static double kolmogorovSmirnovPoisson(double[] d, double av)
          Returns the Kolmogorov-Smirnov distance against the Poisson distribution with the given average.
static double kolmogorovSmirnovProbability(double dist, double[] d)
          Any distance derived from Kolmogorov-Smirnov distances can be converted to a null-hypothesis significance here.
private static double ksProb(double lambda)
          Kolmogorov-Smirnov probability function, see Numerical Recipes, p. 626.
private static Vector2D medianQuartile(double[] a)
          Calculates median and a quartiles-comparable in one shot and returns them as a two dimensional vector, on zeroth index is median, on first index the quartile range.
 int photonClip(double gain, double siglo, double sighi, int keep, boolean useav, int nmax)
          Iterative algorithm for rejecting ADUs in frames that are photon-noise dominated.
private  int poissonClip(double rn, double gain, double aduweight, double siglo, double sighi, int keep, boolean useav, int nmax)
          Combine zero and ccd clipping by assuming sigma to scale with the stated factor to the average/median ADUs.
private  int pushBack(int totreject, int keep, double av)
          We have too few values remaining.
 int reject(int nmax, double siglo, double sighi)
          Reject until no more points are rejected or the maximum number of loops has been executed or the total number of points is two.
 int reject(int nmax, double siglo, double sighi, int minretain)
          Reject until no more points are rejected, the maximum number of loops has been executed, or the total number of points is minretain, which is forced to be at least two.
 void setData(double[] x)
          Sets the values a statistic should be derived from.
private  void setData(Statistic.ValidArray v)
          For internal access, we directly set the valid array and erase power and summing information.
 void setOriginalBase(Class<?> org)
          Sets the marker class that we converted something of this class type to primitive double during statistic construction.
 String toString()
          Prints out the average plus the standard deviation.
static double tTest(Statistic s0, Statistic s1)
          We test whether the means of two statistics differ significantly, assuming their variances to be alike, using Student's t-test.
static double tTestCovariant(Statistic s0, Statistic s1)
          For two statistics with paired samples.
static double tTestCovariantProb(Statistic s0, Statistic s1)
          For two statistics with paired samples.
static double tTestProb(Statistic s0, Statistic s1)
          From the value of the student's t-test above, we calculate the significance of the differences of the two means.
static double tTestVariance(Statistic s0, Statistic s1)
          We test two statistics with significantly different variances for significant differences in their mean.
static double tTestVarianceNoF(Statistic s0, Statistic s1)
          We test two statistics with significantly different variances for significant differences in their mean.
static double tTestVarianceProb(Statistic s0, Statistic s1)
           
 int zeroClip(double rn, double gain, double siglo, double sighi, int keep, boolean useav, int nmax)
          Iterative algorithm for rejecting ADUs in frames with a given read noise and gain factor, assuming the noise to be read-out dominated.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

EPSILON

private static final double EPSILON
See Also:
Constant Field Values

ACCUR

private static final double ACCUR
See Also:
Constant Field Values

base

private Class<?> base
If this statistic was on any other class than double, note this here.


val

private Statistic.ValidArray val
The relevant data.


sum

private double sum
After the first average retrieval, this holds the sum of the values.


power

private Map<Integer,Double> power
The sum of the squared difference of the values minus average.


minmax

private double[] minmax
The minimum and maximum of the data set.


medquart

private Vector2D medquart
The median and the sigma from quartiles.

Constructor Detail

Statistic

public Statistic()
Constructs an empty statistic


Statistic

public Statistic(double[] x)
Constructs a ready-to-use statistic object out of a primitive double array. Note that the time-consuming calculations are only done at a call to the appropriate method, e.g. getAverage().

Parameters:
x - The data set.

Statistic

public Statistic(List<? extends Number> x)
Constructs a ready-to-use statistic object out of a list of Number objects. Note that the time-consuming calculations are only done at a call to the appropriate method, e.g. getAverage().

Parameters:
x - The data set.

Statistic

public Statistic(Number[] x)
Constructs a ready-to-use statistic object out of an array of Number objects. Note that the time-consuming calculations are only done at a call to the appropriate method, e.g. getAverage().

Parameters:
x - The data set.

Statistic

public Statistic(List<? extends VectorG> val,
                 int vi)
Constructs a ready-to-use statistic object out of a VectorG list. The values are contained in the first index given, while the second one gives the weights or is -1 if no index carries the weights. Calculations are only done at a call to the appropriate method, e.g. getAverage().

Parameters:
val - The data set with values and weights at some index.
vi - The index in a single VectorG that has the data.

Statistic

public Statistic(VectorG[] val,
                 int vi)
Constructs a ready-to-use statistic object out of a VectorG array. The values are contained in the first index given, while the second one gives the weights or is -1 if no index carries the weights. Calculations are only done at a call to the appropriate method, e.g. getAverage().

Parameters:
val - The data set with values and weights at some index.
vi - The index in a single VectorG that has the data.

Statistic

public Statistic(int vi,
                 List<DataCard> val)
Constructs a ready-to-use statistic object out of a DataCard list. The values are contained in the first index given, while the second one gives the weights or is -1 if no index carries the weights. Calculations are only done at a call to the appropriate method, e.g. getAverage().

Parameters:
val - The data set with values and weights at some index.
vi - The index in a single DataCard that has the data.

Statistic

public Statistic(int vi,
                 DataCard[] val)
Constructs a ready-to-use statistic object out of a DataCard array. The values are contained in the first index given, while the second one gives the weights or is -1 if no index carries the weights. Calculations are only done at a call to the appropriate method, e.g. getAverage().

Parameters:
val - The data set with values and weights at some index.
vi - The index in a single DataCard that has the data.
Method Detail

getOriginalBase

public Class<?> getOriginalBase()
Get the original class data.


setOriginalBase

public void setOriginalBase(Class<?> org)
Sets the marker class that we converted something of this class type to primitive double during statistic construction.


gaussian

public static Statistic gaussian(int n,
                                 double av,
                                 double sigma)
Creates a statistic whose values are Gaussian distributed about the given mean with the specified standard deviation. Can be used for the various t- , f- etc. tests versus a normal distribution.
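
A minimal sketch of how such a sample could be generated, assuming java.util.Random as the source of normal deviates. The class GaussianSampleSketch and the seed parameter are hypothetical and not part of vec_math; the actual implementation may draw its values differently.

```java
import java.util.Random;

/** Hypothetical helper, not part of vec_math. */
final class GaussianSampleSketch {
    /** Draws n values from a normal distribution with mean av and standard deviation sigma. */
    static double[] gaussian(int n, double av, double sigma, long seed) {
        Random rng = new Random(seed);                 // fixed seed keeps the sketch reproducible
        double[] x = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = av + sigma * rng.nextGaussian();    // scale and shift a standard normal deviate
        }
        return x;
    }
}
```

The resulting array could then be handed to the Statistic(double[]) constructor listed above.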


setData

public void setData(double[] x)
Sets the values a statistic should be derived from. Assumes all elements valid.

Parameters:
x - The data set.

setData

private void setData(Statistic.ValidArray v)
For internal access, we directly set the valid array and erase power and summing information.


getData

public double[] getData()
Returns the valid data in the statistic.


firstPass

private double firstPass()
Calculates the first pass values, which are the sum of the data, its minimum and maximum.


centralPowerSum

private double centralPowerSum(int n)
Calculates the centralized power sum once the average has been calculated in the firstPass. Updates the lookup table.


getMin

public double getMin()
Returns the smallest of the valid values in this statistic.


getMax

public double getMax()
Returns the biggest of the valid values in this statistic.


getSpan

public double getSpan()
Returns the difference between x_max and x_min.


getN

public int getN()
Returns the number of valid data points.


getAverage

public double getAverage()
Returns the average of all valid data points. This is an unbiased estimator for the population mean this sample has been drawn from.

Returns:
double

getAverage

public static double getAverage(double[] val)
Convenience. Assumes all data valid.


medianQuartile

private static Vector2D medianQuartile(double[] a)
Calculates the median and a quartile-based measure in one shot and returns them as a two-dimensional vector: the zeroth index holds the median, the first index the quartile range. It is not truly the quartile range; instead we return a range that would equal the standard deviation if the input array is normally distributed.
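
The scaling described above can be sketched as follows. For a normal distribution the interquartile range equals about 1.349 standard deviations, so dividing by that constant gives a sigma-like value. The class QuartileSigmaSketch, the linear-interpolation quantile, and the choice to copy the input are assumptions of this sketch; the private method may compute its quartiles differently.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of vec_math. */
final class QuartileSigmaSketch {
    /** Quantile by linear interpolation on an ascending sorted array. */
    static double quantile(double[] sorted, double p) {
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    /** Returns {median, (Q3 - Q1) / 1.3489795}; works on a sorted copy of the input. */
    static double[] medianAndSigma(double[] a) {
        double[] s = a.clone();
        Arrays.sort(s);
        double iqr = quantile(s, 0.75) - quantile(s, 0.25);
        // 1.3489795... is the interquartile range of a unit normal distribution
        return new double[] { quantile(s, 0.5), iqr / 1.3489795003921634 };
    }
}
```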


getMedian

public static double getMedian(double[] a)
Returns the median of an array. Array is destroyed (resorted).


getQuartileSigma

public static double getQuartileSigma(double[] a)
Returns the sigma estimated by quartiles of an array. Array is destroyed (resorted).


getMedian

public double getMedian()
Returns the median of this data array, but copies it prior to sorting, so the original data is not destroyed.


getQuartileSigma

public double getQuartileSigma()
Returns the sigma estimated by quartiles of this data array, but copies it prior to sorting, so the original data is not destroyed.


getMinMax

public static Vector2D getMinMax(double[] val)
Convenience.


getVariance

public double getVariance()
Returns the unbiased estimator of the variance, which means it includes Bessel's correction factor of N/(N-1). Note that, although this is an unbiased estimator for the population variance, its square root is not an unbiased estimator for the population standard deviation; it tends to underestimate the standard deviation.
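
The estimator described above can be illustrated with a two-pass computation: the sum of squared residuals divided by N-1, which is the same as the 1/N variance multiplied by N/(N-1). The class VarianceSketch is a hypothetical stand-in operating on a plain double[], not the class's own code.

```java
/** Hypothetical helper, not part of vec_math: two-pass sample variance. */
final class VarianceSketch {
    /** Arithmetic mean of the array. */
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    /** Unbiased variance: sum of squared residuals divided by N-1. */
    static double variance(double[] x) {
        if (x.length < 2) throw new IllegalArgumentException("need at least two values");
        double av = mean(x);
        double ss = 0.0;
        for (double v : x) ss += (v - av) * (v - av);
        return ss / (x.length - 1);
    }
}
```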


getStandardDeviation

public double getStandardDeviation()
Returns the square root of the unbiased estimate of the variance, normally termed sample standard deviation. This is not an unbiased estimator for the population standard deviation; it tends to underestimate it.


getSkew

public double getSkew()
Returns the skew.


getKurt

public double getKurt()
Returns the normalized kurtosis, which is zero for a normal distribution.
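
As a sketch of what skew and normalized kurtosis mean in terms of central moments, assuming 1/N normalisation of the moments (the class may apply a different sample correction). MomentSketch and its method names are hypothetical.

```java
/** Hypothetical helper, not part of vec_math. */
final class MomentSketch {
    /** k-th central moment with 1/N normalisation (an assumption of this sketch). */
    static double centralMoment(double[] x, int k) {
        double av = 0.0;
        for (double v : x) av += v;
        av /= x.length;
        double m = 0.0;
        for (double v : x) m += Math.pow(v - av, k);
        return m / x.length;
    }

    /** Skew: third central moment divided by the second to the power 3/2. */
    static double skew(double[] x) {
        return centralMoment(x, 3) / Math.pow(centralMoment(x, 2), 1.5);
    }

    /** Kurtosis minus 3, so that a normal distribution gives zero. */
    static double excessKurtosis(double[] x) {
        double m2 = centralMoment(x, 2);
        return centralMoment(x, 4) / (m2 * m2) - 3.0;
    }
}
```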


getCentralMoment

public double getCentralMoment(int n)
Returns the central power sum.


getNormalizedCentralMoment

public double getNormalizedCentralMoment(int k)
Returns the normalized sample moments.


getOutliers

public int getOutliers(double sigfac)
Returns the number of outliers, that is the number of points that are farther from the mean than the stated multiple of standard deviations.


getOutliers

public int[] getOutliers(double[] sigfac)
Returns the number of outliers for each factor, that is the number of points that are farther from the mean than the stated multiple of standard deviations.


clip

public double[] clip(double sigfac)
Clips the data to return only values that are not more than the stated factor of standard deviations off the mean. The original array this statistic was constructed with is unaltered.


clip

public double[] clip(double siglo,
                     double sighi)
Clips the data to return only values that are not more than the stated factor of standard deviations off the mean, with two different bounds for low and high sigma. The original array this statistic was constructed with is unaltered.
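
A single clipping pass of this kind might look like the sketch below, assuming the mean and the Bessel-corrected standard deviation define the bounds. SigmaClipSketch is a hypothetical class working on a plain double[]; whether the bounds are inclusive in the real method is not stated in the source.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of vec_math. */
final class SigmaClipSketch {
    /** Keeps values v with av - siglo*sd <= v <= av + sighi*sd; the input is untouched. */
    static double[] clip(double[] x, double siglo, double sighi) {
        double av = Arrays.stream(x).average().orElse(Double.NaN);
        double ss = 0.0;
        for (double v : x) ss += (v - av) * (v - av);
        double sd = Math.sqrt(ss / (x.length - 1));
        double lo = av - siglo * sd;
        double hi = av + sighi * sd;
        // filter() builds a new array, so the caller's data is not modified
        return Arrays.stream(x).filter(v -> v >= lo && v <= hi).toArray();
    }
}
```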


toString

public String toString()
Prints out the average plus the standard deviation.

Overrides:
toString in class Object

clipIt

private Statistic.ValidArray clipIt(double siglo,
                                    double sighi)
For efficiency reasons we return an array with the size of the input length, but copy only those values that are within the stated bounds. The number of copied elements is the number of valid array elements.

Use this in re-entrant versions.


bounds

private Statistic.ValidArray bounds(double av,
                                    double lo,
                                    double hi)
For efficiency reasons we return an array with the size of the input length, but copy only those values that are within the stated bounds. The number of copied elements is the number of valid array elements.

Use this in re-entrant versions.


reject

public int reject(int nmax,
                  double siglo,
                  double sighi)
Reject until no more points are rejected or the maximum number of loops has been executed or the total number of points is two.


reject

public int reject(int nmax,
                  double siglo,
                  double sighi,
                  int minretain)
Reject until no more points are rejected, the maximum number of loops has been executed, or the total number of points is minretain, which is forced to be at least two.
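
The three stopping conditions named above can be sketched as a loop around a single clipping pass. Everything here is an assumption for illustration: the class IterativeRejectSketch is hypothetical, it returns the surviving values rather than the int the real method returns, and it simply discards a pass that would leave fewer than minretain points.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of vec_math. */
final class IterativeRejectSketch {
    /** One clipping pass around the mean with the Bessel-corrected deviation. */
    static double[] pass(double[] x, double siglo, double sighi) {
        double av = Arrays.stream(x).average().orElse(Double.NaN);
        double ss = 0.0;
        for (double v : x) ss += (v - av) * (v - av);
        double sd = Math.sqrt(ss / (x.length - 1));
        return Arrays.stream(x)
                     .filter(v -> v >= av - siglo * sd && v <= av + sighi * sd)
                     .toArray();
    }

    /** Repeats passes and returns the surviving values (assumed behaviour). */
    static double[] reject(double[] x, int nmax, double siglo, double sighi, int minretain) {
        int floor = Math.max(2, minretain);          // at least two points must remain
        double[] cur = x.clone();
        for (int loop = 0; loop < nmax && cur.length > floor; loop++) {
            double[] next = pass(cur, siglo, sighi);
            if (next.length == cur.length) break;    // nothing rejected: converged
            if (next.length < floor) break;          // assumption: drop a pass that cuts too deep
            cur = next;
        }
        return cur;
    }
}
```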


zeroClip

public int zeroClip(double rn,
                    double gain,
                    double siglo,
                    double sighi,
                    int keep,
                    boolean useav,
                    int nmax)
Iterative algorithm for rejecting ADUs in frames with a given read noise and gain factor, assuming the noise to be read-out dominated. This is normally used for bias frames. From the stated read noise and gain we calculate the expected variance in ADUs via σ²=(RN/gain)²; all values below siglo*sigma or above sighi*sigma around the average/median are rejected. In each iteration, the average/median is recalculated, but sigma is kept fixed.


ccdClip

public int ccdClip(double rn,
                   double gain,
                   double siglo,
                   double sighi,
                   int keep,
                   boolean useav,
                   int nmax)
Iterative algorithm for rejecting ADUs in frames with a given read noise and gain factor, assuming the noise is typical CCD noise, i.e. a read-out noise plus Poisson noise from the ADU intensity. From the stated read noise and gain we calculate the expected variance in ADUs via σ²=(RN/gain)²+ADU_0/gain; all values below siglo*sigma or above sighi*sigma around the average/median (ADU_0) are rejected. In each iteration, the average/median and thus sigma is recalculated.

Parameters:
rn - Read noise in electrons
gain - Per electron we 'gain' this number of ADUs. From ADUs, we get the number of electrons by division through the gain.
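
The noise model stated for ccdClip can be written out as a small sketch. It evaluates the formula exactly as given in the description (read-noise term plus ADU_0/gain); the class CcdNoiseSketch, the clamping of negative ADU levels, and the strict inequalities are assumptions and not taken from the source.

```java
/** Hypothetical helper, not part of vec_math. */
final class CcdNoiseSketch {
    /** Expected sigma in ADU: sqrt((rn/gain)^2 + adu0/gain), as in the description. */
    static double sigma(double rn, double gain, double adu0) {
        double readTerm = (rn / gain) * (rn / gain);
        double photonTerm = Math.max(adu0, 0.0) / gain;   // assumption: clamp negative levels
        return Math.sqrt(readTerm + photonTerm);
    }

    /** True if v lies outside [adu0 - siglo*sigma, adu0 + sighi*sigma]. */
    static boolean isRejected(double v, double adu0, double sigma, double siglo, double sighi) {
        return v < adu0 - siglo * sigma || v > adu0 + sighi * sigma;
    }
}
```

Setting the photon term to zero gives the fixed sigma described for zeroClip, and dropping the read-noise term gives the photon-noise case described for photonClip.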

photonClip

public int photonClip(double gain,
                      double siglo,
                      double sighi,
                      int keep,
                      boolean useav,
                      int nmax)
Iterative algorithm for rejecting ADUs in frames that are photon-noise dominated. For the IRAF avsigclip algorithm, the gain supplied is an estimated gain-per-line, which can be computed in FitsStatistic. This gain is fixed for a line, and the sigma in the photon-noise limited scenario is σ²=ADU_0/gain, where again ADU_0 is the average/median of the ADU on x/y of all the images. In each iteration, the average/median and thus sigma is recalculated, but the gain stays constant.


poissonClip

private int poissonClip(double rn,
                        double gain,
                        double aduweight,
                        double siglo,
                        double sighi,
                        int keep,
                        boolean useav,
                        int nmax)
Combine zero and ccd clipping by assuming sigma to scale with the stated factor to the average/median ADUs.


pushBack

private int pushBack(int totreject,
                     int keep,
                     double av)
We have too few values remaining. We push back those with the lowest absolute residual until we have at least keep values remaining. We continue pushing back values with identical absolute residual to the last one added.


kolmogorovSmirnovDistance

public static double kolmogorovSmirnovDistance(double[] d,
                                               Function cdf)
Returns the Kolmogorov-Smirnov distance against a known cumulative distribution function. Input array is not destroyed.

Parameters:
d - The sample distribution
cdf - The cumulative distribution function, e.g. 0.5*(1+erf((x-μ)/sqrt(2σ²))) for the normal distribution
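
A minimal sketch of a Kolmogorov-Smirnov distance against a known cumulative distribution function. It uses java.util.function.DoubleUnaryOperator as a stand-in for the Function type in the signature above, whose interface is not shown here; KsDistanceSketch is a hypothetical class.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper, not part of vec_math. */
final class KsDistanceSketch {
    /** Largest gap between the empirical step function of d and the given cdf. */
    static double distance(double[] d, DoubleUnaryOperator cdf) {
        double[] s = d.clone();                   // sort a copy so the input stays intact
        Arrays.sort(s);
        int n = s.length;
        double dist = 0.0;
        for (int i = 0; i < n; i++) {
            double f = cdf.applyAsDouble(s[i]);
            double below = (double) i / n;        // empirical cdf just before s[i]
            double above = (double) (i + 1) / n;  // empirical cdf at s[i]
            dist = Math.max(dist, Math.max(Math.abs(f - below), Math.abs(above - f)));
        }
        return dist;
    }
}
```

Both sides of each step are checked because the empirical distribution jumps at every data point.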

kolmogorovSmirnovNormal

public static double kolmogorovSmirnovNormal(double[] d,
                                             double av,
                                             double sigma)
Returns the Kolmogorov-Smirnov distance against the normal distribution with the given average and standard deviation. Input array is not destroyed. The cumulative distribution function is 0.5*(1+erf((x-μ)/sqrt(2σ²))) for the normal distribution's average μ and standard deviation σ.

Parameters:
d - The sample distribution.

kolmogorovSmirnovPoisson

public static double kolmogorovSmirnovPoisson(double[] d,
                                              double av)
Returns the Kolmogorov-Smirnov distance against the Poisson distribution with the given average. Input array is not destroyed. The cumulative distribution function is Γ(Math.floor(t+1),av)/Math.floor(t)!.

Parameters:
d - The sample distribution.

kolmogorovSmirnovProbability

public static double kolmogorovSmirnovProbability(double dist,
                                                  double[] d)
Any distance derived from Kolmogorov-Smirnov distances can be converted to a null-hypothesis significance here. We return the probability that a distance as high as or higher than the argument may be encountered if our sample was drawn from the underlying statistic.


ksProb

private static double ksProb(double lambda)
Kolmogorov-Smirnov probability function, see Numerical Recipes, p. 626. To get from a KS-distance d to a probability p that a sample distance is higher than the observed one, call this method with (0.12+sqrt(n)+0.11/sqrt(n))*d.
 ks(λ)=2Σ_i (-1)^(i-1)*exp(-2i²λ²)
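
The alternating series above can be summed directly, as in the following sketch. The cut-off for small λ, the term limit, and the convergence threshold are assumptions chosen for illustration; KsProbSketch is hypothetical and not the private method itself.

```java
/** Hypothetical helper, not part of vec_math. */
final class KsProbSketch {
    /** Evaluates 2 * sum_{i>=1} (-1)^(i-1) * exp(-2 i^2 lambda^2), clamped to [0, 1]. */
    static double ksProb(double lambda) {
        if (lambda < 0.2) return 1.0;   // assumption: below this the value is 1 to about 1e-12
        double sum = 0.0;
        double sign = 1.0;
        for (int i = 1; i <= 100; i++) {
            double term = sign * Math.exp(-2.0 * i * i * lambda * lambda);
            sum += term;
            if (Math.abs(term) < 1e-16) break;   // later terms no longer change the sum
            sign = -sign;
        }
        return Math.min(1.0, Math.max(0.0, 2.0 * sum));
    }

    /** Converts a KS distance for n points using the argument given in the description. */
    static double significance(double dist, int n) {
        double rn = Math.sqrt(n);
        return ksProb((0.12 + rn + 0.11 / rn) * dist);
    }
}
```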
 


tTest

public static double tTest(Statistic s0,
                           Statistic s1)
We test whether the means of two statistics differ significantly, assuming their variances to be alike, using Student's t-test. To calculate a probability out of this t-value, use Gamma.betai(1/2*(n1+n2-2),0.5,(n1+n2-2)/(n1+n2-2+t*t))
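
For reference, the textbook form of the equal-variance t value is sketched below on plain arrays. PooledTTestSketch is a hypothetical class; the sign convention (first mean minus second) is an assumption, and the probability step via the incomplete beta function is left out.

```java
/** Hypothetical helper, not part of vec_math. */
final class PooledTTestSketch {
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    /** Bessel-corrected sample variance. */
    static double variance(double[] x) {
        double av = mean(x);
        double ss = 0.0;
        for (double v : x) ss += (v - av) * (v - av);
        return ss / (x.length - 1);
    }

    /** Student's t with a pooled variance and na + nb - 2 degrees of freedom. */
    static double t(double[] a, double[] b) {
        int na = a.length;
        int nb = b.length;
        double pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2);
        return (mean(a) - mean(b)) / Math.sqrt(pooled * (1.0 / na + 1.0 / nb));
    }
}
```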


tTestProb

public static double tTestProb(Statistic s0,
                               Statistic s1)
From the value of the student's t-test above, we calculate the significance of the differences of the two means. Small numbers (0.05 and less) indicate that the difference in the means of the two statistics is very significant, thus ruling out that the two statistics have the same mean.


tTestVariance

public static double tTestVariance(Statistic s0,
                                   Statistic s1)
We test two statistics with significantly different variances for significant differences in their mean. The assumed number of degrees of freedom can be estimated with tTestVarianceNoF(vec_math.Statistic, vec_math.Statistic), which returns a double that can be fed into Gamma.betai(1/2*NoF,0.5,NoF/(NoF+t*t))
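
The unequal-variance case is commonly handled with Welch's t value and the Welch-Satterthwaite estimate of the degrees of freedom, which is generally not an integer. The sketch below shows that textbook form; WelchTTestSketch is hypothetical, and it is an assumption that the methods here use exactly these expressions.

```java
/** Hypothetical helper, not part of vec_math. */
final class WelchTTestSketch {
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    /** Bessel-corrected sample variance. */
    static double variance(double[] x) {
        double av = mean(x);
        double ss = 0.0;
        for (double v : x) ss += (v - av) * (v - av);
        return ss / (x.length - 1);
    }

    /** t value without pooling the variances. */
    static double t(double[] a, double[] b) {
        double se2 = variance(a) / a.length + variance(b) / b.length;
        return (mean(a) - mean(b)) / Math.sqrt(se2);
    }

    /** Welch-Satterthwaite degrees of freedom. */
    static double degreesOfFreedom(double[] a, double[] b) {
        double qa = variance(a) / a.length;
        double qb = variance(b) / b.length;
        return (qa + qb) * (qa + qb)
                / (qa * qa / (a.length - 1) + qb * qb / (b.length - 1));
    }
}
```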


tTestVarianceNoF

public static double tTestVarianceNoF(Statistic s0,
                                      Statistic s1)
We test two statistics with significantly different variances for significant differences in their mean. The assumed number of degrees of freedom can be estimated with tTestVarianceNoF(vec_math.Statistic, vec_math.Statistic), which returns a double that can be fed into Gamma.betai(1/2*NoF,0.5,NoF/(NoF+t*t))


tTestVarianceProb

public static double tTestVarianceProb(Statistic s0,
                                       Statistic s1)

tTestCovariant

public static double tTestCovariant(Statistic s0,
                                    Statistic s1)
For two statistics with paired samples. The number of degrees of freedom here is the number of pairs (=s0.getN())-1.
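
A paired test can be sketched by working on the per-pair differences, which is algebraically the same as correcting the two variances by their covariance. PairedTTestSketch is a hypothetical class on plain arrays, and the sign convention is an assumption.

```java
/** Hypothetical helper, not part of vec_math. */
final class PairedTTestSketch {
    /** t value of the mean pair difference, with n - 1 degrees of freedom. */
    static double t(double[] a, double[] b) {
        if (a.length != b.length) throw new IllegalArgumentException("samples must be paired");
        int n = a.length;
        double avDiff = 0.0;
        for (int i = 0; i < n; i++) avDiff += a[i] - b[i];
        avDiff /= n;
        double ss = 0.0;
        for (int i = 0; i < n; i++) {
            double r = (a[i] - b[i]) - avDiff;   // residual of this pair's difference
            ss += r * r;
        }
        double varDiff = ss / (n - 1);           // Bessel-corrected variance of the differences
        return avDiff / Math.sqrt(varDiff / n);
    }
}
```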


tTestCovariantProb

public static double tTestCovariantProb(Statistic s0,
                                        Statistic s1)
For two statistics with paired samples. The number of degrees of freedom here is the number of pairs (=s0.getN())-1; we return Gamma.betai(1/2*NoF,0.5,NoF/(NoF+t*t))


fTestProb

public static double fTestProb(Statistic s0,
                               Statistic s1)
The f-test measures the significance of differences in the variances of two sets. As it tests the ratio of the variances, values of either f >> 1 or f << 1 indicate significant differences.


getAsPrimitiveArray

private static final double[] getAsPrimitiveArray(Number[] z)
Converts an array of Doubles into an array of its primitive type.