vec_math
Class GeneralLinearRegression

java.lang.Object
  extended by vec_math.GeneralLinearRegression

public class GeneralLinearRegression
extends Object

Base class for all general linear least square fits. It fits data points yi to their dependend variable xi with a general linear model, say

                 y(x) = ∑akXk(x),
    
where Xk(x) may be any function of x. Provides methods to derive linear parameters out of the design matrix. Non-abstract subclasses must provide Matrix getBasicFunctions. The basic functions is a matrix mij consiting of the function Xj evaluated on the abscisse values xi. Out of these the design Matrix is calculated via Aij = Xj(xi)/sig_i, with Xj(xi) the basic functions. Overriding the deriveDesignMatrix(Matrix basicfunction) and just setting the getBasicFunctions() to {} is perfectly legal, because only the design matrix is used in further calculations. The design matrix must have more rows than columns. The normal equations are defined via (A(T)A)a=A(T)b, where A is the design matrix, a is the solution vector and b is the measurement vector: yi/sig_i. The private method invertNormal() expects the design matrix to be set, and calculates the inverse of A(T)A. The diagonal elements of that matrix are the squared standard deviation of the solution vector, which in turn can be found by multiplying A(T)b with the inverse matrix. Lit: Num.Rec. p 671ff.

To use this class, one must set the base functions and the measurement, preferrable with their errors. This is the only constructor that this class provides. Then, one can readily call getSolution() etc.

From version 1.1, we also support vector-functions, i.e. Multidimensional. Here, you must call the constructor providing an array of Multidimensional functions instead of plain Functions, and the measurement times are replaced with an array of measurement dependencies. Note the babel of dimensions involved here: The array length of the Multidimensional is the number of parameters fitted. The length of the VectorG array giving the measurement dependencies is the number of parameters needed to fit a single measurement, e.g. if you have a model like

    yi=yi(ti, xi),
    
then your length of the input VectorG array has to be two. The number of individual measurements that are to be fitted is the VectorG.dimension() number of both, the input measurement dependencies and the VectorG.dimension() of the measurement vector.


Nested Class Summary
static class GeneralLinearRegression.PseudoLine
          A class to write a file with three columns, first being x between zero and one, y being a fit like y=kx+d+random, tird column being the random number denoted above as measurement error.
static class GeneralLinearRegression.StraightLine
          We fit data to a straight line.
 
Field Summary
private  Matrix design
          The design matrix, X_j(x_i).
private  QuadMatrix invert
          The inverted matrix of A^TċA (design matrix).
private  VectorG linfit
          The fitted parameters.
private  Matrix measure
          The measurement matrix, from the design matrix and the y_i vector.
private  VectorG modelfit
          The model fit to the data.
private  VectorG si
          The errors of the measurements or null if unknown.
private  VectorG sigfit
          The standard deviations of the fitted parameters.
private  VectorG yi
          The measurements, corresponding to the rows of the design matrix.
 
Constructor Summary
GeneralLinearRegression(Function[] fitting, VectorG x, VectorG y, VectorG sigma)
          Constructs a new general linear regression.
GeneralLinearRegression(Multidimensional[] fitting, VectorG[] x, VectorG y, VectorG sigma)
          Constructs a new general linear regression.
 
Method Summary
static VectorG convert(VectorG[] in, int index)
          For convenience to shift between different data formats.
protected  Matrix deriveDesignMatrix(Function[] base, VectorG xi)
          Calculates the design matrix from the basic functions and the measurement times.
protected  Matrix deriveDesignMatrix(Multidimensional[] base, VectorG[] xi)
          Calculates the design matrix from the basic functions and the measurement dependables.
private  Matrix deriveMeasureMatrix(Matrix desmat, VectorG y, VectorG sig)
          Sets the measure Matrix out of yi,si and design.
 double getChiSquare()
          Returns the chi square of the solution.
 QuadMatrix getCovarianceMatrix()
          Returns the covariance matrix.
 int getDataCount()
          Returns the number of data points we fit to the model.
 VectorG getModelFit()
          Returns the model fit to the data.
 int getParameterCount()
          Returns the number of parameters we fit for.
 double getQ()
          Returns the 1-p(N-M/2,chi^2/2) value.
 VectorG getResiduals()
          Returns the residuals.
 double getRms()
          Return the rms-value of the fit.
 VectorG getSigma()
          Returns the vector of the standard deviations of the linear fit parameters.
 VectorG getSolution()
          Returns the vector of the solving linear parameters that minimize chi^2 Sets the LinFit array and validFit.
static GeneralLinearRegression harmonic(VectorG t, VectorG y, VectorG sigma, double omega)
          Creates an offset sine/cosine wave to the data.
private  QuadMatrix invertNormal(Matrix desm, VectorG sig)
          Inverts the normal equations (A(T)A), therefore the design matrix must be valid.
static GeneralLinearRegression line(VectorG x, VectorG y, VectorG sigma)
          Creates a general lineare regression that is a fit to a straight line.
static GeneralLinearRegression parabola(VectorG x, VectorG y, VectorG sigma)
          Creates a general lineare regression that is a fit to a parabola.
static GeneralLinearRegression polynomial(int order, VectorG x, VectorG y, VectorG sigma)
          Creates a general lineare regression that is a fit to a polynominal of certain order.
 boolean setBasicFunctions(Function[] fitting, VectorG t)
          Daugther classes must interfere here.
 boolean setBasicFunctions(Multidimensional[] fitting, VectorG[] t)
          Daugther classes must interfere here.
 void setMeasurements(VectorG meas, VectorG error)
          Sets the measurement vector.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

yi

private VectorG yi
The measurements, corresponding to the rows of the design matrix.


si

private VectorG si
The errors of the measurements or null if unknown.


design

private Matrix design
The design matrix, X_j(x_i).


measure

private Matrix measure
The measurement matrix, from the design matrix and the y_i vector.


invert

private QuadMatrix invert
The inverted matrix of A^TċA (design matrix).


linfit

private VectorG linfit
The fitted parameters.


sigfit

private VectorG sigfit
The standard deviations of the fitted parameters.


modelfit

private VectorG modelfit
The model fit to the data.

Constructor Detail

GeneralLinearRegression

public GeneralLinearRegression(Function[] fitting,
                               VectorG x,
                               VectorG y,
                               VectorG sigma)
Constructs a new general linear regression. We must provide the base functions, i.e. the dependence of the fitted values on x, as well as the measured quantities yi. If the measurement errors are unknown, null can be supplied in the sigma parameter. To make things clear: In the formular:
       y(x) = ∑akXk(x),
       
the Xk(x) are the k functions supplied in base. The points at which we have measurements, xi are in the VectorG xi, with the corresponding measurements in the VectorG yi, precision VectorG si. The dimension of xi, yi, and si (if given) must be equal and higher then the length of the base functions array. The solution derived with getSolution() is an VectorG of dimension base.length, specifying the ak above. An estimation of the errors on the ak can be queried with getSigma().
To fit data to a straight line, supply a base array, where
       base[0] = 1
       base[1] = xi
       
for all xi.

Parameters:
fitting - An array of fitting functions.
x - The points where measures are taken
y - The measurements itself
sigma - An error estimate for the measurement or null.

GeneralLinearRegression

public GeneralLinearRegression(Multidimensional[] fitting,
                               VectorG[] x,
                               VectorG y,
                               VectorG sigma)
Constructs a new general linear regression. We must provide the base functions, which are Multidimensionals in this case. The supplied measurment dependables is thus an array of VectorGs, the array length must equal the dimensionality of the multidimensional functions, not the length of the fitting array (this length equals the dimension of the input vectors.)

Parameters:
fitting - An array of fitting functions.
x - The points where measures are taken
y - The measurements itself
sigma - An error estimate for the measurement or null.
Method Detail

line

public static GeneralLinearRegression line(VectorG x,
                                           VectorG y,
                                           VectorG sigma)
Creates a general lineare regression that is a fit to a straight line.

Parameters:
x - The points where measures are taken
y - The measurements itself
sigma - An error estimate for the measurement or null.

parabola

public static GeneralLinearRegression parabola(VectorG x,
                                               VectorG y,
                                               VectorG sigma)
Creates a general lineare regression that is a fit to a parabola. For best results, rescale x and y to their averages before calling this function (replace x/y with xi-).

Parameters:
x - The points where measures are taken
y - The measurements itself
sigma - An error estimate for the measurement or null.

polynomial

public static GeneralLinearRegression polynomial(int order,
                                                 VectorG x,
                                                 VectorG y,
                                                 VectorG sigma)
Creates a general lineare regression that is a fit to a polynominal of certain order. For best results, rescale x and y to their averages before calling this function (replace x/y with xi-).

Parameters:
order - The order of the polynomial, e.g. 3 for cubic.
x - The points where measures are taken
y - The measurements itself
sigma - An error estimate for the measurement or null.

harmonic

public static GeneralLinearRegression harmonic(VectorG t,
                                               VectorG y,
                                               VectorG sigma,
                                               double omega)
Creates an offset sine/cosine wave to the data. According to
       y = a0+a1*cos(omega*x)+a2*sin(omega*x)
       


convert

public static VectorG convert(VectorG[] in,
                              int index)
For convenience to shift between different data formats. GeneralLinearRegression expects all independent variable measures as a single vector, with a dimension equal to the number of measures. The same rule apply for the dependant variable. Sometimes, data is available as an array of #VectorGs, where the dependant and independant variable are at some distinct index. This method converts between these to data formats.

As an input, the array of all data measured is passed over, along with the index of the variable to extract. Returned is a single VectorG with the dimension equal to the array dimension, whose values are loaded with ret.get(n) = VectorG[n].get(index)


setBasicFunctions

public boolean setBasicFunctions(Function[] fitting,
                                 VectorG t)
Daugther classes must interfere here. This method must receive a matrix that hold the evaluation of the fitting functions X_j at all measurment values x_i. Note that this method should be called after the measurement has been set, otherwise the measurement error is ignored.

Returns:
False, if the design matrix could not be calculated.

setBasicFunctions

public boolean setBasicFunctions(Multidimensional[] fitting,
                                 VectorG[] t)
Daugther classes must interfere here. This method must receive a matrix that hold the evaluation of the fitting functions X_j at all measurment values x_i. Note that this method should be called after the measurement has been set, otherwise the measurement error is ignored.

Returns:
False, if the design matrix could not be calculated.

setMeasurements

public void setMeasurements(VectorG meas,
                            VectorG error)
Sets the measurement vector. If measurement errors are not known, null can be used as the second argument. This method may only be called after the design matrix has been calculated.

Parameters:
meas - The measured data point at values x_i
error - The measurement error.

getParameterCount

public int getParameterCount()
Returns the number of parameters we fit for. This equals the number of columns in the design matrix.


getDataCount

public int getDataCount()
Returns the number of data points we fit to the model. This equals the number of rows in the design matrix.


getSolution

public VectorG getSolution()
Returns the vector of the solving linear parameters that minimize chi^2 Sets the LinFit array and validFit.

Returns:
A vector of the minimizing coefficients.

getSigma

public VectorG getSigma()
Returns the vector of the standard deviations of the linear fit parameters. Sets the array LinSig and validSig if successful.

Returns:
double

getCovarianceMatrix

public QuadMatrix getCovarianceMatrix()
Returns the covariance matrix. The diagonal elements are the squares of the estimated sigmas on the parameters, while off-diagonal elements give a correlation between two parameters. In normal fitting problems, you expect these values to be small.


getModelFit

public VectorG getModelFit()
Returns the model fit to the data.


getResiduals

public VectorG getResiduals()
Returns the residuals. This is the original data minus the model fit.


getRms

public double getRms()
Return the rms-value of the fit. This is the square sum of the residuals devided by the number of data points and from this the square root: x_{\mathrm{rms}} = \sqrt {{1 \over n} \sum_{i=1}^{n} x_i^2} = \sqrt {{x_1^2 + x_2^2 + \cdots + x_n^2} \over n}


getChiSquare

public double getChiSquare()
Returns the chi square of the solution. This is the square sum of the residuals, each residual weighed with the sigma of the measurement.

Returns:
double

getQ

public double getQ()
Returns the 1-p(N-M/2,chi^2/2) value.

Returns:
double

deriveDesignMatrix

protected Matrix deriveDesignMatrix(Function[] base,
                                    VectorG xi)
Calculates the design matrix from the basic functions and the measurement times. If a sigma of the measurement is set later, the design matrix is not changed, but the new sigma is taken care of in calculating the solution.

Parameters:
base - The basic functions.
xi - Time or any single parameter needed for measurement model.
Returns:
The design matrix

deriveDesignMatrix

protected Matrix deriveDesignMatrix(Multidimensional[] base,
                                    VectorG[] xi)
Calculates the design matrix from the basic functions and the measurement dependables. If a sigma of the measurement is set later, the design matrix is not changed, but the new sigma is taken care of in calculating the solution.

Parameters:
base - The basic functions.
xi - The parameter needed for measurement model.
Returns:
The design matrix

deriveMeasureMatrix

private Matrix deriveMeasureMatrix(Matrix desmat,
                                   VectorG y,
                                   VectorG sig)
Sets the measure Matrix out of yi,si and design. If si == null a standard deviation of 1. is adopted for every measurement. The measurements with their error and the design matrix must be set on calling this method.


invertNormal

private QuadMatrix invertNormal(Matrix desm,
                                VectorG sig)
Inverts the normal equations (A(T)A), therefore the design matrix must be valid. If successful, validInvert is set.