Class AbstractMultipleLinearRegression

java.lang.Object
org.hipparchus.stat.regression.AbstractMultipleLinearRegression
All Implemented Interfaces:
MultipleLinearRegression
Direct Known Subclasses:
GLSMultipleLinearRegression, OLSMultipleLinearRegression

public abstract class AbstractMultipleLinearRegression extends Object implements MultipleLinearRegression
Abstract base class for implementations of MultipleLinearRegression.
  • Constructor Details

    • AbstractMultipleLinearRegression

      public AbstractMultipleLinearRegression()
      Empty constructor.

      This constructor is not strictly necessary, but it prevents spurious javadoc warnings with JDK 18 and later.

      Since:
      3.0
  • Method Details

    • getX

      protected RealMatrix getX()
      Get the X sample data.
      Returns:
      the X sample data.
    • getY

      protected RealVector getY()
      Get the Y sample data.
      Returns:
      the Y sample data.
    • isNoIntercept

      public boolean isNoIntercept()
      Check if the model has no intercept term.
      Returns:
      true if the model has no intercept term; false otherwise
    • setNoIntercept

      public void setNoIntercept(boolean noIntercept)
      Set intercept flag.
      Parameters:
      noIntercept - true means the model is to be estimated without an intercept term
    • newSampleData

      public void newSampleData(double[] data, int nobs, int nvars)

      Loads model x and y sample data from a flat input array, overriding any previous sample.

      Assumes that rows are concatenated with y values first in each row. For example, an input data array containing the sequence of values (1, 2, 3, 4, 5, 6, 7, 8, 9) with nobs = 3 and nvars = 2 creates a regression dataset with two independent variables, as below:

         y   x[0]  x[1]
         --------------
         1     2     3
         4     5     6
         7     8     9
       

      Note that there is no need to add an initial unitary column (column of 1s) when specifying a model including an intercept term. If isNoIntercept() is true, the X matrix is created without an initial column of 1s; otherwise this column is added.

      An exception is thrown (see Throws below) if any of the following preconditions fails:

      • data is not null
      • data.length = nobs * (nvars + 1)
      • nobs > nvars
      Parameters:
      data - input data array
      nobs - number of observations (rows)
      nvars - number of independent variables (columns, not counting y)
      Throws:
      NullArgumentException - if the data array is null
      MathIllegalArgumentException - if the length of the data array is not equal to nobs * (nvars + 1)
      MathIllegalArgumentException - if nobs is less than nvars + 1
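
      The flat layout described above can be sketched in plain Java. This is only an illustration of the documented row layout (y first, then the x values of the same observation), not the library's internal code; the class name FlatSampleLayout and its methods are hypothetical, and IllegalArgumentException stands in for the library's own exception types.

```java
import java.util.Arrays;

/** Illustrative only: mirrors the documented flat layout, not the library's code. */
final class FlatSampleLayout {

    /** Extracts the y values: the first entry of each row of (nvars + 1) values. */
    static double[] extractY(double[] data, int nobs, int nvars) {
        checkLength(data, nobs, nvars);
        double[] y = new double[nobs];
        for (int i = 0; i < nobs; i++) {
            y[i] = data[i * (nvars + 1)];
        }
        return y;
    }

    /** Extracts the independent variables, without any intercept column. */
    static double[][] extractX(double[] data, int nobs, int nvars) {
        checkLength(data, nobs, nvars);
        double[][] x = new double[nobs][nvars];
        for (int i = 0; i < nobs; i++) {
            // Skip the leading y value of row i, then copy its nvars x values.
            System.arraycopy(data, i * (nvars + 1) + 1, x[i], 0, nvars);
        }
        return x;
    }

    /** Stand-in for the documented length precondition. */
    private static void checkLength(double[] data, int nobs, int nvars) {
        if (data.length != nobs * (nvars + 1)) {
            throw new IllegalArgumentException("data.length must equal nobs * (nvars + 1)");
        }
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        System.out.println(Arrays.toString(extractY(data, 3, 2)));     // [1.0, 4.0, 7.0]
        System.out.println(Arrays.deepToString(extractX(data, 3, 2))); // [[2.0, 3.0], [5.0, 6.0], [8.0, 9.0]]
    }
}
```

      With the example sequence (1, ..., 9), nobs = 3 and nvars = 2, the sketch reproduces the y and x columns of the table shown above.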
    • newYSampleData

      protected void newYSampleData(double[] y)
      Loads new y sample data, overriding any previous data.
      Parameters:
      y - the array representing the y sample
      Throws:
      NullArgumentException - if y is null
      MathIllegalArgumentException - if y is empty
    • newXSampleData

      protected void newXSampleData(double[][] x)

      Loads new x sample data, overriding any previous data.

      The input x array should have one row for each sample observation, with columns corresponding to independent variables. For example, if

        x = new double[][] {{1, 2}, {3, 4}, {5, 6}} 

      then newXSampleData(x) results in a model with two independent variables and three observations:

         x[0]  x[1]
         ----------
           1    2
           3    4
           5    6
       

      Note that there is no need to add an initial unitary column (column of 1's) when specifying a model including an intercept term.

      Parameters:
      x - the rectangular array representing the x sample
      Throws:
      NullArgumentException - if x is null
      MathIllegalArgumentException - if x is empty
      MathIllegalArgumentException - if x is not rectangular
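
      To make the role of the intercept column concrete, the following plain-Java sketch builds a design matrix from a rectangular x array, prepending a column of 1s unless the model has no intercept. It is a minimal illustration under stated assumptions: the class DesignMatrixSketch and its build method are hypothetical names, and the library may store X differently (it exposes it as a RealMatrix).

```java
/** Illustrative only: shows where the column of 1s goes, not the library's code. */
final class DesignMatrixSketch {

    /**
     * Copies x into a new matrix, adding a leading column of 1s
     * when the model includes an intercept term.
     */
    static double[][] build(double[][] x, boolean noIntercept) {
        int offset = noIntercept ? 0 : 1;
        double[][] design = new double[x.length][];
        for (int i = 0; i < x.length; i++) {
            if (x[i].length != x[0].length) {
                // Stand-in for the documented "not rectangular" failure.
                throw new IllegalArgumentException("x is not rectangular");
            }
            design[i] = new double[x[i].length + offset];
            if (!noIntercept) {
                design[i][0] = 1.0; // intercept column
            }
            System.arraycopy(x[i], 0, design[i], offset, x[i].length);
        }
        return design;
    }
}
```

      For x = {{1, 2}, {3, 4}, {5, 6}}, build(x, false) yields rows (1, 1, 2), (1, 3, 4) and (1, 5, 6), while build(x, true) yields a copy of x.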
    • validateSampleData

      protected void validateSampleData(double[][] x, double[] y) throws MathIllegalArgumentException
      Validates sample data.

      Checks that

      • Neither x nor y is null or empty;
      • The length (i.e. number of rows) of x equals the length of y;
      • x has at least one more row than it has columns (i.e. there is sufficient data to estimate regression coefficients for each of the columns in x plus an intercept).
      Parameters:
      x - the [n,k] array representing the x data
      y - the [n,1] array representing the y data
      Throws:
      NullArgumentException - if x or y is null
      MathIllegalArgumentException - if x and y do not have the same length
      MathIllegalArgumentException - if x or y is zero-length
      MathIllegalArgumentException - if x has fewer rows than the number of columns + 1 (model with an intercept), or fewer rows than the number of columns (model without an intercept term)
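
      The checks listed above can be sketched as follows. This is a hedged illustration, not the library's implementation: the class SampleDataChecks is hypothetical, standard JDK exceptions stand in for NullArgumentException and MathIllegalArgumentException, and the row-count test assumes the reading "at least one more row than columns" when an intercept is present (and at least as many rows as columns otherwise).

```java
/** Illustrative only: standard JDK exceptions stand in for the library's types. */
final class SampleDataChecks {

    /** Throws if the sample violates one of the documented conditions. */
    static void validate(double[][] x, double[] y, boolean noIntercept) {
        if (x == null || y == null) {
            throw new NullPointerException("x and y must not be null");
        }
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        if (x.length == 0) {
            throw new IllegalArgumentException("x and y must not be empty");
        }
        // One coefficient per column, plus one for the intercept when present
        // (assumed interpretation of the documented sufficiency condition).
        int coefficients = x[0].length + (noIntercept ? 0 : 1);
        if (x.length < coefficients) {
            throw new IllegalArgumentException("not enough observations for the number of coefficients");
        }
    }

    /** Convenience wrapper for callers that prefer a boolean answer. */
    static boolean isValid(double[][] x, double[] y, boolean noIntercept) {
        try {
            validate(x, y, noIntercept);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }
}
```

      Under these assumptions, two observations of two variables are rejected for a model with an intercept but accepted when noIntercept is true.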
    • validateCovarianceData

      protected void validateCovarianceData(double[][] x, double[][] covariance)
      Validates that the x data and covariance matrix have the same number of rows and that the covariance matrix is square.
      Parameters:
      x - the [n,k] array representing the x sample
      covariance - the [n,n] array representing the covariance matrix
      Throws:
      MathIllegalArgumentException - if the number of rows in x is not equal to the number of rows in covariance
      MathIllegalArgumentException - if the covariance matrix is not square
    • estimateRegressionParameters

      public double[] estimateRegressionParameters()
      Estimates the regression parameters b.
      Specified by:
      estimateRegressionParameters in interface MultipleLinearRegression
      Returns:
      The [k,1] array representing b
    • estimateResiduals

      public double[] estimateResiduals()
      Estimates the residuals, i.e. u = y - X*b.
      Specified by:
      estimateResiduals in interface MultipleLinearRegression
      Returns:
      The [n,1] array representing the residuals
    • estimateRegressionParametersVariance

      public double[][] estimateRegressionParametersVariance()
      Estimates the variance of the regression parameters, i.e. Var(b).
      Specified by:
      estimateRegressionParametersVariance in interface MultipleLinearRegression
      Returns:
      The [k,k] array representing the variance of b
    • estimateRegressionParametersStandardErrors

      public double[] estimateRegressionParametersStandardErrors()
      Returns the standard errors of the regression parameters.
      Specified by:
      estimateRegressionParametersStandardErrors in interface MultipleLinearRegression
      Returns:
      standard errors of estimated regression parameters
    • estimateRegressandVariance

      public double estimateRegressandVariance()
      Returns the variance of the regressand, i.e. Var(y).
      Specified by:
      estimateRegressandVariance in interface MultipleLinearRegression
      Returns:
      The double representing the variance of y
    • estimateErrorVariance

      public double estimateErrorVariance()
      Estimates the variance of the error.
      Returns:
      estimate of the error variance
    • estimateRegressionStandardError

      public double estimateRegressionStandardError()
      Estimates the standard error of the regression.
      Returns:
      regression standard error
    • calculateBeta

      protected abstract RealVector calculateBeta()
      Calculates beta, the vector of regression coefficients of the multiple linear regression, in matrix notation.
      Returns:
      beta
    • calculateBetaVariance

      protected abstract RealMatrix calculateBetaVariance()
      Calculates the variance of beta for the multiple linear regression, in matrix notation.
      Returns:
      beta variance
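
      The split between public estimate methods and abstract calculate hooks resembles the template-method pattern. The sketch below shows one plausible reading of that relationship with a deliberately tiny model (one regressor, no intercept). All names except estimateRegressionParameters and calculateBeta are hypothetical, and the delegation shown is an assumption about the design, not a statement of the library's code.

```java
/** Hypothetical stand-in for an abstract regression base class. */
abstract class RegressionTemplateSketch {
    protected final double[] x; // single regressor, to keep the sketch small
    protected final double[] y;

    RegressionTemplateSketch(double[] x, double[] y) {
        this.x = x.clone();
        this.y = y.clone();
    }

    /** Public entry point: delegates the estimation to the subclass hook. */
    public final double[] estimateRegressionParameters() {
        return calculateBeta();
    }

    /** Hook that each concrete estimator supplies. */
    protected abstract double[] calculateBeta();
}

/** Least squares through the origin: b = sum(x * y) / sum(x * x). */
final class ThroughOriginLeastSquares extends RegressionTemplateSketch {

    ThroughOriginLeastSquares(double[] x, double[] y) {
        super(x, y);
    }

    @Override
    protected double[] calculateBeta() {
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < x.length; i++) {
            sxy += x[i] * y[i];
            sxx += x[i] * x[i];
        }
        return new double[] {sxy / sxx};
    }
}
```

      For x = (1, 2, 3) and y = (2, 4, 6), the sketch returns the single coefficient 2.0. Concrete subclasses such as OLSMultipleLinearRegression and GLSMultipleLinearRegression supply their own estimators through the same kind of hook.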
    • calculateYVariance

      protected double calculateYVariance()
      Calculates the variance of the y values.
      Returns:
      Y variance
    • calculateErrorVariance

      protected double calculateErrorVariance()

      Calculates the variance of the error term.

      Uses the formula
       var(u) = u · u / (n - k)
       
      where n and k are the row and column dimensions of the design matrix X.
      Returns:
      error variance estimate
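
      The formula above translates directly into a few lines of plain Java. This is a minimal sketch on raw arrays, assuming the residuals u have already been computed and that k counts every column of the design matrix X (including the intercept column when present); the class ErrorVarianceSketch is a hypothetical name.

```java
/** Illustrative only: var(u) = u . u / (n - k) on a plain array of residuals. */
final class ErrorVarianceSketch {

    /**
     * @param residuals the n residuals u
     * @param k number of columns of the design matrix X
     */
    static double errorVariance(double[] residuals, int k) {
        int n = residuals.length;
        if (n <= k) {
            throw new IllegalArgumentException("need more observations than coefficients");
        }
        double sumOfSquares = 0.0;
        for (double u : residuals) {
            sumOfSquares += u * u; // accumulates the dot product u . u
        }
        return sumOfSquares / (n - k);
    }
}
```

      For residuals (1, -1, 2, -2) and k = 2, the dot product is 10 and n - k is 2, so the sketch returns 5.0.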
    • calculateResiduals

      protected RealVector calculateResiduals()
      Calculates the residuals of multiple linear regression in matrix notation.
       u = y - X * b
       
      Returns:
      The residuals [n,1] matrix
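
      As a worked illustration of u = y - X * b, the sketch below computes the residuals on plain arrays. It is not the library's code, which works with RealMatrix and RealVector; the class ResidualsSketch is a hypothetical name, and x is assumed to be the full design matrix (including the column of 1s when the model has an intercept).

```java
/** Illustrative only: residuals u = y - X * b on plain arrays. */
final class ResidualsSketch {

    static double[] residuals(double[][] x, double[] y, double[] b) {
        double[] u = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            double fitted = 0.0;
            for (int j = 0; j < b.length; j++) {
                fitted += x[i][j] * b[j]; // row i of X times b
            }
            u[i] = y[i] - fitted;
        }
        return u;
    }
}
```

      With X rows (1, 1), (1, 2), (1, 3), b = (1, 2) and y = (3, 5, 8), the fitted values are 3, 5 and 7, so the residuals are (0, 0, 1).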