Linear regression parametric Bootstrap | Vose Software

Linear regression parametric Bootstrap

See also: The Bootstrap, Analyzing and using data introduction, The parametric Bootstrap, The non-parametric Bootstrap, VoseNBoot

There are two types of observations for which we can apply linear least squares regression:

  A. We are making random observations of X and Y together

  B. We are testing at different specific values of X to determine the response in Y

For type A, X and Y are Bootstrapped together in pairs, and the regression coefficients are calculated for each Bootstrap replicate. For paired bootstrapping, use the ModelRisk function VoseNBootPaired.
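The paired resampling described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the ModelRisk implementation: the names `fit_line` and `paired_bootstrap` are hypothetical, and resamples in which every x-value is identical are skipped because the slope is undefined for them.

```python
import random


def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def paired_bootstrap(xs, ys, n_replicates=1000, seed=0):
    """Resample (x, y) pairs with replacement and refit the line each time.

    Hypothetical helper: returns a list of (slope, intercept) tuples,
    one per Bootstrap replicate.
    """
    rng = random.Random(seed)
    pairs = list(zip(xs, ys))
    replicates = []
    while len(replicates) < n_replicates:
        # Draw n pairs with replacement, keeping each x with its own y.
        sample = rng.choices(pairs, k=len(pairs))
        boot_xs, boot_ys = zip(*sample)
        if len(set(boot_xs)) < 2:
            continue  # all x-values equal: slope undefined, so redraw
        replicates.append(fit_line(boot_xs, boot_ys))
    return replicates
```

The spread of the returned slopes and intercepts then approximates the uncertainty about the regression coefficients.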

For type B, we need to retain the pre-selected nature of the X variable. The X values are fixed, since they were predetermined rather than drawn as a random sample from a distribution, so the only random element is the response of Y at each X value, and it is this random component that should be Bootstrapped. Assuming that the straight-line relationship is correct and that the random variations about the regression line are homoscedastic (of constant variance at every X), we therefore Bootstrap the residuals about that line. If we know the residuals are Normally distributed, we can use the parametric Bootstrap model, as follows:

1. Determine Syx, the standard deviation of the residuals about the least squares regression line for the original data set.

2. For each of the x-values in the data set, randomly sample a y-value from a Normal(m·x + c, Syx) distribution, where m and c are the least squares regression coefficients (slope and intercept) for the original data set.

3. Determine the least squares regression coefficients for this Bootstrap sample.

4. Repeat steps 2 and 3 for B iterations.
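The four steps above can be sketched as follows. This is a minimal illustration rather than the ModelRisk implementation: the function names are hypothetical, and Syx is assumed to use the usual n - 2 divisor for the residual standard deviation.

```python
import math
import random


def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_sd(xs, ys, slope, intercept):
    """Syx: residual standard deviation (assumes an n - 2 divisor)."""
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))


def parametric_bootstrap(xs, ys, n_iterations=1000, seed=0):
    """Return a list of (slope, intercept) tuples, one per Bootstrap iteration."""
    rng = random.Random(seed)
    slope, intercept = fit_line(xs, ys)
    syx = residual_sd(xs, ys, slope, intercept)  # step 1
    replicates = []
    for _ in range(n_iterations):  # step 4
        # Step 2: keep the x-values fixed and draw a new y at each one
        # from a Normal centred on the fitted line with spread Syx.
        boot_ys = [rng.gauss(slope * x + intercept, syx) for x in xs]
        # Step 3: refit the line to the Bootstrap sample.
        replicates.append(fit_line(xs, boot_ys))
    return replicates
```

Note that the x-values are never resampled, which is what distinguishes this procedure from the paired Bootstrap used for type A data.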

Although the parametric procedure works quite well, it relies on all the assumptions of the linear regression model of the relationship between X and Y, for which classical statistics has already derived the uncertainty distributions. It would therefore be better to use the classical statistics formulae, which offer exact answers under these conditions.
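For comparison, the classical results can be computed directly. The sketch below assumes the standard textbook formulae for the standard errors of the slope and intercept, each of which scales a Student-t distribution with n - 2 degrees of freedom; the function name and the returned dictionary layout are hypothetical.

```python
import math


def classical_standard_errors(xs, ys):
    """Least squares fit plus classical standard errors of the coefficients."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual standard deviation Syx with n - 2 degrees of freedom.
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    syx = math.sqrt(sse / (n - 2))
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": syx / math.sqrt(sxx),
        "se_intercept": syx * math.sqrt(1.0 / n + mean_x ** 2 / sxx),
        "degrees_of_freedom": n - 2,
    }
```

With enough iterations, the standard deviations of the parametric Bootstrap coefficients should come out close to these standard errors, which is why the Bootstrap adds little when the Normality assumptions hold.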