Goal
Write a mathematical model that describes the relationship between two variables $x$ and $y$.
Setup
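Given observations $(x_i, y_i)$, $i = 1, \dots, n$, consider a model of the form
$$y_i = \alpha + \beta x_i + \varepsilon_i,$$
where $\varepsilon_i$ is the random part of the model. The only assumption is that the mean of the $\varepsilon_i$'s is $0$. The aim is to find estimates $\hat{\alpha}$ and $\hat{\beta}$ for $\alpha$ and $\beta$.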
Comments
- The model assumes that the $x_i$'s are known exactly and that the error terms appear only in the $y_i$'s.
- Note that $y_i$ is the actual value and that $\hat{y}_i = \alpha + \beta x_i$ is the predicted value, so $e_i = y_i - \hat{y}_i$ is the $i$th residual.
The function to minimize
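The residual sum of squares function, denoted $\mathrm{RSS}$, is
$$\mathrm{RSS}(\alpha, \beta) = \sum_{i=1}^{n} e_i^2,$$
or equivalently
$$\mathrm{RSS}(\alpha, \beta) = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2.$$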
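As a rough illustration, the residual sum of squares can be evaluated directly for any candidate intercept and slope. The Python sketch below is not part of the original derivation; the function name `rss` and the sample data are assumptions made for this example.

```python
def rss(alpha, beta, xs, ys):
    """Residual sum of squares for the line y = alpha + beta * x."""
    return sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

flat_line = rss(4.0, 0.0, xs, ys)    # horizontal line at the mean of the y values
tilted_line = rss(2.2, 0.6, xs, ys)  # a candidate line with positive slope
print(flat_line, tilted_line)
```

On this data the tilted line gives the smaller value, which is the sense in which it fits better.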
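Solving for $\alpha$
Differentiating $\mathrm{RSS}$ with respect to $\alpha$ gives
$$\frac{\partial\,\mathrm{RSS}}{\partial \alpha} = -2 \sum_{i=1}^{n} (y_i - \alpha - \beta x_i), \tag{1}$$
and setting $\partial\,\mathrm{RSS}/\partial \alpha$ to zero yields
$$\bar{y} = \alpha + \beta \bar{x},$$
where $(\bar{x}, \bar{y}) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i,\ \frac{1}{n}\sum_{i=1}^{n} y_i\right)$ is the centroid of the observations.
Hence the estimate for $\alpha$ is
$$\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}. \tag{2}$$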
Comments
- Note that the right-hand side of equation (1) is equivalent to $-2\sum_{i=1}^{n} e_i$, so setting $\partial\,\mathrm{RSS}/\partial \alpha$ to $0$ implies that
$$\sum_{i=1}^{n} e_i = 0. \tag{3}$$
- Note that this is the sample counterpart of the assumption that the mean of the $\varepsilon_i$'s is $0$. So, in some sense, this assumption follows from the given model.
- When you plot the observations $(x_i, y_i)$, $i = 1, \dots, n$, in the $xy$-plane, the line $y = \hat{\alpha} + \hat{\beta} x$ passes through the centroid $(\bar{x}, \bar{y})$ of the observations.
- It can be useful to think of the centroid as a fixed fulcrum and to think of the line as a lever moving on this fulcrum. To completely determine the line, you’d need to find an estimate for $\beta$, which is the slope of the line.
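The lever picture can be checked numerically. In the Python sketch below, the intercept is tied to the slope through the centroid (intercept = mean of y minus slope times mean of x), and the slope is then varied freely. The sample data and the helper names `intercept_for` and `residual_sum` are assumptions made for this example.

```python
# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

def intercept_for(slope):
    """Intercept that forces the line through the centroid (x_bar, y_bar)."""
    return y_bar - slope * x_bar

def residual_sum(slope):
    """Sum of the residuals for the line with this slope through the centroid."""
    a = intercept_for(slope)
    return sum(y - (a + slope * x) for x, y in zip(xs, ys))

for slope in (-1.0, 0.0, 0.6, 2.0):
    height_at_x_bar = intercept_for(slope) + slope * x_bar
    print(slope, height_at_x_bar, residual_sum(slope))
```

Whatever the slope, the line takes the value of the mean of y at the mean of x, and the residuals sum to zero up to rounding. Only the slope remains to be determined.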
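Solving for $\beta$
Differentiating $\mathrm{RSS}$ with respect to $\beta$ gives
$$\frac{\partial\,\mathrm{RSS}}{\partial \beta} = -2 \sum_{i=1}^{n} x_i (y_i - \alpha - \beta x_i), \tag{4}$$
and setting $\partial\,\mathrm{RSS}/\partial \beta$ to zero yields
$$\sum_{i=1}^{n} x_i (y_i - \alpha - \beta x_i) = 0.$$
Using (2) yields
$$\sum_{i=1}^{n} x_i (y_i - \bar{y} + \beta \bar{x} - \beta x_i) = 0$$
or
$$\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \beta \sum_{i=1}^{n} x_i (x_i - \bar{x})$$
or
$$\beta = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})}$$
or, after dividing both numerator and denominator by $n$,
$$\beta = \frac{\frac{1}{n}\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\frac{1}{n}\sum_{i=1}^{n} x_i (x_i - \bar{x})}.$$
Note that the numerator of the fraction in the previous expression is $\mathrm{cov}(x, y)$ and the denominator is $\mathrm{var}(x)$, as shown by the following computations, which use $\sum_{i=1}^{n} (y_i - \bar{y}) = 0$ and $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$.
$$\mathrm{cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n}\sum_{i=1}^{n} x_i (y_i - \bar{y}) - \frac{\bar{x}}{n}\sum_{i=1}^{n} (y_i - \bar{y}) = \frac{1}{n}\sum_{i=1}^{n} x_i (y_i - \bar{y})$$
$$\mathrm{var}(x) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n} x_i (x_i - \bar{x}) - \frac{\bar{x}}{n}\sum_{i=1}^{n} (x_i - \bar{x}) = \frac{1}{n}\sum_{i=1}^{n} x_i (x_i - \bar{x})$$
Therefore
$$\beta = \frac{\mathrm{cov}(x, y)}{\mathrm{var}(x)}.$$
Using $\mathrm{var}(x) = s_x^2$ and $\mathrm{cov}(x, y) = r\, s_x s_y$, where $s_x$ is the standard deviation of the $x_i$'s, $s_y$ is the standard deviation of the $y_i$'s, and $r$ is the correlation between the $x_i$'s and the $y_i$'s, we get
$$\beta = \frac{r\, s_x s_y}{s_x^2} = r\,\frac{s_y}{s_x}.$$
Hence the estimate for $\beta$ is
$$\hat{\beta} = r\,\frac{s_y}{s_x}. \tag{5}$$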
Comments
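- Note that the right-hand side of equation (4) is equivalent to $-2\sum_{i=1}^{n} x_i e_i$, so setting $\partial\,\mathrm{RSS}/\partial \beta$ to $0$ implies that
$$\sum_{i=1}^{n} x_i e_i = 0. \tag{6}$$
- Note that the population factor $1/n$ has been used in all formulas (for example, for covariance, standard deviation, etc.).
- Note that $\hat{\beta}$ is proportional to the correlation $r$.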
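The two estimates can be computed in a few lines. The Python sketch below uses the population ($1/n$) versions of covariance and standard deviation, computes the slope both as covariance over variance and as correlation times the ratio of standard deviations, and checks that the residuals sum to zero and are orthogonal to the x values. The sample data and variable names are assumptions made for this example.

```python
import math

# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Population (1/n) covariance, standard deviations, and correlation.
cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
r = cov_xy / (s_x * s_y)

beta_from_cov = cov_xy / s_x ** 2      # slope as cov(x, y) / var(x)
beta_hat = r * s_y / s_x               # slope as r * s_y / s_x
alpha_hat = y_bar - beta_hat * x_bar   # intercept from the centroid condition

residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(xs, ys)]
sum_e = sum(residuals)
sum_xe = sum(x * e for x, e in zip(xs, residuals))
print(alpha_hat, beta_hat, sum_e, sum_xe)
```

Both expressions for the slope agree, and the two sums vanish up to floating-point rounding.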
The critical point is a local minimum
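In order to conclude that the critical point $(\hat{\alpha}, \hat{\beta})$ is a local minimum for $\mathrm{RSS}$, it is sufficient to show that the Hessian of $\mathrm{RSS}$ (that is, the Jacobian of the gradient $\nabla \mathrm{RSS}$) at $(\hat{\alpha}, \hat{\beta})$ is a positive definite matrix.
From (1) it follows that $\frac{\partial^2 \mathrm{RSS}}{\partial \alpha^2} = 2n$ and $\frac{\partial^2 \mathrm{RSS}}{\partial \beta\, \partial \alpha} = 2\sum_{i=1}^{n} x_i = 2n\bar{x}$, and from (4) it follows that $\frac{\partial^2 \mathrm{RSS}}{\partial \beta^2} = 2\sum_{i=1}^{n} x_i^2$. Therefore the Hessian of $\mathrm{RSS}$ at $(\hat{\alpha}, \hat{\beta})$ is the matrix
$$H = \begin{pmatrix} 2n & 2n\bar{x} \\ 2n\bar{x} & 2\sum_{i=1}^{n} x_i^2 \end{pmatrix}.$$
The matrix is positive definite if and only if two conditions are satisfied: (i) the (1,1) entry of the matrix is positive; and (ii) the determinant of the matrix is positive. Condition (i) is satisfied because the (1,1) entry of $H$ is $2n$, which is positive. Condition (ii) is also satisfied because the determinant is $4n\sum_{i=1}^{n} x_i^2 - 4n^2\bar{x}^2$, which is equivalent to $4n^2 s_x^2$, which is positive provided the $x_i$'s are not all equal.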
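The positive-definiteness check is easy to reproduce numerically. The Python sketch below builds the three second partial derivatives of the residual sum of squares ($2n$, $2\sum x_i$, and $2\sum x_i^2$) for a hypothetical set of x values and compares the determinant with $4n^2$ times the population variance of the x values. The data and variable names are assumptions made for this example.

```python
# Hypothetical x values, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
var_x = sum((x - x_bar) ** 2 for x in xs) / n  # population variance

# Second partial derivatives of RSS; they depend only on the x values.
h_aa = 2 * n                       # d^2 RSS / d alpha^2
h_ab = 2 * sum(xs)                 # d^2 RSS / d alpha d beta
h_bb = 2 * sum(x * x for x in xs)  # d^2 RSS / d beta^2

det = h_aa * h_bb - h_ab ** 2
is_positive_definite = h_aa > 0 and det > 0
print(h_aa, det, 4 * n ** 2 * var_x, is_positive_definite)
```

Because these second derivatives do not depend on the intercept or the slope, the same matrix applies at every point, not only at the critical point.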
The model
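Using (2) and (5), the model can be written as
$$\hat{y} = \bar{y} + r\,\frac{s_y}{s_x}\,(x - \bar{x}) \tag{7}$$
or
$$\frac{\hat{y} - \bar{y}}{s_y} = r\,\frac{x - \bar{x}}{s_x}.$$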
Regression to the mean
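From either form of the model, $\hat{y} = \bar{y} + r\,\frac{s_y}{s_x}(x - \bar{x})$ or $\frac{\hat{y} - \bar{y}}{s_y} = r\,\frac{x - \bar{x}}{s_x}$, and the fact that $|r| \le 1$, it follows that
$$\left|\frac{\hat{y} - \bar{y}}{s_y}\right| \le \left|\frac{x - \bar{x}}{s_x}\right|.$$
This inequality is the essence of the phenomenon that is commonly known as regression to the mean: measured in standard deviations, the predicted value is never farther from its mean than $x$ is from its own.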
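To make the inequality concrete, the Python sketch below standardizes each x value and the corresponding prediction and lists them side by side. The sample data and the helper name `predict` are assumptions made for this example.

```python
import math

# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n * s_x * s_y)

def predict(x):
    """Fitted value from the line through the centroid with slope r * s_y / s_x."""
    return y_bar + r * (s_y / s_x) * (x - x_bar)

rows = []
for x in xs:
    z_x = (x - x_bar) / s_x              # x in standard units
    z_pred = (predict(x) - y_bar) / s_y  # prediction in standard units
    rows.append((z_x, z_pred))
    print(f"{z_x:+.3f} {z_pred:+.3f}")
```

In every row the second number is the first one scaled by the correlation, so it is closer to zero whenever the correlation is strictly between $-1$ and $1$.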
Summing up
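Mean squared error (MSE)
$$\mathrm{MSE} = \frac{1}{n}\,\mathrm{RSS} = \frac{1}{n}\sum_{i=1}^{n} e_i^2$$
Sums of squares (residual, explainable, total)
$$\mathrm{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad \mathrm{ESS} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \qquad \mathrm{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
Theorem
$$\mathrm{TSS} = \mathrm{RSS} + \mathrm{ESS}$$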
The geometric interpretation of the theorem
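In $\mathbb{R}^n$, consider the plane $P$ spanned by $\mathbf{1} = (1, \dots, 1)$ and $\mathbf{x} = (x_1, \dots, x_n)$. Let $A$ be the point $(\bar{y}, \dots, \bar{y})$, which lies on the line generated by $\mathbf{1}$ in $P$. If $B$ is the point $(\hat{y}_1, \dots, \hat{y}_n)$ and $C$ is the point $(y_1, \dots, y_n)$, it follows that $B$ is the projection of the point $C$ on $P$: the point $B = \hat{\alpha}\mathbf{1} + \hat{\beta}\mathbf{x}$ lies in $P$, and by equations (3) and (6) the residual vector from $B$ to $C$ is orthogonal to both $\mathbf{1}$ and $\mathbf{x}$. The vector from $B$ to $C$ is perpendicular to the vector from $A$ to $B$, so the triangle with vertices $A$, $B$, and $C$ is a right triangle with a right angle at $B$. The theorem is equivalent to the Pythagorean theorem applied to the right triangle $ABC$, namely $|AC|^2 = |BC|^2 + |AB|^2$.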
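Proof of TSS = RSS + ESS
Recall that $\sum_{i=1}^{n} e_i = 0$ (equation 3) and $\sum_{i=1}^{n} x_i e_i = 0$ (equation 6).
$$\mathrm{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} \big(e_i + (\hat{y}_i - \bar{y})\big)^2 = \sum_{i=1}^{n} e_i^2 + \boxed{2\sum_{i=1}^{n} e_i (\hat{y}_i - \bar{y})} + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 = \mathrm{RSS} + \boxed{2\sum_{i=1}^{n} e_i (\hat{y}_i - \bar{y})} + \mathrm{ESS}$$
To complete the proof, it’s sufficient to prove that the boxed expression is $0$.
$$2\sum_{i=1}^{n} e_i (\hat{y}_i - \bar{y}) = 2\sum_{i=1}^{n} e_i (\hat{\alpha} + \hat{\beta} x_i - \bar{y}) = 2(\hat{\alpha} - \bar{y})\sum_{i=1}^{n} e_i + 2\hat{\beta}\sum_{i=1}^{n} x_i e_i = 0$$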
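The decomposition can be verified on a small data set. The Python sketch below fits the least-squares line, computes the three sums of squares, and also evaluates the cross term that the proof shows to be zero. The sample data and variable names are assumptions made for this example.

```python
# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept (the 1/n factors cancel in the slope).
beta_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
            / sum((x - x_bar) ** 2 for x in xs))
alpha_hat = y_bar - beta_hat * x_bar
fitted = [alpha_hat + beta_hat * x for x in xs]

tss = sum((y - y_bar) ** 2 for y in ys)                # total
rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # residual
ess = sum((f - y_bar) ** 2 for f in fitted)            # explainable
cross = 2 * sum((y - f) * (f - y_bar) for y, f in zip(ys, fitted))
print(tss, rss, ess, cross)
```

The cross term vanishes only at the least-squares estimates; for an arbitrary line the three sums of squares do not add up in this way.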
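Definition
The coefficient of determination is $R^2 = \dfrac{\mathrm{ESS}}{\mathrm{TSS}}$.
Theorem
$$R^2 = r^2$$
Proof
By definition, $R^2 = \dfrac{\mathrm{ESS}}{\mathrm{TSS}}$. Since $\hat{y}_i - \bar{y} = \hat{\beta}(x_i - \bar{x})$ with $\hat{\beta} = r\,\dfrac{s_y}{s_x}$,
$$\mathrm{ESS} = \hat{\beta}^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 = r^2\,\frac{s_y^2}{s_x^2}\cdot n s_x^2 = n r^2 s_y^2 \qquad\text{and}\qquad \mathrm{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = n s_y^2.$$
Therefore, $R^2 = \dfrac{n r^2 s_y^2}{n s_y^2} = r^2$.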
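As a numerical sanity check, the Python sketch below compares the ratio of the explainable sum of squares to the total sum of squares with the squared correlation between the x and y values. It takes that ratio as the definition of the coefficient of determination, which is an assumption of this example, as are the sample data and variable names.

```python
import math

# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Centered sums; the 1/n factors cancel in every ratio below.
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)   # this is TSS
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar
fitted = [alpha_hat + beta_hat * x for x in xs]

ess = sum((f - y_bar) ** 2 for f in fitted)
r_squared_from_sums = ess / s_yy           # ESS / TSS
r = s_xy / math.sqrt(s_xx * s_yy)          # correlation between x and y
print(r_squared_from_sums, r ** 2)
```

The two printed values agree up to floating-point rounding.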
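Theorem
$$\mathrm{corr}(y, \hat{y}) = |r|$$
Steps of the proof (assuming $r \neq 0$):
- Show that the mean of $x - \bar{x}$ is $0$.
- Use equation (7) to show that the mean of $\hat{y} - \bar{y}$ is $0$, which implies that the mean of $\hat{y}$ is $\bar{y}$.
- Use the bilinearity of covariance and equation (7) to show that $\mathrm{cov}(y, \hat{y}) = r\,\frac{s_y}{s_x}\,\mathrm{cov}(x, y) = r^2 s_y^2$.
- Use the bilinearity of covariance to show that $\mathrm{var}(\hat{y}) = r^2\,\frac{s_y^2}{s_x^2}\,\mathrm{var}(x) = r^2 s_y^2$.
- Use the definition of correlation in terms of covariance and variance to conclude that $\mathrm{corr}(y, \hat{y}) = \dfrac{r^2 s_y^2}{s_y \cdot |r|\, s_y} = |r|$.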
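The steps above can be traced numerically. The Python sketch below uses a data set with negative correlation, so that the absolute value matters, and checks that the correlation between the observed and fitted y values equals the absolute value of the correlation between the x and y values. The sample data and the helper name `corr` are assumptions made for this example.

```python
import math

def corr(us, vs):
    """Correlation of two equal-length sequences (the 1/n factors cancel)."""
    m = len(us)
    u_bar = sum(us) / m
    v_bar = sum(vs) / m
    s_uv = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
    s_uu = sum((u - u_bar) ** 2 for u in us)
    s_vv = sum((v - v_bar) ** 2 for v in vs)
    return s_uv / math.sqrt(s_uu * s_vv)

# Hypothetical sample data with a downward trend, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [5, 4, 5, 4, 2]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

beta_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
            / sum((x - x_bar) ** 2 for x in xs))
alpha_hat = y_bar - beta_hat * x_bar
fitted = [alpha_hat + beta_hat * x for x in xs]

r = corr(xs, ys)
mean_fitted = sum(fitted) / n
corr_y_fitted = corr(ys, fitted)
print(r, mean_fitted, corr_y_fitted)
```

Here the correlation between x and y is negative, while the correlation between the observed and fitted values is positive with the same magnitude.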
Reference
Ordinary least squares, https://en.wikipedia.org/wiki/Ordinary_least_squares