Residual Sum of Squares

RSS - Error

Residual Sum of Squares (RSS) is a central quantity in regression. Strictly speaking, "error" and "residual" are different things: the error is a random variable (the deviation of an observation from the true regression function), while the residual $y - \hat{y}$ is a fixed quantity once the model has been fitted.

$$
f(X) = \beta_0 + X_1\beta_1 + X_2\beta_2
$$

$$
RSS(\beta) = (\mathbf{y} - \mathbf{X}\beta)^T(\mathbf{y} - \mathbf{X}\beta)
$$

$$
\frac{\partial RSS}{\partial \beta} = -2\mathbf{X}^T(\mathbf{y} - \mathbf{X}\beta), \qquad
\frac{\partial^2 RSS}{\partial \beta\,\partial \beta^T} = 2\mathbf{X}^T\mathbf{X}
$$

Setting the first derivative to zero gives the least-squares solution:

$$
\mathbf{X}^T(\mathbf{y} - \mathbf{X}\hat{\beta}) = 0, \qquad
\hat{\beta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}, \qquad
\hat{\mathbf{y}} = \mathbf{X}\hat{\beta} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y} = \mathbf{H}\mathbf{y}
$$
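As a quick numerical check of the closed-form solution above, here is a minimal NumPy sketch (the toy data, seed, and coefficient values are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n observations, an intercept column plus two predictors.
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Closed-form least-squares estimate: beta_hat = (X^T X)^{-1} X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted values via the hat matrix: y_hat = H y, with H = X (X^T X)^{-1} X^T
H = X @ np.linalg.solve(X.T @ X, X.T)
y_hat = H @ y

print(beta_hat)                          # close to beta_true
print(np.allclose(y_hat, X @ beta_hat))  # True
```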

👀 Geometrical view

$\hat{\mathbf{y}}$ is the projection of $\mathbf{y}$ onto the column space of $\mathbf{X}$. This is because $\mathbf{H}$ is a projection matrix: it is symmetric ($\mathbf{H}^T = \mathbf{H}$) and idempotent ($\mathbf{H}^2 = \mathbf{H}$). $\mathbf{H}$ is called the hat matrix (it puts the hat on $\mathbf{y}$).
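Continuing the sketch above (reusing `X`, `y`, `H`, and `y_hat`), both projection-matrix properties can be verified numerically:

```python
# H is symmetric: H^T = H
print(np.allclose(H, H.T))    # True

# H is idempotent: H H = H (projecting twice changes nothing)
print(np.allclose(H @ H, H))  # True
```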

| Q | A (assuming $\hat{\beta} = \hat{\beta}^{LS}$) |
| --- | --- |
| What | $\mathbf{y}$ |
| Where | The column space of $\mathbf{X}$ |
| How | By projection |

$\varepsilon \perp x_i$, because $\epsilon = \mathbf{y} - \hat{\mathbf{y}}$ and, under least squares, $\hat{\mathbf{y}}$ is the orthogonal projection of $\mathbf{y}$. If we estimate $\beta$ by some method other than least squares, the form $\hat{y} = \beta_0 + X_1\beta_1 + X_2\beta_2$ still remains, and $\hat{\mathbf{y}}$ is still interpreted as a vector in $\mathrm{col}(\mathbf{X})$. However, in that case $\hat{\mathbf{y}}$ is not the projection of $\mathbf{y}$, so the residual is not orthogonal to the predictors.
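To make this contrast concrete, here is a small sketch reusing the toy `X`, `y`, and `beta_hat` from above; ridge regression (with an arbitrary penalty) stands in for "any estimator other than least squares":

```python
# Least squares: the residual is orthogonal to every column of X.
res_ls = y - X @ beta_hat
print(X.T @ res_ls)  # ~ 0 (up to floating-point error)

# A different estimator, e.g. ridge with an illustrative penalty lambda = 5.0:
lam = 5.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# X @ beta_ridge still lies in col(X), but it is not the projection of y,
# so the residual is no longer orthogonal to the columns of X.
res_ridge = y - X @ beta_ridge
print(X.T @ res_ridge)  # generally nonzero
```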
