This may be a silly question, but I would really appreciate a serious answer. The disturbance covariance matrix can be constructed by multiplying the disturbance vector by its transpose. The diagonal then contains the variances, and the off-diagonal elements are the covariances between pairs of observations (right?). My question is: why do we get covariances this way, when the covariance formula asks for more, namely subtracting the means before multiplying? Is it because the expected value of the disturbance term is 0, so that the disturbances are already deviation scores (with the mean subtracted)? I am inclined to think so, but I don’t like the fact that the means that are subtracted are based on all the observations, whereas the “covariances” in the matrix each concern only a single pair of them. Am I missing something? Is it OK to base the expected values used in the covariance formula on more observations than are included in the calculation of a specific entry?
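
To make the question concrete, here is the identity I think is at work, written out as a sketch. The notation is my own assumption: $\varepsilon = (\varepsilon_1, \dots, \varepsilon_n)'$ is the disturbance vector, and I assume $E[\varepsilon_i] = 0$ for every $i$.

$$
\operatorname{Cov}(\varepsilon_i, \varepsilon_j)
= E\big[(\varepsilon_i - E[\varepsilon_i])(\varepsilon_j - E[\varepsilon_j])\big]
= E[\varepsilon_i \varepsilon_j]
\quad \text{when } E[\varepsilon_i] = E[\varepsilon_j] = 0,
$$

$$
E[\varepsilon \varepsilon'] =
\begin{pmatrix}
E[\varepsilon_1^2] & E[\varepsilon_1 \varepsilon_2] & \cdots & E[\varepsilon_1 \varepsilon_n] \\
E[\varepsilon_2 \varepsilon_1] & E[\varepsilon_2^2] & \cdots & E[\varepsilon_2 \varepsilon_n] \\
\vdots & \vdots & \ddots & \vdots \\
E[\varepsilon_n \varepsilon_1] & E[\varepsilon_n \varepsilon_2] & \cdots & E[\varepsilon_n^2]
\end{pmatrix}.
$$

If this reading is right, each entry involves only the expectations of the two disturbances in that entry, and those would be expectations of individual random variables rather than an average taken over all the observations in the sample. I am not sure that is the correct interpretation, which is really what I am asking.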