Variance issue in systematic sampling
Empirical approximation of error variance
To repeat: there is no design-unbiased variance estimator in systematic sampling. If we are interested in the true error variance, the only way to obtain it is to repeat the systematic sample very many times and calculate the variance of all the estimations produced; this is an empirical approximation to the parametric error variance, which comes the closer to the unknown true value the larger the number of repetitions is. Of course, this is not a viable approach in practice, but it can be done in computer simulations.
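A minimal simulation sketch of this idea is given below; the artificial population, its linear trend, the sample size and the number of repetitions are illustrative assumptions, not values from the text:

```python
# Sketch: empirically approximating the error variance of systematic sampling
# by repeating the sample many times (illustrative population and sizes).
import random

random.seed(42)

N, n = 1000, 50              # population size and sample size (assumed)
k = N // n                   # sampling interval
population = [10 + 0.05 * i + random.gauss(0, 2) for i in range(N)]
true_mean = sum(population) / N

def systematic_sample_mean(pop, interval):
    """Draw one systematic sample with a random start and return its mean."""
    start = random.randrange(interval)
    sample = pop[start::interval]
    return sum(sample) / len(sample)

# Repeat the systematic sample many times; the variance of the resulting
# estimates approximates the true error variance, the more closely
# the larger the number of repetitions.
repetitions = 10000
estimates = [systematic_sample_mean(population, k) for _ in range(repetitions)]
mean_est = sum(estimates) / repetitions
empirical_error_variance = sum((e - mean_est) ** 2 for e in estimates) / repetitions

print(f"true population mean:     {true_mean:.3f}")
print(f"empirical error variance: {empirical_error_variance:.5f}")
```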
Using SRS estimators
What is most frequently done for variance estimation in systematic sampling is to apply the estimators of the simple random sampling framework. These estimators are known not to be unbiased for systematic sampling; they consistently over-estimate the true error variance, and this positive bias can be considerable. We call this sort of estimation a “conservative estimation”: we know that the true error is less (in many cases much less) than the estimate that has been calculated. An example is given further down in chapter 6.2.2, where area estimation by dot grids is presented. The error variance of the estimated mean is then simply estimated by
\(s_{\bar y}^2=\frac{s^2}{n}\)
(essentially, because we do not know better …). This, however, is not an unbiased estimator but produces an overestimation of the true error variance.
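As a sketch, the SRS estimator above could be applied to a single systematic sample as follows; the population and sample sizes are the same illustrative assumptions as in the simulation sketch above:

```python
# Sketch: applying the SRS variance estimator s^2/n to one systematic sample
# (illustrative population and sizes; typically an over-estimate of the
# true error variance for such a trend population).
import random

random.seed(1)

N, n = 1000, 50
k = N // n
population = [10 + 0.05 * i + random.gauss(0, 2) for i in range(N)]

start = random.randrange(k)          # random start of the systematic sample
sample = population[start::k]

y_bar = sum(sample) / n
s2 = sum((y - y_bar) ** 2 for y in sample) / (n - 1)   # sample variance s^2
srs_error_variance = s2 / n                            # s_ybar^2 = s^2 / n

print(f"estimated mean:        {y_bar:.3f}")
print(f"SRS variance estimate: {srs_error_variance:.5f}")
```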
Random differences method
Numerous approximations have been developed to approximate the true error variance better than the simple random sampling estimator does. Two of the simpler ones are presented here, starting with the so-called “random differences method”.
Assume that the elements in the population, and therefore also the \(n\) elements in the systematic sample, have the same expected value. We may actually assume that because we have an unbiased estimator for the mean. If we (repeatedly) select random pairs out of the \(n\) elements of the systematic sample and calculate the difference for each pair, we expect the expected value of this difference to be zero:
Let \(d=Y_1-Y_2\); then \(E(d)=E(Y_1-Y_2)=E(Y_1)-E(Y_2)=\mu-\mu=0\).
The variance of the difference, \(var(d)=var(Y_1-Y_2)\), is then determined along the rules for linear combinations of random variables, as known from developing the estimators for stratified random sampling; because we select each of the two elements of a pair independently at random, the covariance term below becomes zero:
\[
\begin{aligned}
var(d) &= var(Y_1-Y_2)\\
&= var(Y_1)+var(Y_2)-2\,cov(Y_1,Y_2)\\
&= var(Y_1)+var(Y_2)\\
&= 2\sigma^2
\end{aligned}
\]
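One plausible reading of this procedure, as a sketch: estimate \(\sigma^2\) from the squared differences of randomly selected pairs (since \(E(d)=0\), we have \(E(d^2)=var(d)=2\sigma^2\)) and divide by \(n\) to approximate the error variance of the sample mean. The sample values and the number of pairs below are illustrative assumptions:

```python
# Sketch of the random differences idea: sigma^2 is estimated as mean(d^2)/2
# from random pairs drawn out of the systematic sample, and the error
# variance of the mean is approximated by sigma^2 / n.
import random

random.seed(7)

# illustrative observations from one systematic sample (assumed values)
sample = [12.1, 9.8, 11.4, 10.6, 13.0, 9.2, 10.9, 11.7, 12.5, 10.1]
n = len(sample)

pairs = 1000
squared_diffs = []
for _ in range(pairs):
    y1, y2 = random.sample(sample, 2)   # a random pair of distinct elements
    squared_diffs.append((y1 - y2) ** 2)

# E(d) = 0, so E(d^2) = var(d) = 2*sigma^2
sigma2_hat = sum(squared_diffs) / (2 * pairs)
error_variance_hat = sigma2_hat / n      # approximate error variance of the mean

print(f"estimated sigma^2:      {sigma2_hat:.4f}")
print(f"approx. error variance: {error_variance_hat:.4f}")
```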