Hansen-Hurwitz estimator examples
Example 1
Suppose we were able to determine the selection probabilities \(p_i\) such that they were strictly proportional to the values \(y_i\) of the target variable.
Then the true parametric total \(\tau\) would be estimated perfectly by each single observation, because \(y_i / p_i = \tau\) for every element.
While this is certainly desirable, it is impossible in practice: calculating selection probabilities exactly proportional to the target variable requires knowing its value for the entire population, and then there is no point in sampling. We therefore need to find an ancillary size variable that is known for all elements and that is as highly correlated with the target variable as possible.
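The idealized case can be checked numerically. The following is a minimal sketch, assuming the y values of the 30-element example population from Figure 1 and assuming all y values are positive (otherwise the probabilities would not be valid); the variable names are illustrative only.

```python
# y values of the 30-element example population (Figure 1)
y = [2, 3, 6, 5, 6, 8, 6, 7, 8, 6, 7, 7, 9, 8, 10,
     9, 12, 8, 14, 7, 12, 9, 8, 6, 7, 4, 5, 6, 4, 3]

tau = sum(y)                       # true population total
p = [yi / tau for yi in y]         # ideal (unattainable) probabilities, p_i proportional to y_i

# Every single-draw estimate y_i / p_i reproduces the total,
# up to floating-point rounding.
single_draw_estimates = [yi / pi for yi, pi in zip(y, p)]

print("true total:", tau)
print("smallest estimate:", min(single_draw_estimates))
print("largest estimate:", max(single_draw_estimates))
```

Note that building `p` already required the full list `y`, which is exactly why this design cannot be used in practice.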
Example 2
Applying the Hansen-Hurwitz estimator to the example population (Figure 1), sampling with replacement and with selection probabilities proportional to the plot area x, we build a list as given in Figure 2. For a sample of size n = 10, the parametric variances of the estimated total and the estimated mean, with probabilities proportional to strip-plot area, are respectively:
- \[var(\hat \tau) = \frac {1}{10} \sum_{i=1}^N p_i \left ( \frac {y_i}{p_i} - 212 \right )^2 = 118.97\,\] and
- \[var(\bar y) = var (\hat \tau)/N^2 = 118.97/900 = 0.1322\]
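The two parametric variances can be recomputed directly from the population values. This is a minimal sketch, assuming the y and x values listed in Figure 1 and a sample size of n = 10; the variable names are illustrative and not part of the original example.

```python
# Example population (Figure 1): target variable y and plot area x
y = [2, 3, 6, 5, 6, 8, 6, 7, 8, 6, 7, 7, 9, 8, 10,
     9, 12, 8, 14, 7, 12, 9, 8, 6, 7, 4, 5, 6, 4, 3]
x = [50, 50, 100, 100, 125, 130, 130, 140, 140, 130, 140, 150, 160, 170, 180,
     200, 210, 210, 210, 200, 200, 180, 160, 140, 120, 90, 90, 100, 100, 80]

n = 10                     # sample size (draws with replacement)
N = len(y)                 # population size, 30
tau = sum(y)               # true total, 212
total_x = sum(x)           # total plot area, 4185

# Selection probabilities proportional to plot area
p = [xi / total_x for xi in x]

# Parametric variance of the Hansen-Hurwitz estimator of the total
var_tau_hat = sum(pi * (yi / pi - tau) ** 2 for yi, pi in zip(y, p)) / n

# Parametric variance of the estimated mean
var_y_bar = var_tau_hat / N ** 2

print(round(var_tau_hat, 2))   # 118.97
print(round(var_y_bar, 4))     # 0.1322
```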
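| Number | y | x |
|---|---|---|
| 1 | 2 | 50 |
| 2 | 3 | 50 |
| 3 | 6 | 100 |
| 4 | 5 | 100 |
| 5 | 6 | 125 |
| 6 | 8 | 130 |
| 7 | 6 | 130 |
| 8 | 7 | 140 |
| 9 | 8 | 140 |
| 10 | 6 | 130 |
| 11 | 7 | 140 |
| 12 | 7 | 150 |
| 13 | 9 | 160 |
| 14 | 8 | 170 |
| 15 | 10 | 180 |
| 16 | 9 | 200 |
| 17 | 12 | 210 |
| 18 | 8 | 210 |
| 19 | 14 | 210 |
| 20 | 7 | 200 |
| 21 | 12 | 200 |
| 22 | 9 | 180 |
| 23 | 8 | 160 |
| 24 | 6 | 140 |
| 25 | 7 | 120 |
| 26 | 4 | 90 |
| 27 | 5 | 90 |
| 28 | 6 | 100 |
| 29 | 4 | 100 |
| 30 | 3 | 80 |
| Mean | 7.0667 | 139.50 |
| Pop. variance | 7.1289 | 2087.25 |

Figure 1. Sample population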
| Number | y | x | Selection probability \(p_i\) | \(p_i \left( \frac {y_i}{p_i} - 212 \right)^2\) |
|---|---|---|---|---|
| 1 | 2 | 50 | 0.011947 | 23.765352 |
| 2 | 3 | 50 | 0.011947 | 18.265352 |
| 3 | 6 | 100 | 0.023895 | 36.530705 |
| [...] | | | | |
| 29 | 4 | 100 | 0.023895 | 47.530705 |
| 30 | 3 | 80 | 0.019116 | 57.957064 |
| Mean | 7.0667 | 139.50 | | |
| Pop. variance | 7.1289 | 2087.25 | | |
| N | 30 | | | |
| Sum | 212 | 4185 | 1.000000 | 1189.685579 |

Figure 2. List sampling applied to the example population of 30 elements.
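List sampling itself can be sketched in a few lines. The code below is an illustrative simulation, not part of the original example: it assumes the population of Figure 1, builds the cumulative list of plot areas, draws n = 10 elements with replacement by locating uniform random numbers in that list, and applies the Hansen-Hurwitz estimator. The function names, the seed and the number of repetitions are arbitrary choices.

```python
import random
from bisect import bisect_right
from itertools import accumulate

# Example population (Figure 1): target variable y and plot area x
y = [2, 3, 6, 5, 6, 8, 6, 7, 8, 6, 7, 7, 9, 8, 10,
     9, 12, 8, 14, 7, 12, 9, 8, 6, 7, 4, 5, 6, 4, 3]
x = [50, 50, 100, 100, 125, 130, 130, 140, 140, 130, 140, 150, 160, 170, 180,
     200, 210, 210, 210, 200, 200, 180, 160, 140, 120, 90, 90, 100, 100, 80]


def draw_list_sample(x, n, rng):
    """Draw n indices with replacement, with probability proportional to x."""
    cumulative = list(accumulate(x))         # running totals of the size variable
    total = cumulative[-1]
    picks = []
    for _ in range(n):
        u = rng.random() * total             # uniform point in [0, total)
        i = bisect_right(cumulative, u)      # element whose interval contains u
        picks.append(min(i, len(x) - 1))     # guard against floating-point edge cases
    return picks


def hansen_hurwitz_total(y, x, picks):
    """Hansen-Hurwitz estimate of the total: mean of y_i / p_i over the draws."""
    total_x = sum(x)
    return sum(y[i] * total_x / x[i] for i in picks) / len(picks)


rng = random.Random(12345)                   # fixed seed for reproducibility
estimates = [hansen_hurwitz_total(y, x, draw_list_sample(x, 10, rng))
             for _ in range(20000)]

mean_est = sum(estimates) / len(estimates)
var_est = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)
print("mean of estimated totals:", mean_est)
print("empirical variance of estimated totals:", var_est)
```

With enough repetitions, the mean of the estimates should settle near the true total of 212 and their empirical variance near the parametric variance of the estimated total.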
As this example illustrates for a sample of size n = 10, sampling with (intelligently defined) unequal probabilities can lead to a tremendous gain in precision. Compared with simple random sampling (\(var(\bar y) = 0.49\)), the parametric variance of the estimated mean under unequal probability sampling (about 0.13) is smaller by a factor of roughly 3.7.
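The comparison can be reproduced numerically. This sketch rests on an assumption that the text does not state explicitly: the simple random sampling variance of 0.49 is taken to be the without-replacement variance of the mean, \(\frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}\), with \(\sigma^2\) the population variance (divisor N) from Figure 1. Variable names are illustrative.

```python
# Example population (Figure 1): target variable y and plot area x
y = [2, 3, 6, 5, 6, 8, 6, 7, 8, 6, 7, 7, 9, 8, 10,
     9, 12, 8, 14, 7, 12, 9, 8, 6, 7, 4, 5, 6, 4, 3]
x = [50, 50, 100, 100, 125, 130, 130, 140, 140, 130, 140, 150, 160, 170, 180,
     200, 210, 210, 210, 200, 200, 180, 160, 140, 120, 90, 90, 100, 100, 80]

N, n = len(y), 10
tau = sum(y)
total_x = sum(x)
mean_y = tau / N

# Population variance of y with divisor N (7.1289 in Figure 1)
sigma2 = sum((yi - mean_y) ** 2 for yi in y) / N

# Assumption: simple random sampling without replacement,
# finite population correction (N - n) / (N - 1)
var_srs = sigma2 / n * (N - n) / (N - 1)

# Hansen-Hurwitz variance of the mean, probabilities proportional to x
p = [xi / total_x for xi in x]
var_pps = sum(pi * (yi / pi - tau) ** 2 for yi, pi in zip(y, p)) / n / N ** 2

print(round(sigma2, 4))             # 7.1289
print(round(var_srs, 2))            # 0.49
print(round(var_pps, 2))            # 0.13
print(round(var_srs / var_pps, 1))  # 3.7
```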
Observe: Both the Hansen-Hurwitz estimator and the ratio estimator rely on knowledge of the areas of the unequally sized sample strips. Unequal probability sampling, however, requires that all strip areas be known before sampling, because the selection probability of each strip must be known a priori. The ratio estimator only requires that the ancillary variable “strip-plot area” be observed on the sampled plots.