Hansen-Hurwitz estimator examples
Example 1
If we were able to determine the selection probabilities \(p_i\) such that they were strictly proportional to the values \(y_i\) of the target variable, then the true parametric total would be estimated perfectly by each single observation \(y_i\)[1].
While this is certainly desirable, it is impossible: calculating selection probabilities exactly proportional to the target variable requires knowing the entire population, in which case there would be no point in sampling. We therefore need to find an ancillary size variable that is as highly correlated with the target variable as possible.
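To see why a single draw would already suffice in that ideal case, recall that the Hansen-Hurwitz estimator divides each observation by its selection probability. With \(p_i = y_i/\tau\), every term collapses to the true total:

\[\frac{y_i}{p_i} = \frac{y_i}{y_i/\tau} = \tau, \qquad \text{so} \qquad \hat \tau = \frac{1}{n} \sum_{i=1}^{n} \frac{y_i}{p_i} = \tau \quad \text{and} \quad var(\hat \tau) = 0.\]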
Example 2
Applying the Hansen-Hurwitz estimator to the example population (Figure 1) with selection probabilities proportional to the plot area x, with replacement, we build a list as given in Figure 2. The parametric variances, with probabilities proportional to strip-plot area, for the estimated total and the estimated mean, respectively, are as follows (a short numerical check is sketched after the formulas):
- \[var(\hat \tau) = \frac {1}{10} \sum_{i=1}^N p_i \left ( \frac {y_i}{p_i} - 212 \right )^2 = 118.97\,\] and
- \[var(\bar y) = var(\hat \tau)/N^2 = 118.97/900 = 0.13219\]
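As a minimal numerical check, both parametric variances can be recomputed from the Figure 1 data in a few lines of Python (a sketch; variable names are illustrative):

```python
# Parametric Hansen-Hurwitz variances for the example population.
# The y values and strip-plot areas x are transcribed from Figure 1.
y = [2, 3, 6, 5, 6, 8, 6, 7, 8, 6, 7, 7, 9, 8, 10, 9,
     12, 8, 14, 7, 12, 9, 8, 6, 7, 4, 5, 6, 4, 3]
x = [50, 50, 100, 100, 125, 130, 130, 140, 140, 130, 140, 150, 160, 170, 180, 200,
     210, 210, 210, 200, 200, 180, 160, 140, 120, 90, 90, 100, 100, 80]

N = len(y)                        # 30 population elements
n = 10                            # planned sample size
tau = sum(y)                      # true total = 212
p = [xi / sum(x) for xi in x]     # selection probability proportional to area x

# var(tau_hat) = (1/n) * sum_i p_i * (y_i/p_i - tau)^2,  var(ybar) = var(tau_hat) / N^2
var_total = sum(pi * (yi / pi - tau) ** 2 for yi, pi in zip(y, p)) / n
var_mean = var_total / N**2

print(round(var_total, 2))        # 118.97
print(round(var_mean, 5))         # 0.13219
```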
Number           y         x
1                2         50
2                3         50
3                6         100
4                5         100
5                6         125
6                8         130
7                6         130
8                7         140
9                8         140
10               6         130
11               7         140
12               7         150
13               9         160
14               8         170
15               10        180
16               9         200
17               12        210
18               8         210
19               14        210
20               7         200
21               12        200
22               9         180
23               8         160
24               6         140
25               7         120
26               4         90
27               5         90
28               6         100
29               4         100
30               3         80
Mean             7.0667    139.50
Pop. variance    7.1289    2087.25

Figure 1. Sample population.
Number           y         x         Selection probability (p_i)    p_i * (y_i/p_i - 212)²
1                2         50        0.011947                       23.765352
2                3         50        0.011947                       18.265352
3                6         100       0.023895                       36.530705
[...]
29               4         100       0.023895                       47.530705
30               3         80        0.019116                       57.957064
Mean             7.0667    139.50
Pop. variance    7.1289    2087.25
N                30
Sum              212       4185      1.000000                       1189.685579

Figure 2. List sampling applied to the example population of 30 elements.
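Operationally, the list of Figure 2 is used to draw the sample: the areas x are cumulated, a uniform random number between 0 and the total area is drawn, and the strip whose cumulative interval contains that number is selected. The following sketch of this selection step and of the resulting Hansen-Hurwitz estimate reuses y, x, p and n as defined in the previous snippet; the function names are illustrative:

```python
import random

def list_sample(x, n, rng=random):
    """Draw n indices with replacement, with probability proportional to x,
    via the cumulative-total (list sampling) scheme of Figure 2."""
    cum, running = [], 0.0
    for xi in x:
        running += xi
        cum.append(running)
    picks = []
    for _ in range(n):
        u = rng.uniform(0.0, running)
        picks.append(next(k for k, c in enumerate(cum) if u <= c))
    return picks

def hansen_hurwitz_total(y, p, sample):
    """Hansen-Hurwitz estimate of the population total from the drawn indices."""
    return sum(y[i] / p[i] for i in sample) / len(sample)

# Usage with the Figure 1 data (y, x and p as in the previous sketch):
#   sample   = list_sample(x, n=10)
#   tau_hat  = hansen_hurwitz_total(y, p, sample)
#   ybar_hat = tau_hat / len(y)
```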
As this example illustrates, for a sample of size n = 10, sampling with (intelligently defined) unequal probabilities can lead to a tremendous gain in precision: compared with simple random sampling (\(var(\bar y) = 0.49\)), the unequal probability sampling is about 3.7 times as precise (0.49/0.132 ≈ 3.7).
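For reference, the 0.49 quoted for simple random sampling is consistent with the usual without-replacement variance formula applied to the Figure 1 population; the source does not spell this step out, so this reading is an assumption:

\[var_{SRS}(\bar y) = \frac{S^2}{n}\left(1 - \frac{n}{N}\right) = \frac{\tfrac{30}{29} \cdot 7.1289}{10}\left(1 - \frac{10}{30}\right) \approx 0.49,\]

where \(S^2 = \frac{N}{N-1}\,\sigma^2\) rescales the population variance 7.1289 of Figure 1 to the divisor \(N - 1\).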
Observe: both the Hansen-Hurwitz estimator and the ratio estimator rely on knowledge of the areas of the unequally sized sample strips. Unequal probability sampling, however, requires that all strip areas be known before sampling, because the selection probability of each strip must be known a priori. The ratio estimator simply requires that the ancillary variable “strip-plot area” be observed on the sampled plots.
References
1. Kleinn, C. 2007. Lecture Notes for the Teaching Module Forest Inventory. Department of Forest Inventory and Remote Sensing, Faculty of Forest Science and Forest Ecology, Georg-August-Universität Göttingen. 164 pp.