Hansen-Hurwitz estimator examples

Revision as of 20:32, 7 January 2011

Example 1

If we were able to determine the selection probabilities \(p_i\) such that they were strictly proportional to the values \(y_i\) of the target variable, then the true parametric total would be estimated perfectly by each single observation, because \(y_i / p_i\) would equal the total for every element.

While this is certainly desirable, it is impossible: calculating selection probabilities exactly proportional to the target variable requires knowing its value for the entire population, and then there is no point in sampling. Instead, we need to find an ancillary size variable that is known to be as highly correlated with the target variable as possible.
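The argument of Example 1 can be checked numerically. The following is a minimal Python sketch, not part of the original example: it takes the 30 y-values of the example population used in Example 2 and assumes, purely for illustration, the ideal probabilities \(p_i = y_i / \tau\), which could never be computed in practice. The names `y`, `tau`, `p` and `single_estimates` are illustrative, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# y-values of the 30 population elements (example population of Example 2)
y = [2, 3, 6, 5, 6, 8, 6, 7, 8, 6, 7, 7, 9, 8, 10, 9,
     12, 8, 14, 7, 12, 9, 8, 6, 7, 4, 5, 6, 4, 3]

tau = sum(y)  # true population total

# Hypothetical "ideal" selection probabilities, strictly proportional to y_i.
# Computing them needs every y_i, i.e. a complete census of the population.
p = [Fraction(yi, tau) for yi in y]

# Hansen-Hurwitz estimate of the total from one single observation: y_i / p_i
single_estimates = [Fraction(yi) / pi for yi, pi in zip(y, p)]

print("population total:", tau)
print("distinct single-observation estimates:", set(single_estimates))
```

Every single-observation estimate coincides with the population total, but only because `p` was built from the complete list of y-values, which is exactly the information sampling is meant to avoid collecting.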

Example 2

Applying the Hansen-Hurwitz estimator to the example population (Figure 1) with selection probabilities proportional to the plot area x, with replacement, we build the list given in Figure 2. For a sample of size n = 10 and a population total of 212, the parametric variances of the estimated total and of the estimated mean under probabilities proportional to strip-plot area are, respectively:

\[var(\hat \tau) = \frac {1}{10} \sum_{i=1}^N p_i \left ( \frac {y_i}{p_i} - 212 \right )^2 = \frac {1189.69}{10} = 118.97\] and

\[var(\bar y) = var (\hat \tau)/N^2 = 118.97/900 = 0.13219\]

Number          y        x
1               2       50
2               3       50
3               6      100
4               5      100
5               6      125
6               8      130
7               6      130
8               7      140
9               8      140
10              6      130
11              7      140
12              7      150
13              9      160
14              8      170
15             10      180
16              9      200
17             12      210
18              8      210
19             14      210
20              7      200
21             12      200
22              9      180
23              8      160
24              6      140
25              7      120
26              4       90
27              5       90
28              6      100
29              4      100
30              3       80
Mean            7.0667  139.50
Pop. variance   7.1289  2087.25

Figure 1. Sample population

Number          y        x        Selection probability \(p_i\)   \(p_i \left( \frac {y_i}{p_i} - 212 \right)^2\)
1               2        50       0.011947                        23.765352
2               3        50       0.011947                        18.265352
3               6        100      0.023895                        36.530705
[...]
29              4        100      0.023895                        47.530705
30              3        80       0.019116                        57.957064
Mean            7.0667   139.50
Pop. variance   7.1289   2087.25
N               30
Sum             212      4185     1.000000                        1189.685579

Figure 2. List sampling applied to the example population of 30 elements.
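The list in Figure 2 can be rebuilt from the population values in Figure 1. The following is a minimal Python sketch under the assumptions stated in the text (selection probabilities \(p_i = x_i / \sum x\), sample size n = 10). The variable names are illustrative and not part of the original example.

```python
# (y, x) pairs for the 30 elements of the example population (Figure 1)
population = [
    (2, 50), (3, 50), (6, 100), (5, 100), (6, 125), (8, 130),
    (6, 130), (7, 140), (8, 140), (6, 130), (7, 140), (7, 150),
    (9, 160), (8, 170), (10, 180), (9, 200), (12, 210), (8, 210),
    (14, 210), (7, 200), (12, 200), (9, 180), (8, 160), (6, 140),
    (7, 120), (4, 90), (5, 90), (6, 100), (4, 100), (3, 80),
]

N = len(population)                       # population size
n = 10                                    # sample size used in the example
tau = sum(y for y, _ in population)       # population total of y
x_total = sum(x for _, x in population)   # total of the size variable x

# One row per element: number, y, x, p_i, p_i * (y_i / p_i - tau)^2
rows = []
for number, (y, x) in enumerate(population, start=1):
    p = x / x_total
    rows.append((number, y, x, p, p * (y / p - tau) ** 2))

column_sum = sum(row[4] for row in rows)
var_total = column_sum / n        # parametric variance of the estimated total
var_mean = var_total / N ** 2     # parametric variance of the estimated mean

for number, y, x, p, term in rows:
    print(f"{number:2d} {y:3d} {x:4d} {p:.6f} {term:10.6f}")
print(f"sum of last column = {column_sum:.6f}")
print(f"var(total) = {var_total:.4f}, var(mean) = {var_mean:.5f}")
```

Printing all 30 rows also shows the rows that Figure 2 elides with "[...]".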

For a sample of size n = 10, sampling with (intelligently defined) unequal probabilities can, as this example illustrates, yield a tremendous gain in precision. Compared with simple random sampling (\(var(\bar y) = 0.49\)), unequal probability sampling is about 3.7 times more precise (0.49/0.132).
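The comparison with simple random sampling can be recomputed from the population values. The sketch below is illustrative only. It assumes that the quoted 0.49 refers to simple random sampling without replacement with the finite population correction \((N-n)/(N-1)\); the text does not say which variant was used. All variable names are hypothetical.

```python
# Population values from Figure 1: target variable y and plot area x
y = [2, 3, 6, 5, 6, 8, 6, 7, 8, 6, 7, 7, 9, 8, 10, 9,
     12, 8, 14, 7, 12, 9, 8, 6, 7, 4, 5, 6, 4, 3]
x = [50, 50, 100, 100, 125, 130, 130, 140, 140, 130, 140, 150, 160, 170,
     180, 200, 210, 210, 210, 200, 200, 180, 160, 140, 120, 90, 90, 100,
     100, 80]

N, n = len(y), 10
tau = sum(y)
x_total = sum(x)

# Parametric population variance of y (divisor N)
mean_y = tau / N
sigma2 = sum((yi - mean_y) ** 2 for yi in y) / N

# Assumption: SRS without replacement, finite population correction (N-n)/(N-1)
var_mean_srs = (N - n) / (N - 1) * sigma2 / n

# Hansen-Hurwitz (with replacement, p_i proportional to x_i)
var_total_pps = sum(
    (xi / x_total) * (yi / (xi / x_total) - tau) ** 2 for yi, xi in zip(y, x)
) / n
var_mean_pps = var_total_pps / N ** 2

relative_precision = var_mean_srs / var_mean_pps

print(f"var(mean), simple random sampling : {var_mean_srs:.4f}")
print(f"var(mean), unequal probabilities  : {var_mean_pps:.4f}")
print(f"ratio of the two variances        : {relative_precision:.2f}")
```

The gain depends entirely on how closely x tracks y in this population; with a poorly correlated size variable the ratio could drop below 1.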

Observe: Both the Hansen-Hurwitz estimator and the ratio estimator are based on knowledge of the area of the unequally sized sample strips. Unequal probability sampling, however, requires that all strip areas are known before sampling, because the selection probability of each strip must be known a priori. The ratio estimator simply requires that the ancillary variable “strip-plot area” is observed on the sampled plots.
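The different information needs of the two estimators can be illustrated with a small simulation. This is a hedged sketch, not part of the original example: the function names `hansen_hurwitz_total` and `ratio_total`, the seed and the number of replicates are arbitrary choices. It also assumes that the ratio estimator is applied to a simple random sample without replacement and that the total area is known when a total, rather than a per-area ratio, is estimated.

```python
import random
import statistics

# (y, x) pairs of the example population (Figure 1)
population = [
    (2, 50), (3, 50), (6, 100), (5, 100), (6, 125), (8, 130),
    (6, 130), (7, 140), (8, 140), (6, 130), (7, 140), (7, 150),
    (9, 160), (8, 170), (10, 180), (9, 200), (12, 210), (8, 210),
    (14, 210), (7, 200), (12, 200), (9, 180), (8, 160), (6, 140),
    (7, 120), (4, 90), (5, 90), (6, 100), (4, 100), (3, 80),
]
tau = sum(y for y, _ in population)
x_total = sum(x for _, x in population)


def hansen_hurwitz_total(sample, x_total):
    """Mean of y_i / p_i with p_i = x_i / x_total (p_i fixed before sampling)."""
    return sum(y / (x / x_total) for y, x in sample) / len(sample)


def ratio_total(sample, x_total):
    """Observed y-per-area ratio on the sampled plots, scaled by the total area."""
    return x_total * sum(y for y, _ in sample) / sum(x for _, x in sample)


rng = random.Random(2011)          # fixed seed so the run is repeatable
n, replicates = 10, 20000
areas = [x for _, x in population]

# Selection needs every x_i in advance: draws are weighted by plot area.
hh_estimates = [
    hansen_hurwitz_total(rng.choices(population, weights=areas, k=n), x_total)
    for _ in range(replicates)
]
# Selection ignores x; areas are only read off the plots that were drawn.
ratio_estimates = [
    ratio_total(rng.sample(population, n), x_total)
    for _ in range(replicates)
]

print("true total:", tau)
print("Hansen-Hurwitz mean / variance:",
      statistics.fmean(hh_estimates), statistics.pvariance(hh_estimates))
print("ratio estimator mean / variance:",
      statistics.fmean(ratio_estimates), statistics.pvariance(ratio_estimates))
```

In the Hansen-Hurwitz branch the areas enter the selection step itself, whereas in the ratio branch they are used only after the plots have been drawn.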
