Adaptive cluster sampling examples

From AWF-Wiki

Revision as of 13:41, 11 January 2011



Example 1:

We now wish to compare the statistical performance of simple random sampling using only the initial plots with that of simple random sampling with adaptive cluster plots. For this, we take the example that is also elaborated in the original publication of Thompson (1992[1]). The population here consists of 400 cells (plots) with a total of 190 target objects, so that the parametric mean density in terms of objects per plot is

\[\mu=\frac{190}{400}=0.475\,\]

and there are three larger clusters of target objects. The condition for adaptive enlargement of the field clusters is: “if at least one target object is found in a plot of the initial sample, the enlargement process is initiated”.

An initial sample of size \(n = 10\) plots is taken. In the following, \(m_i\) denotes the number of plots in the network to which initial plot \(i\) belongs and \(y_i\) the total number of target objects in that network. Two of the sampled plots are part of larger networks with \(m_1 = 6\), \(y_1 = 36\), and \(m_2 = 11\), \(y_2 = 107\); the two initial plots themselves contain 11 and 1 target objects, respectively. The other eight plots of the initial sample do not contain target objects and therefore have \(m_i = 1\), \(y_i = 0\); they are networks of size 1.

Applying the estimator for simple random sampling to the initial sample yields, by the usual procedure, an estimated mean per cell of

\[\bar y=\frac{12}{10}=1.2\,\]

with an estimated error variance of

\[\hat{var}(\bar y)=\frac{s^2}{n}\frac{N-n}{N}=\frac{11.96}{10}\frac{390}{400}=1.1657\,.\]
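
The two figures above can be checked with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and the order of the ten observations in the list is an assumption (the text gives the values 11 and 1 plus eight empty plots, but not their order, which does not affect the result).

```python
from statistics import mean, variance

N = 400  # number of cells (plots) in the population

# Object counts in the ten initial plots: 11 and 1, plus eight empty plots.
# The ordering is assumed for illustration only.
initial_counts = [11, 1] + [0] * 8
n = len(initial_counts)

y_bar = mean(initial_counts)      # estimated mean per cell
s2 = variance(initial_counts)     # sample variance with divisor n - 1

# Estimated error variance of the mean, with finite population correction
var_y_bar = s2 / n * (N - n) / N

print(f"mean = {y_bar:.3f}, s^2 = {s2:.2f}, var = {var_y_bar:.4f}")
```

Note that the value 1.1657 is obtained from the unrounded sample variance; inserting the rounded value 11.96 gives a marginally different fourth decimal.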

Given the parametric density value of 0.475 (which, however, would be unknown in practice), the estimate of 1.2 is a large deviation, and the error variance is also relatively large. This is typical when sampling for rare events: estimation errors are usually large.

If we now follow the adaptive cluster sampling approach, we estimate the mean value as

\[\bar y_1=\frac{1}{10}\left(\frac{36}{6}+\frac{107}{11}+\frac{0}{1}+\dots+\frac{0}{1}\right)=1.573\,\]

and the corresponding error variance as

\[\hat{var}(\bar y_1)=\frac{400-10}{400\cdot 10\cdot\left(10-1\right)}\left[\left(6-1.573\right)^2+\left(9.727-1.573\right)^2+\left(0-1.573\right)^2+\dots+\left(0-1.573\right)^2\right]=1.147\,.\]
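
The adaptive cluster sampling estimate and its error variance can be reproduced in the same way. Again, this is a sketch under stated assumptions: the names `networks` and `w` are illustrative, and each initial plot is represented by the total \(y_i\) and size \(m_i\) of its network as given in the text.

```python
N = 400  # number of cells (plots) in the population

# One (y_i, m_i) pair per initial plot: network total and network size.
# Two plots fall into larger networks; eight empty plots are networks of size 1.
networks = [(36, 6), (107, 11)] + [(0, 1)] * 8
n = len(networks)

# Network mean per initial plot, w_i = y_i / m_i
w = [y / m for y, m in networks]

# Estimated mean: average of the network means
y_bar_1 = sum(w) / n

# Estimated error variance, with finite population correction
sum_sq = sum((w_i - y_bar_1) ** 2 for w_i in w)
var_y_bar_1 = (N - n) / (N * n * (n - 1)) * sum_sq

print(f"mean = {y_bar_1:.3f}, var = {var_y_bar_1:.3f}")
```

The two non-zero network means are 6 and 9.727, the same terms that appear in the bracket of the variance formula above.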

What is of major interest here is the error variance, because it estimates how strongly the estimated mean varies among samples of size \(n=10\). Of minor (or even no) concern is the absolute deviation of the estimated mean from the true mean. In fact, only calculation of the parametric values will give a final clue to the relative efficiency of the two designs; evaluation of a single sample is not sufficient.

Here, the adaptive cluster sampling estimator yields a slightly smaller estimated error variance (1.147) than the simple random sampling estimator applied to the initial sample (1.1657). The difference is small, however, and if we compare it with the additional field effort that has to be undertaken (and paid for) for adaptive cluster sampling, we may doubt whether the additional effort pays off in this particular example, at least if interest is only in density estimation. If other attributes such as diameter, height or quality are observed on the target objects, this may be completely different (Kleinn 2007[2]).

References

  1. Thompson, S. K. 1992. Sampling. John Wiley & Sons. 343 p.
  2. Kleinn, C. 2007. Lecture Notes for the Teaching Module Forest Inventory. Department of Forest Inventory and Remote Sensing, Faculty of Forest Science and Forest Ecology, Georg-August-Universität Göttingen. 164 p.