Adaptive cluster sampling examples

{{Languages}}
{{Ficontent}}
{{Content Tree|HEADER=Forest Inventory lecturenotes|NAME=Forest Inventory lecturenotes}}
 
===Example 1:===
 
We now wish to compare the statistical performance of simple random sampling with only the initial plots to that of [[simple random sampling]] with [[adaptive cluster sampling|adaptive cluster plots]]. For this, we take the example that is also elaborated in the original publication of Thompson (1992<ref name="thompson1992">Thompson SK. 1992. Sampling. John Wiley & Sons. 343 p.</ref>).
The population here consists of 400 cells (plots) with a total of 190 [[target objects]], so that the parametric mean density, in terms of objects per plot, is
  
:<math>\mu=\frac{190}{400}=0.475\,</math>
  
and there are three larger [[clusters]] of target objects. The condition for adaptive enlargement of the field clusters is: “if at least one target object is found on an initial sample plot, then the enlargement process is initiated”.
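
This enlargement rule can be illustrated with a short sketch. The following Python fragment is only an added illustration and not part of the original example: it assumes the population is stored as a rectangular grid of per-cell object counts and that “neighbouring” means the four edge-adjacent cells, which is one common plot neighbourhood for adaptive cluster sampling on square plots.

<source lang="python">
from collections import deque

def expand_network(grid, start, condition=lambda count: count >= 1):
    """Adaptively enlarge around the initial plot ``start`` (row, col).

    Every visited cell that satisfies the condition (here: at least one
    target object) has its four neighbours added, and the process repeats
    until no new qualifying cell is reached. The returned set is the
    network of qualifying cells plus its empty edge cells.
    """
    rows, cols = len(grid), len(grid[0])
    visited = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if not condition(grid[r][c]):
            continue  # empty cells are recorded but not expanded further
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited:
                visited.add((nr, nc))
                queue.append((nr, nc))
    return visited
</source>

Applied to the population grid, an initial plot that hits one of the three clusters would return a set corresponding to one of the networks used below.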
  
 
An initial sample of size <math>n = 10</math> plots is taken. Two sampled plots are part of larger networks with <math>m_1 = 6</math>, <math>y_1 = 36</math>, and <math>m_2 = 11</math>, <math>y_2 = 107</math>; the two initially sampled plots themselves have the observations 11 and 1, respectively. The other eight plots of the initial sample do not contain target objects and therefore have <math>m_i = 1</math>, <math>y_i = 0</math>; they are networks of size 1.
  
Using the estimator for [[simple random sampling]] for the initial sample thus yields, following the known procedure, an estimated mean per cell of
  
:<math>\bar y=\frac{12}{10}=1.2\,</math>
  
 
with an estimated error variance of  
  
:<math>\hat{var}(\bar y)=\frac{s^2}{n}\frac{N-n}{N}=\frac{11.96}{10}\frac{390}{400}=1.1657\,</math>.  
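
These numbers can be reproduced with a few lines of code. The following Python sketch is only an added illustration (the variable names are ours, not from the text): it applies the usual simple random sampling formulas, including the finite population correction, to the ten initial observations (11, 1 and eight zeros).

<source lang="python">
N = 400                    # number of cells (plots) in the population
n = 10                     # size of the initial sample
y = [11, 1] + [0] * 8      # observations on the ten initial plots

y_bar = sum(y) / n                                  # estimated mean per cell
s2 = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)   # sample variance
var_y_bar = s2 / n * (N - n) / N                    # error variance with fpc

print(y_bar)       # 1.2
print(s2)          # 11.9556, i.e. the 11.96 used above
print(var_y_bar)   # 1.1657 (rounded), as above
</source>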
  
Given the [[parametric density]] value of 0.475 (which, however, would be unknown in practice), the estimate of 1.2 is a large deviation, and the estimated error variance is also relatively large. This is typical for sampling for rare events: estimation errors are usually large.
  
If we now follow the [[adaptive cluster sampling]] approach, we estimate the mean value as
  
:<math>\bar y_1=\frac{1}{10}\left(\frac{36}{6}+\frac{107}{11}+\frac{0}{1}+\dots+\frac{0}{1}\right)=1.573\,</math>
  
and the corresponding [[error variance]] as
  
:<math>\hat{var}(\bar y_1)=\frac{400-10}{400\cdot 10\cdot\left(10-1\right)}\left[\left(6-1.573\right)^2+\left(9.727-1.573\right)^2+\left(0-1.573\right)^2+\dots+\left(0-1.573\right)^2\right]=1.147\,</math>.
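
The adaptive cluster sampling estimates can be checked in the same way. In the following sketch (again only an added illustration with our own variable names), each plot of the initial sample is represented by the network it intersects, and the estimator is simply the average of the per-network means <math>y_i/m_i</math>.

<source lang="python">
N, n = 400, 10

# Networks intersected by the initial sample: (y_i, m_i) = (objects, network size)
networks = [(36, 6), (107, 11)] + [(0, 1)] * 8

w = [y / m for y, m in networks]   # per-network means y_i / m_i
y_bar_1 = sum(w) / n               # estimated mean per cell
var_y_bar_1 = (N - n) / (N * n * (n - 1)) * sum((wi - y_bar_1) ** 2 for wi in w)

print(y_bar_1)       # 1.5727, i.e. the 1.573 reported above
print(var_y_bar_1)   # 1.1471, i.e. the 1.147 reported above
</source>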
  
 
What is of major interest here is the error variance, because it estimates the average deviation of samples of size <math>n=10</math> from the mean. Of minor (or even no) concern is the absolute deviation of the estimated mean from the true mean. In fact, only calculation of the parametric values will give a final clue to the relative efficiency of the two designs; evaluation of a single sample is not sufficient.
  
Here, the adaptive cluster sampling estimator yields a slightly smaller estimated error variance (1.147) than the simple random sampling estimator applied to the initial sample (1.1657). However, the difference is small, and if we compare it with the additional field efforts that need to be undertaken (and paid for) for adaptive cluster sampling, we may have doubts whether the additional effort pays off in this particular example, at least if interest is ''only'' in density estimation; if other attributes such as diameter, height, or quality are observed at the target objects, this may be completely different (Kleinn 2007<ref name="kleinn2007">Kleinn, C. 2007. Lecture Notes for the Teaching Module Forest Inventory. Department of Forest Inventory and Remote Sensing. Faculty of Forest Science and Forest Ecology, Georg-August-Universität Göttingen. 164 p.</ref>).
  
 
==References==
 
<references/>
 
{{SEO
|keywords=adaptive cluster sampling,sampling technique,sampling design,initial plot,target element,random sample
|descrip=Adaptive cluster sampling is a sampling strategy that implies enlarging the plot once a target element is found on the initial plot.
}}
  
 
[[Category:Forest Inventory Examples]]
 