Sampling with unequal selection probabilities

From AWF-Wiki
(Difference between revisions)
(Introduction: info shifted to the end)
 
(7 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{Content Tree|HEADER=Forest Inventory lecturenotes|NAME=Forest Inventory lecturenotes}}
+
{{Ficontent}}
 
+
Mostly, one speaks about [[simple random sampling|random sampling]] with equal selection probabilities: each element of the [[population]] has the same probability to be selected. However, there are situations in which this idea of equal selection probabilities does not appear reasonable: if it is known that some elements carry much more information about the [[target variable]], they should also have a greater chance to be selected. [[Stratified sampling|Stratification]] goes into that direction: there, the [[Inclusion probability]] within the strata are the same, but could be different between strata.
 
+
==Introduction==
+
 
+
This article is, if not explicitly stated otherwise, based upon the lecture notes for the teaching modul "Forest Inventory" by Kleinn ''et al.'' (2007<ref>Kleinn, C.2007. Lecture Notes for the Teaching Module Forest Inventory. Department of Forest Inventory and Remote Sensing. Faculty of Forest Science and Forest Ecology, Georg-August-Universität Göttingen. 164 S.</ref>).
+
 
+
Mostly, one speaks about [[simple random sampling|random sampling]] with equal selection probabilities: each element of the [[population]] has the same probability to be selected. However, there are situations in which this idea of equal selection probabilities does not appear reasonable: if it is known that some elements carry much more information about the [[target variable]], they should also have a greater chance to be selected. [[Stratified sampling|Stratification]] goes into that direction: there, the [[selection probabilities]] within the strata were the same, but could be different between strata.
+
 
+
 
+
  
 
Sampling with unequal selection probabilities is still random sampling, but not [[simple random sampling]], but “random sampling with unequal selection probabilities”. These selection probabilities, of course, must be defined for each and every element of the population before sampling and none of the population elements must have a selection probability of 0.
 
Sampling with unequal selection probabilities is still random sampling, but not [[simple random sampling]], but “random sampling with unequal selection probabilities”. These selection probabilities, of course, must be defined for each and every element of the population before sampling and none of the population elements must have a selection probability of 0.
  
 
Various [[:category:sampling design|sampling strategies]] that are important for forest inventory base upon the principle of unequal selection probabilities, including  
 
Various [[:category:sampling design|sampling strategies]] that are important for forest inventory base upon the principle of unequal selection probabilities, including  
 
  
 
*angle count sampling ([[Bitterlich sampling]]),
 
*angle count sampling ([[Bitterlich sampling]]),
Line 23: Line 14:
 
*[[randomized branch sampling]].
 
*[[randomized branch sampling]].
 
   
 
   
 
 
 
In unequal probability sampling, we distinguish two different probabilities – which actually are two different points of view on the sampling process:
 
In unequal probability sampling, we distinguish two different probabilities – which actually are two different points of view on the sampling process:
 
   
 
   
 
The selection probability is the probability that element ''i'' is selected at one draw (selection step). The [[Hansen-Hurwitz estimator]] for sampling with replacement (that is; when the selection probabilities do not change after every draw) bases on this probability. The notation for selection probability is written as <math>P_i</math> or <math>p_i</math>.
 
The selection probability is the probability that element ''i'' is selected at one draw (selection step). The [[Hansen-Hurwitz estimator]] for sampling with replacement (that is; when the selection probabilities do not change after every draw) bases on this probability. The notation for selection probability is written as <math>P_i</math> or <math>p_i</math>.
  
The [[inclusion probability]] refers to the probability that element ''i'' is eventually (or included) in the sample of size ''n''. The [[Horvitz-Thompson estimator]] bases on the inclusion probability and is applicable to sampling with or without replacement. The inclusion probability is generally denoted by <math>\pi</math>.
+
The [[inclusion probability]] refers to the probability that element ''i'' is eventually (or included) in the sample of size ''n''. The [[Horvitz-Thompson estimator]] bases on the inclusion probability and is applicable to sampling with or without replacement. The inclusion probability is generally denoted by <math>\pi_i</math>.
  
 
{{info
 
{{info
 
|message=obs:
 
|message=obs:
|text=A typical example forsampling with equal inclusion probabilities is given with fixed area[[sample plots]] in forest inventories. With this concept and under theassumption that sample points are randomly distributed over an area ofinterest, each tree has the same probability to become part of asample. Contrary to this constant [[inclusion probability]] it ispossible to weight the probability proportional to a meaningfulvariable. Imagine e.g. different plot sizes for different treedimensions. If bigger trees are observed in larger plots and smallertrees in smaller plots, their probability to be included in a sample isnot constant anymore. This weighting is in particular efficient, if theinclusion probability is proportional to the respective target variable(like e.g. in relascope sampling)
+
|text=A typical example for sampling with equal inclusion probabilities is given with fixed area [[fixed area plots|sample plots]] in forest inventories. With this concept and under the assumption that sample points are randomly distributed over an area of interest, each tree has the same probability to become part of a sample. Contrary to this constant [[inclusion probability]] it is possible to weight the probability proportional to a meaningful variable. Imagine e.g. different plot sizes for different tree dimensions. If bigger trees are observed in larger plots and smaller trees in smaller plots, their probability to be included in a sample is not constant anymore. This weighting is in particular efficient, if the inclusion probability is proportional to the respective target variable(like e.g. in relascope sampling)
 
}}
 
}}
  

Latest revision as of 13:34, 26 October 2013

Random sampling is most often discussed with equal selection probabilities: each element of the population has the same probability of being selected. However, there are situations in which equal selection probabilities are not reasonable: if it is known that some elements carry much more information about the target variable, they should also have a greater chance of being selected. Stratification goes in that direction: there, the inclusion probabilities are the same within a stratum but may differ between strata.

Sampling with unequal selection probabilities is still random sampling; it is not simple random sampling but “random sampling with unequal selection probabilities”. These selection probabilities must be defined for each and every element of the population before sampling, and no population element may have a selection probability of 0.
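The two requirements on the selection probabilities (defined for every element, none equal to 0) can be checked before drawing. The following is a minimal sketch, not part of the lecture notes: the function names are hypothetical, and it additionally assumes that the probabilities are normalized to sum to 1 and that draws are made with replacement.

```python
import random


def validate_selection_probabilities(p, tol=1e-9):
    """Check the requirements on per-draw selection probabilities.

    Assumption of this sketch: the probabilities are normalized to sum to 1.
    """
    if any(p_i <= 0 for p_i in p):
        raise ValueError("every element needs a selection probability > 0")
    if abs(sum(p) - 1.0) > tol:
        raise ValueError("selection probabilities must sum to 1")


def draw_with_replacement(p, n, seed=None):
    """Draw n element indices with replacement; element i has probability p[i]."""
    validate_selection_probabilities(p)
    rng = random.Random(seed)
    return rng.choices(range(len(p)), weights=p, k=n)
```

For example, `draw_with_replacement([0.1, 0.2, 0.7], n=5)` returns five indices, and the same index may occur more than once because the probabilities do not change between draws.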

Various sampling strategies that are important for forest inventory are based on the principle of unequal selection probabilities, including angle count sampling (Bitterlich sampling) and randomized branch sampling.

In unequal probability sampling, we distinguish two different probabilities – which actually are two different points of view on the sampling process:

The selection probability is the probability that element i is selected at one draw (selection step). The Hansen-Hurwitz estimator for sampling with replacement (that is, when the selection probabilities do not change after every draw) is based on this probability. The selection probability is written as \(P_i\) or \(p_i\).
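As an illustration of how the selection probability enters an estimator, the sketch below implements the standard Hansen-Hurwitz estimator of the population total, \(\hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} y_i / p_i\), together with its usual variance estimator. The function names are hypothetical, and the inputs are assumed to be listed once per draw, so an element selected twice appears twice.

```python
def hansen_hurwitz_total(y, p):
    """Hansen-Hurwitz estimate of the population total.

    y[k] is the value observed at draw k and p[k] is the per-draw selection
    probability of that element (sampling with replacement).
    """
    n = len(y)
    if n == 0 or len(p) != n:
        raise ValueError("y and p must be non-empty and of equal length")
    return sum(y_k / p_k for y_k, p_k in zip(y, p)) / n


def hansen_hurwitz_variance(y, p):
    """Estimated variance of the Hansen-Hurwitz total (needs n >= 2 draws)."""
    n = len(y)
    if n < 2:
        raise ValueError("at least two draws are needed")
    total = hansen_hurwitz_total(y, p)
    squared_deviations = sum((y_k / p_k - total) ** 2 for y_k, p_k in zip(y, p))
    return squared_deviations / (n * (n - 1))
```

If the selection probabilities are exactly proportional to the target variable, every ratio \(y_i / p_i\) equals the population total and the estimated variance is zero, which is why a selection probability proportional to the target variable is efficient.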

The inclusion probability refers to the probability that element i is eventually selected (or included) in the sample of size n. The Horvitz-Thompson estimator is based on the inclusion probability and is applicable to sampling with or without replacement. The inclusion probability is generally denoted by \(\pi_i\).
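The two points of view can be connected in a short sketch. Assuming n independent draws with replacement and constant selection probability \(p_i\), the inclusion probability is \(\pi_i = 1 - (1 - p_i)^n\). The Horvitz-Thompson estimator of the total then sums \(y_i / \pi_i\) over the distinct elements in the sample. The function names below are hypothetical.

```python
def inclusion_probability_with_replacement(p_i, n):
    """Probability that an element is selected at least once in n draws.

    Assumes independent draws with replacement and a constant per-draw
    selection probability p_i.
    """
    return 1.0 - (1.0 - p_i) ** n


def horvitz_thompson_total(y, pi):
    """Horvitz-Thompson estimate of the population total.

    y and pi hold the value and inclusion probability of each DISTINCT
    sampled element; an element drawn several times is listed only once.
    """
    if len(y) != len(pi):
        raise ValueError("y and pi must have equal length")
    if any(pi_i <= 0 for pi_i in pi):
        raise ValueError("inclusion probabilities must be > 0")
    return sum(y_i / pi_i for y_i, pi_i in zip(y, pi))
```

For instance, an element with \(p_i = 0.5\) has an inclusion probability of 0.75 in a sample of n = 2 draws with replacement.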


obs:
A typical example of sampling with equal inclusion probabilities is fixed area sample plots in forest inventories. With this concept, and under the assumption that sample points are randomly distributed over the area of interest, each tree has the same probability of becoming part of a sample. In contrast to this constant inclusion probability, it is possible to weight the probability proportionally to a meaningful variable. Imagine, for example, different plot sizes for different tree dimensions: if bigger trees are observed in larger plots and smaller trees in smaller plots, their probability of being included in a sample is no longer constant. This weighting is particularly efficient if the inclusion probability is proportional to the respective target variable (as, for example, in relascope sampling).
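The plot-size example can be made concrete with a small sketch. It assumes circular plots around a single random sample point and ignores boundary effects, so that a tree's inclusion probability is the area of its plot divided by the area of interest. The diameter threshold and the two radii are hypothetical values chosen only for illustration.

```python
import math


def plot_radius_m(dbh_cm, dbh_threshold_cm=20.0,
                  small_radius_m=5.0, large_radius_m=10.0):
    """Radius of the plot on which a tree of the given diameter is recorded.

    The threshold and radii are illustrative assumptions, not values from
    the lecture notes.
    """
    return large_radius_m if dbh_cm >= dbh_threshold_cm else small_radius_m


def tree_inclusion_probability(dbh_cm, area_m2, **plot_design):
    """Inclusion probability of a tree for one random sample point.

    Plot area divided by the area of interest, ignoring boundary effects.
    """
    radius = plot_radius_m(dbh_cm, **plot_design)
    return math.pi * radius ** 2 / area_m2


# With the default design, a tree of 30 cm diameter is four times as likely
# to be included as a tree of 10 cm diameter (plot areas differ by 10^2 / 5^2).
ratio = (tree_inclusion_probability(30.0, 10_000.0)
         / tree_inclusion_probability(10.0, 10_000.0))
```

With a single fixed radius for all trees, the same function returns a constant value, which corresponds to the equal inclusion probabilities of fixed area plots.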

References
