Double sampling

From AWF-Wiki
Revision as of 17:35, 25 December 2010 by Fheimsch




Introduction

For the ratio estimator and the regression estimator we stipulated that the parametric mean or the true total of the ancillary variable must be known in order to apply those estimators. This is a serious limitation in practice, because these population values are often unknown. A way out is to estimate them as well. This is exactly what double sampling, also referred to as two-phase sampling, is about: in a first phase, the mean or total of the ancillary variable is estimated, usually from a relatively large sample, since this variable is relatively easy and inexpensive to observe. Then, in a second phase, a smaller sample is taken on which the target variable is observed; this variable is frequently much more expensive or difficult to observe. The ancillary variable is observed on the same second-phase samples, so that a relationship between target and ancillary variable can be established (either a ratio in the case of double sampling with the ratio estimator or a regression in the case of double sampling with the regression estimator). The correlation between the target and the ancillary variable is what allows the sample size in the second phase to be reduced.
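The two phases can be sketched for the ratio estimator as follows. This is a minimal illustration of the point estimate only, assuming simple random sampling in both phases; the function name `double_sampling_ratio_mean` and the argument layout are hypothetical and not taken from the lecture notes, and no variance estimate is attempted.

```python
def double_sampling_ratio_mean(x_first_phase, xy_second_phase):
    """Point estimate of the mean of the target variable y.

    x_first_phase   : ancillary values x from the large first-phase sample
    xy_second_phase : (x, y) pairs from the smaller second-phase sample
    Both names are illustrative assumptions.
    """
    # Phase 1: estimate the mean of the ancillary variable,
    # which replaces the unknown population mean of x.
    x_mean_first = sum(x_first_phase) / len(x_first_phase)

    # Phase 2: estimate the ratio between target and ancillary variable.
    sum_x = sum(x for x, _ in xy_second_phase)
    sum_y = sum(y for _, y in xy_second_phase)
    ratio = sum_y / sum_x

    # Ratio estimate of the mean of y, using the estimated mean of x.
    return ratio * x_mean_first


# Hypothetical values: three first-phase samples, two of them re-measured.
estimate = double_sampling_ratio_mean([10, 20, 30], [(10, 5), (30, 15)])
```

The only difference to the ordinary ratio estimator is that the known population mean of the ancillary variable is replaced by its first-phase estimate.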


Observe:

  • Here, we deal with double sampling with simple random sampling in both phases. The estimators given here are valid only for that sampling design. If other sampling designs are used, or different designs in the two phases, the corresponding estimators must be looked up or developed.

  • Double sampling can be carried out either with dependent phases or with independent phases. The phases are dependent when the second-phase sample is a sub-sample of the first-phase sample. That is, a randomly selected subset of the first-phase samples is re-visited and the target variable is observed in addition to the ancillary variable. In the case of independent phases, the second-phase sample is selected independently of what was sampled in the first phase. In that case the ancillary variable also has to be observed anew on the second-phase samples.

  • Do not confuse two-phase sampling with two-stage sampling. The latter is a completely different concept that is based on the subdivision of the population into primary and secondary units.

  • The idea of two-phase sampling as presented in this chapter can also be extended to more than two phases. However, the more phases, the more complex the estimators.
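The difference between dependent and independent phases can be sketched as a selection procedure. This is a minimal illustration assuming simple random sampling without replacement in both phases; `draw_phases` and its parameters are hypothetical names chosen for this example.

```python
import random


def draw_phases(unit_ids, n_first, n_second, dependent, seed=None):
    """Select first- and second-phase samples by simple random sampling.

    unit_ids  : list of identifiers of all sampling units in the population
    dependent : if True, the second phase is a sub-sample of the first phase;
                if False, it is drawn independently from the whole population
    """
    rng = random.Random(seed)
    first = rng.sample(unit_ids, n_first)

    # Dependent phases: re-visit a random subset of the first-phase samples.
    # Independent phases: draw a fresh sample, so the ancillary variable
    # has to be observed again on these units.
    pool = first if dependent else unit_ids
    second = rng.sample(pool, n_second)
    return first, second


units = list(range(1000))
first_dep, second_dep = draw_phases(units, 200, 20, dependent=True, seed=1)
first_ind, second_ind = draw_phases(units, 200, 20, dependent=False, seed=1)
```

With independent phases the two samples may still overlap by chance; the point is only that the second selection does not use the first.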


In addition to double sampling with the ratio estimator and double sampling with the regression estimator, there is a third variation of double sampling that is sometimes used in forest inventory: double sampling for stratification.


Double sampling for stratification (DSS)

General remarks

In the article on stratified random sampling it was mentioned that there are occasions in which it is not possible, or too difficult, to make a clear delimitation of strata before sampling. In those cases, a so-called post-stratification can be done, or the stratification is integrated into the sampling process. This is exactly what double sampling for stratification does: in the first phase, a relatively large sample is taken and the only variable observed is the stratum to which each sample belongs, whatever the criteria used for stratification. The first phase, therefore, serves to estimate the strata sizes. We may say that in the first phase a categorical variable is observed per sample point; it can take on L different values, where L is the number of strata to be distinguished. This is the ancillary variable of the first phase.


In the second phase, a stratified sub-sample is taken from the first-phase samples. This is obviously sampling with dependent phases, because the value of the ancillary variable is used to guide the second-phase stratified sampling. The target variable is then observed on these second-phase samples, and estimation follows the estimators for stratified sampling, which must now contain further components that account for the estimation error in the determination of strata sizes.
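The second-phase selection described above can be sketched as follows. This is a minimal illustration assuming simple random sampling within each stratum; the function name `stratified_subsample`, the stratum labels and the sample sizes are hypothetical.

```python
import random
from collections import defaultdict


def stratified_subsample(first_phase, n_per_stratum, seed=None):
    """Draw the second-phase sample as a stratified sub-sample.

    first_phase   : list of (sample_id, stratum_label) from the first phase
    n_per_stratum : dict mapping stratum_label -> second-phase sample size
    Returns a dict mapping stratum_label -> list of selected sample ids.
    """
    rng = random.Random(seed)

    # Group the first-phase samples by the stratum observed in phase one.
    ids_by_stratum = defaultdict(list)
    for sample_id, stratum in first_phase:
        ids_by_stratum[stratum].append(sample_id)

    # Within each stratum, select a simple random sub-sample.
    # rng.sample raises ValueError if more samples are requested
    # than the first phase provides in that stratum.
    return {
        stratum: rng.sample(ids, n_per_stratum[stratum])
        for stratum, ids in ids_by_stratum.items()
    }


# Hypothetical first phase: 30 points classified as "dense" or "open".
first_phase = [(i, "dense" if i % 3 == 0 else "open") for i in range(30)]
second_phase = stratified_subsample(first_phase, {"dense": 3, "open": 5}, seed=0)
```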


In double sampling for stratification, strata sizes need not be known before sampling starts. In many cases, the number and type of strata are defined beforehand, but even that can be done during the analysis of the first phase: if, for example, an open forest is to be stratified according to crown cover, one could observe crown cover on the first-phase samples and then decide in the analysis (when the frequency distribution of crown cover values is known) how many strata to distinguish and which crown cover thresholds to use.
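One way to derive thresholds from the first-phase frequency distribution is sketched below. The text does not prescribe a rule; using quantiles of the observed crown cover values so that the strata receive roughly equal numbers of samples is an assumption made for this example only, and `crown_cover_strata` is a hypothetical name.

```python
import bisect
import statistics


def crown_cover_strata(crown_cover, n_strata):
    """Derive strata from first-phase crown cover observations.

    Assumption for illustration: thresholds are the quantiles that split
    the observed values into n_strata groups of roughly equal size.
    Returns (thresholds, labels), with labels in 0 .. n_strata - 1.
    """
    # statistics.quantiles returns n_strata - 1 cut points.
    thresholds = statistics.quantiles(crown_cover, n=n_strata)

    # A value equal to a threshold is assigned to the upper stratum.
    labels = [bisect.bisect_right(thresholds, c) for c in crown_cover]
    return thresholds, labels


# Hypothetical crown cover percentages observed on 100 first-phase points.
observed = list(range(1, 101))
thresholds, labels = crown_cover_strata(observed, 4)
```

In practice the thresholds may just as well be set by expert judgement after inspecting the distribution; the labels then serve as the categorical ancillary variable of the first phase.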


Notation

Notation in double sampling for stratification resembles that for stratified random sampling, but must reflect the two-phase design:

\(L\,\) Number of strata;
\(n'\,\) Total number of samples in the first phase;
\(n'_{h}\,\) Number of first-phase samples in stratum h;
\(w'_{h}\,\) Weight of stratum h, estimated from the first-phase sample;
\(\bar y_h\) Estimated mean of the target variable Y in stratum h;
\(\bar y\) Estimated mean of the target variable Y for the entire area of interest;
\(s^2_{h}\) Estimated variance of the target variable Y within stratum h.
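
The notation can be tied together with a sketch of the point estimate. The estimators themselves are not given in this revision of the text; the sketch assumes the usual form for double sampling for stratification, \(\bar y = \sum_{h=1}^{L} w'_{h} \bar y_h\) with \(w'_{h} = n'_{h}/n'\). The function name `dss_mean` and the data layout are hypothetical, and the variance estimator is deliberately left out.

```python
from collections import Counter, defaultdict


def dss_mean(first_phase_strata, second_phase_obs):
    """Point estimate of the overall mean under the assumed DSS estimator.

    first_phase_strata : stratum label of every first-phase sample (length n')
    second_phase_obs   : (stratum_label, y) pairs from the second phase
    """
    n_first = len(first_phase_strata)        # n'
    n_first_h = Counter(first_phase_strata)  # n'_h per stratum

    # Collect the second-phase observations of the target variable by stratum.
    y_by_stratum = defaultdict(list)
    for stratum, y in second_phase_obs:
        y_by_stratum[stratum].append(y)

    y_bar = 0.0
    for stratum, count in n_first_h.items():
        if stratum not in y_by_stratum:
            raise ValueError(f"no second-phase observations in stratum {stratum!r}")
        w_h = count / n_first  # estimated stratum weight w'_h
        y_bar_h = sum(y_by_stratum[stratum]) / len(y_by_stratum[stratum])
        y_bar += w_h * y_bar_h
    return y_bar


# Hypothetical data: 10 first-phase points, 3 second-phase measurements.
strata = ["a"] * 6 + ["b"] * 4
observations = [("a", 10.0), ("a", 20.0), ("b", 40.0)]
overall_mean = dss_mean(strata, observations)
```

Here the estimated weights are 0.6 and 0.4 and the stratum means are 15 and 40, so the estimate is a weighted mean of the two stratum means.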