Statistical sampling

From AWF-Wiki

Only statistical sampling provides methodologically sound (and defensible!) estimates. Statistical sampling essentially means probabilistic sampling, which in turn means that the selection probabilities of all selected elements are known (and that every element of the population has a selection probability greater than zero).
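
As a minimal sketch of why known selection probabilities matter, the following Python example draws a simple random sample without replacement and records the inclusion probability n/N of each selected element. It then weights each observation by the inverse of that probability to estimate the population total. This weighting is the Horvitz-Thompson estimator, which is chosen here for illustration and is not named in the text above. The tree volumes and all function names are hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement.

    Under simple random sampling every element has the same known
    inclusion probability n/N, which is returned with each value.
    """
    rng = random.Random(seed)
    N = len(population)
    indices = rng.sample(range(N), n)
    pi = n / N  # known inclusion probability of every element
    return [(population[i], pi) for i in indices]


def horvitz_thompson_total(sample):
    """Estimate the population total by weighting each value with 1/pi."""
    return sum(y / pi for y, pi in sample)


# Hypothetical population: stem volumes of 10 trees (arbitrary units).
volumes = [1.2, 0.8, 2.5, 1.9, 0.4, 3.1, 1.0, 2.2, 0.7, 1.6]

sample = simple_random_sample(volumes, n=4, seed=1)
estimate = horvitz_thompson_total(sample)
print(f"estimated total: {estimate:.2f}, true total: {sum(volumes):.2f}")
```

Without the value of `pi` the weighting step cannot be carried out, which is the practical reason why the selection probabilities have to be known.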

In design-based sampling, which is the class of sampling techniques dealt with here, randomization is the only generally accepted selection “philosophy”. The concept was introduced by Neyman and/or Sir R. A. Fisher in the 1920s and 1930s.

Concepts like “fairness”, “objectivity” or “representativeness” should not be used as a general basis for sample selection. Neither subjective selection nor guided or arbitrary selection has a place in statistical sampling, because these selection principles do not allow the inclusion probabilities of the selected elements to be determined. Statistical analysis techniques should not be applied to such samples.
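
The difference between randomized and subjective selection can be illustrated with a small sketch. For a tiny hypothetical population, the Python example below enumerates every possible simple random sample of size 2 and averages the resulting sample means; this average equals the population mean, which is what makes the estimator design-unbiased. A subjective rule, here assumed to be "always pick the two largest elements", yields a single fixed answer that misses the true mean. The population values and the subjective rule are assumptions made only for this illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of five element values.
population = [2, 4, 6, 8, 10]
n = 2

# Under simple random sampling every subset of size n is equally likely,
# so the expected value of the sample mean is the plain average of the
# sample means over all possible samples.
all_samples = list(combinations(population, n))
sample_means = [mean(s) for s in all_samples]
expected_estimate = mean(sample_means)

# Assumed subjective rule: always select the n largest elements.
purposive_estimate = mean(sorted(population)[-n:])

print("population mean:          ", mean(population))
print("expected SRS estimate:    ", expected_estimate)
print("subjective-rule estimate: ", purposive_estimate)
```

In this toy case the bias of the subjective rule is visible only because the whole population is known. In a real study it is not, and without inclusion probabilities the bias can be neither quantified nor corrected.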

This is by no means to say that case studies with subjectively selected samples are useless. On the contrary, many interesting and useful results and hypotheses can be derived from them. These hypotheses, however, then need to be confirmed by statistically sound studies, and statistical estimation or inference must not be applied to subjectively selected sampling elements.

Regarding the term “representative”, which is frequently used in the context of sampling, a series of four papers by Kruskal and Mosteller (1979, 1980[1][2][3][4]) is very instructive. These papers deal exclusively with the different uses of the term “representative” in sampling. The authors found that it is mainly used in the following meanings:

  • general acclaim for data,
  • absence of selective forces,
  • miniature of the population,
  • typical (or ideal) cases,
  • coverage of the population,
  • vague term – to be made precise,
  • representative sampling as a specific sampling method,
  • representative sampling as permitting good estimation and
  • representative sampling as good enough for a particular purpose.

In conclusion, one should avoid the multi-use term “representative” in the context of statistical sampling, because it has no clearly defined meaning and everybody imagines something different. It is much better to name the sampling design explicitly, for example “simple random sampling” or “systematic sampling”, because then everybody gets a clear idea of the sampling design actually applied.
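
To show how precisely a named design pins down the selection procedure, here is a minimal Python sketch of the two designs mentioned above. It assumes a hypothetical list frame of N numbered elements; for systematic sampling it assumes a fixed sampling interval k with a random start, so that every element has inclusion probability 1/k. The function names and parameters are illustrative only.

```python
import random


def simple_random_sampling(N, n, rng):
    """Select n of the N frame positions, each subset being equally likely."""
    return sorted(rng.sample(range(N), n))


def systematic_sampling(N, k, rng):
    """Select every k-th frame position after a random start in 0..k-1.

    The random start is the only randomization step; it gives every
    position an inclusion probability of 1/k.
    """
    start = rng.randrange(k)
    return list(range(start, N, k))


rng = random.Random(42)  # fixed seed only to make the sketch repeatable
srs_positions = simple_random_sampling(N=20, n=4, rng=rng)
sys_positions = systematic_sampling(N=20, k=5, rng=rng)

print("simple random sampling:", srs_positions)
print("systematic sampling:   ", sys_positions)
```

Each design name corresponds to a concrete, reproducible selection rule with known inclusion probabilities, which the word “representative” does not provide.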

References

  1. Kruskal W. and F. Mosteller 1979. Representative sampling, I: Nonscientific literature. International Statistical Review 47:13-24.
  2. Kruskal W. and F. Mosteller 1979. Representative sampling, II: Scientific literature, excluding statistics. International Statistical Review 47:111-127.
  3. Kruskal W. and F. Mosteller 1979. Representative sampling, III: The current statistical literature. International Statistical Review 47:245-265.
  4. Kruskal W. and F. Mosteller 1980. Representative sampling, IV: The history of the concepts in statistics, 1895-1939. International Statistical Review 48:169-195.
