Simple random sampling

From AWF-Wiki
Random selection (SRS)

Simple random selection requires that the sampling elements are selected independently and at random. Randomization is a component of the sampling design. The estimators for simple random sampling are unbiased if the selection was done at random. Such an estimator is called design-unbiased, because its unbiasedness comes from the sampling design. No assumptions about the population are needed, as the estimator is unbiased regardless of the structure of the population of interest.

It should be noted that a lack of randomization cannot be compensated for by increasing the sample size!

Randomization is one of the most important prerequisites in the class of design-based sampling (as opposed to model-based sampling, where validity comes from the assumed model and randomization is not strictly necessary). However, the spatial structure of the population does affect the precision of the estimates.

Random selection is an essential component of all design-based sampling. In addition, it is the basis for all statistical inference and testing. SRS is easy to implement as long as there is an explicit sampling frame (a list or a map) or known sampling units. Mistakes are frequently made because the term random (equal chance) is confused with haphazard (without any pattern) or with arbitrary (do whatever you wish …).

Randomization follows very clear rules, equal selection probabilities being the core property. It is hardly possible to simulate a random selection on a map by closing the eyes and pointing at the map, because there is no guarantee that, when this is repeated very often, all points are really selected with equal frequency.

The random numbers used for randomization are generated by software called a "random number generator"; this is a science in itself. If randomization has been applied in a study, it is good practice to also report how it was carried out.
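Random selection from an explicit sampling frame, with the procedure reported so that it can be repeated, might be sketched as follows. This is a minimal illustration, not a prescribed procedure: the frame of 100 numbered units, the sample size, the seed value and the function name `draw_srs` are all assumptions made for the example. Python's standard `random` module serves as the random number generator.

```python
import random


def draw_srs(frame, n, seed):
    """Draw a simple random sample of size n without replacement.

    frame: list of sampling unit identifiers (the sampling frame)
    seed:  value used to initialise the generator; reporting it
           makes the selection reproducible
    """
    rng = random.Random(seed)      # dedicated generator with a known seed
    return rng.sample(frame, n)    # every subset of size n is equally likely


# Hypothetical sampling frame: units numbered 1 to 100.
frame = list(range(1, 101))
sample = draw_srs(frame, n=10, seed=20131026)
print(sorted(sample))
```

Reporting the generator used and the seed is one way to document how the randomization was carried out.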
  
  
  
Latest revision as of 13:35, 26 October 2013

Simple random sampling (SRS) is the basic theoretical sampling technique. The sampling elements are selected as an independent random sample from the population. Each element of the population has the same probability of being selected and, likewise, each combination of n sampling elements has the same probability of being selected.

Every possible combination of sampling units from the population has an equal and independent chance of being in the sample.
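The equal-probability property can be checked by brute force on a small population. The sketch below enumerates every possible sample of size n and counts how often each element appears; the five labelled elements and the function name `inclusion_probabilities` are assumptions made purely for illustration.

```python
from fractions import Fraction
from itertools import combinations


def inclusion_probabilities(population, n):
    """Share of all possible size-n samples that contain each element."""
    samples = list(combinations(population, n))  # all equally likely under SRS
    total = len(samples)
    return {
        element: Fraction(sum(element in s for s in samples), total)
        for element in population
    }


# Hypothetical population of N = 5 elements, sample size n = 2.
population = ["A", "B", "C", "D", "E"]
probabilities = inclusion_probabilities(population, 2)
for element, p in probabilities.items():
    print(element, p)  # each element appears in 4 of the 10 samples: 2/5 = n/N
```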

Simple random sampling is introduced here and in sampling textbooks mainly because it is a very instructive way to learn about sampling; many of the underlying concepts can be explained well with it. However, it is rarely applied in forest inventories because various other sampling techniques are more efficient for the same sampling effort[1].

For information about how exactly sampling units are chosen, see Random selection.

Notations

{| class="wikitable"
|-
! Statistic
! Parametric value
! Sample-based estimator
|-
| Mean
| \(\mu = \frac{\sum_{i=1}^N y_i}{N}\)
| \(\bar{y} = \frac{\sum_{i=1}^n y_i}{n}\)
|-
| Variance
| \(\sigma^2 = \frac{\sum_{i=1}^N (y_i - \mu)^2}{N}\)
| \(S_y^2 = \frac{\sum_{i=1}^n (y_i - \bar{y})^2}{n-1}\)
|-
| Standard deviation
| \(\sigma = \sqrt{\frac{\sum_{i=1}^N (y_i - \mu)^2}{N}}\)
| \(S_y = \sqrt{\frac{\sum_{i=1}^n (y_i - \bar{y})^2}{n-1}}\)
|-
| Standard error (without replacement or from a finite population)
| \(\sigma_{\bar{y}} = \sqrt{\frac{N-n}{N-1}} \cdot \frac{\sigma}{\sqrt{n}}\)
| \(S_{\bar{y}} = \sqrt{\frac{N-n}{N}} \cdot \frac{S_y}{\sqrt{n}}\)
|-
| Standard error (with replacement or from an infinite population)
| \(\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}\)
| \(S_{\bar{y}} = \frac{S_y}{\sqrt{n}}\)
|}


where

\(N\): number of sampling elements in the population (= population size);
\(n\): number of sampling elements in the sample (= sample size);
\(y_i\): observed value of the i-th sampling element;
\(\mu\): parametric mean of the population;
\(\bar{y}\): estimated mean;
\(\sigma\): parametric standard deviation of the population;
\(S_y\): estimated standard deviation of the population;
\(\sigma^2\): parametric variance of the population;
\(S_y^2\): estimated variance of the population;
\(\sigma_{\bar{y}}\): parametric standard error of the mean;
\(S_{\bar{y}}\): estimated standard error of the mean.
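The sample-based estimators in the table can be computed directly from a list of observations. The following is a minimal sketch; the function name `srs_estimates`, the five observed values and the population size of 50 are assumptions chosen only to illustrate the formulas.

```python
import math


def srs_estimates(values, N=None):
    """Sample-based SRS estimators for a list of observed values.

    If the population size N is given, the standard error includes the
    finite population correction sqrt((N - n) / N); otherwise the
    infinite-population (with replacement) form is used.
    """
    n = len(values)
    mean = sum(values) / n                                   # estimated mean
    variance = sum((y - mean) ** 2 for y in values) / (n - 1)  # estimated variance
    sd = math.sqrt(variance)                                 # estimated standard deviation
    se = sd / math.sqrt(n)                                   # standard error of the mean
    if N is not None:
        se *= math.sqrt((N - n) / N)                         # finite population correction
    return mean, variance, sd, se


# Hypothetical observations from n = 5 sampling elements.
values = [12.0, 15.0, 11.0, 14.0, 13.0]

mean, variance, sd, se_infinite = srs_estimates(values)
_, _, _, se_finite = srs_estimates(values, N=50)
print(mean, variance, sd, se_infinite, se_finite)
```

The finite population correction is always smaller than 1 for n > 0, so the standard error for sampling without replacement is never larger than the one for sampling with replacement.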



Exercise: Simple random sampling examples (2 exercises for this topic)

References

  1. Kleinn, C. 2007. Lecture Notes for the Teaching Module Forest Inventory. Department of Forest Inventory and Remote Sensing, Faculty of Forest Science and Forest Ecology, Georg-August-Universität Göttingen. 164 pp.