Resource assessment exercises: finite population correction

Latest revision as of 11:11, 23 June 2014

This article is part of the Resource assessment exercises. See the category page for a (chronological) table of contents.

Finite population correction

If we observe all values $y_{i\in U}$, we speak of a census. The mean is then calculated, not estimated, i.e.,

N <- length(trees$dbh) # population size
SRSwoR <- sample(trees$dbh, size=N) # SRSwoR of size n = N, i.e., a census
mean(SRSwoR)

## [1] 21.05

mean(trees$dbh)

## [1] 21.05

As noted before, SRSwoR stands for simple random sampling without replacement (woR). However, if we take a sample of the same size with replacement (SRS; set replace = TRUE), some trees are drawn more than once and others not at all, so we get a slightly different value. This would be an estimate.

SRS <- sample(trees$dbh, size = N, replace = TRUE)
mean(SRS)

## [1] 20.96
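
Why the with-replacement mean is only an estimate can be seen by counting how many distinct units end up in the sample. The following is a minimal sketch, not part of the original exercise: the vector `pop`, the population size of 30000 and the gamma parameters are assumptions made for illustration and merely stand in for trees$dbh.

```r
set.seed(1)                                  # arbitrary seed, for reproducibility only
N   <- 30000                                 # assumed population size
pop <- rgamma(N, shape = 2.7, scale = 7.8)   # hypothetical dbh values (cm)

idx_wor <- sample(N, size = N)                  # without replacement: a permutation
idx_wr  <- sample(N, size = N, replace = TRUE)  # with replacement: repeats occur

frac_wor <- length(unique(idx_wor)) / N   # share of distinct units, woR
frac_wr  <- length(unique(idx_wr)) / N    # share of distinct units, with replacement
c(woR = frac_wor, wR = frac_wr)

all.equal(mean(pop[idx_wor]), mean(pop))  # census mean equals the population mean
mean(pop[idx_wr]) - mean(pop)             # small, but not zero
```

Without replacement every unit is observed exactly once. With replacement the expected share of distinct units is $1-(1-1/N)^N$, which is close to $1-e^{-1}\approx 0.63$ for large $N$, so roughly a third of the population is never observed.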

If we are interested in a population parameter and take an SRSwoR of size $n=N$, we get the true population value. There is no doubt about its value (assuming that measurement errors are absent). However, if we estimate the standard error $s_{\bar{y}}$ for the $n=N$ sample, we get a positive value instead of zero.

sd(SRSwoR)/sqrt(N)

## [1] 0.07416

This cannot be! The estimator of the standard error used here holds for sampling with replacement. For sampling without replacement we have to correct for the fact that we observed a relatively large part of the population. The finite population correction (fpc) for a relatively large sample is defined as

$\text{fpc}=1-\frac{n}{N}. \qquad (1)$

Obviously, if $n=N$ the fpc becomes zero, and so does the corrected standard error. Suppose we take a sample of size $n=25,000$ from trees ($N=30,000$), then

S25k <- sample(trees$dbh, size = 25000)
sd(S25k)/sqrt(25000) # without fpc

## [1] 0.08099

fpc <- 1 - 25000/30000 # 1 - n/N with n = 25000, N = 30000
sqrt(var(S25k)/25000 * fpc) # with fpc

## [1] 0.03307
  
sd(S25k)/sqrt(25000) * sqrt(fpc)

## [1] 0.03307
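
The effect of the fpc can be checked by simulation. The sketch below is not part of the original exercise: it assumes a synthetic population `pop` of $N=30,000$ gamma-distributed values (the distribution and its parameters are made up for illustration and stand in for trees$dbh), draws repeated SRSwoR of size $n=25,000$, and compares the empirical standard deviation of the sample means with the uncorrected and the fpc-corrected estimate from a single sample.

```r
set.seed(42)                                 # arbitrary seed, for reproducibility only
N   <- 30000                                 # assumed population size
n   <- 25000                                 # sample size used in the text
pop <- rgamma(N, shape = 2.7, scale = 7.8)   # hypothetical dbh values (cm)

# Empirical standard error: spread of the mean over repeated SRSwoR draws
means        <- replicate(500, mean(sample(pop, size = n)))
se_empirical <- sd(means)

# Estimated standard errors from one single sample
S        <- sample(pop, size = n)
se_naive <- sd(S) / sqrt(n)               # without fpc
se_fpc   <- se_naive * sqrt(1 - n / N)    # with fpc

c(empirical = se_empirical, naive = se_naive, fpc = se_fpc)
```

The fpc-corrected value should lie close to the empirical one, whereas the uncorrected value is too large by a factor of $1/\sqrt{1-n/N}=\sqrt{6}\approx 2.45$ for this sampling fraction.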

For the parametric standard error the fpc becomes,

$\text{fpc}=\frac{N-n}{N-1}. \qquad (2)$
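
The two forms of the fpc are consistent with each other. The sketch below assumes that "parametric" refers to the population variance $\sigma^2$ computed with divisor $N$, whereas var() in R uses divisor $N-1$. The population `pop` is again synthetic and its parameters are made up for illustration.

```r
set.seed(7)                                  # arbitrary seed, for reproducibility only
N   <- 30000                                 # assumed population size
n   <- 25000                                 # sample size
pop <- rgamma(N, shape = 2.7, scale = 7.8)   # hypothetical dbh values (cm)

sigma2 <- mean((pop - mean(pop))^2)   # population variance, divisor N
S2     <- var(pop)                    # variance with divisor N - 1

se_param <- sqrt(sigma2 / n * (N - n) / (N - 1))   # fpc = (N - n)/(N - 1)
se_alt   <- sqrt(S2 / n * (1 - n / N))             # fpc = 1 - n/N

all.equal(se_param, se_alt)
```

Both expressions agree because $\sigma^2=\frac{N-1}{N}S^2$, so the factor $N-1$ cancels.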

As a rule of thumb, we apply the fpc when the sampling fraction

$f=\frac{n}{N} \qquad (3)$

exceeds 0.05, i.e., 5 percent.
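
The rule of thumb can be wrapped in a small helper. The function `se_mean()` and its `threshold` argument are hypothetical names introduced here for illustration, and the values in `y` are made up. This is a sketch under these assumptions.

```r
# Hypothetical helper: estimated standard error of the mean for an SRSwoR,
# applying the fpc only if the sampling fraction exceeds the threshold.
se_mean <- function(y, N, threshold = 0.05) {
  n <- length(y)
  stopifnot(n >= 2, n <= N)
  f  <- n / N                                 # sampling fraction
  se <- sd(y) / sqrt(n)                       # standard error without fpc
  if (f > threshold) se <- se * sqrt(1 - f)   # apply fpc = 1 - n/N
  se
}

y <- c(12, 35, 18, 27, 9, 41, 22, 16, 30, 14)  # made-up dbh values (cm)

se_small_f <- se_mean(y, N = 30000)  # f is far below 0.05: no correction
se_large_f <- se_mean(y, N = 20)     # f = 0.5: corrected by sqrt(0.5)
c(small_f = se_small_f, large_f = se_large_f)
```

For small sampling fractions the correction factor is close to one, which is why it is commonly neglected below the 5 percent threshold.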

Related articles

Previous article: Standard error and confidence intervals
Next article: Required sample size determination

