Resource assessment exercises: finite population correction

From AWF-Wiki
Latest revision as of 11:11, 23 June 2014

This article is part of the Resource assessment exercises. See the category page for a (chronological) table of contents.

Finite population correction

If we observe all values \(y_{i\in U}\) we talk about a census. The mean is calculated, not estimated, i.e.,

# N is the population size, i.e., length(trees$dbh)
SRSwoR <- sample(trees$dbh, size = N)
mean(SRSwoR)

## [1] 21.05

mean(trees$dbh)

## [1] 21.05

As noted before, SRSwoR stands for simple random sampling without replacement (woR). However, if we take a sample with replacement (SRS; set replace = TRUE) we get a slightly different value. This would be an estimate.

SRS <- sample(trees$dbh, size = N, replace = TRUE)
mean(SRS)

## [1] 20.96
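Sampling with replacement draws some units more than once and misses others, which is why its mean is only an estimate even for a sample of size N. The following is a minimal sketch of this effect. It assumes a synthetic stand-in population `pop`, because the `trees` data are not reproduced here; the population size, the gamma parameters and the seed are arbitrary choices for illustration.

```r
set.seed(42)                                   # arbitrary seed, for reproducibility
N   <- 30000                                   # assumed population size
pop <- rgamma(N, shape = 3, scale = 7)         # hypothetical stand-in for trees$dbh

wor <- sample(pop, size = N)                   # without replacement: a permutation
all.equal(mean(wor), mean(pop))                # TRUE: the census mean is reproduced

idx <- sample.int(N, size = N, replace = TRUE) # with replacement, tracking unit indices
wr  <- pop[idx]                                # the resulting sample values
mean(wr) - mean(pop)                           # small, but generally not zero

share_distinct <- length(unique(idx)) / N      # share of distinct units drawn
expected_share <- 1 - (1 - 1/N)^N              # expected share, about 0.632
c(observed = share_distinct, expected = expected_share)
```

Roughly a third of the population is never drawn, so the with-replacement sample is not a census even though it contains N values.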

If we are interested in a population parameter and take an SRSwoR of size \(n=N\), we get the true population value. There is no doubt about its value (assuming that measurement errors are absent). However, if we estimate the standard error \(s_{\bar{y}}\) for the \(n=N\) sample with the usual estimator, we get a positive value instead of zero.

sd(SRSwoR)/sqrt(N)

## [1] 0.07416

This cannot be! The usual estimator of the standard error holds for sampling with replacement. For sampling without replacement we have to correct for the fact that the sample covers a relatively large share of the population. The finite population correction (fpc), which multiplies the estimated variance of the mean, is defined as


$\text{fpc}=1-\frac{n}{N}.$ (1)


Obviously, if \(n=N\) the fpc becomes zero. Suppose we take a sample of size \(n=25,000\) from trees, then

S25k <- sample(trees$dbh, size = 25000)
sd(S25k)/sqrt(25000) # without fpc

## [1] 0.08099

fpc <- 1 - 25000/30000  # 1 - n/N with n = 25000 and N = 30000
sqrt(var(S25k)/25000 * fpc)

## [1] 0.03307
  
sd(S25k)/sqrt(25000) * sqrt(fpc)

## [1] 0.03307
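Whether the corrected value is the right one can be checked by simulation: the standard deviation of many SRSwoR sample means should match the fpc-corrected standard error, not the uncorrected one. The sketch below assumes a synthetic population `pop` in place of `trees$dbh` and a smaller population (N = 3000, n = 2500) with the same sampling fraction to keep the run time short; `se_srswor` is a hypothetical helper name, not a function from any package.

```r
# Standard error of the mean under SRSwoR, using fpc = 1 - n/N (equation 1).
se_srswor <- function(y, N) {
  n <- length(y)
  stopifnot(n >= 2, n <= N)
  sqrt(var(y) / n * (1 - n / N))
}

set.seed(1)                               # arbitrary seed, for reproducibility
N   <- 3000                               # assumed population size
pop <- rgamma(N, shape = 3, scale = 7)    # hypothetical stand-in for trees$dbh
n   <- 2500                               # sampling fraction n/N = 5/6

# Empirical spread of the sample mean over repeated SRSwoR draws
means  <- replicate(2000, mean(sample(pop, size = n)))
se_sim <- sd(means)

s         <- sample(pop, size = n)        # one sample, as in the text
se_naive  <- sd(s) / sqrt(n)              # without fpc: clearly too large
se_fpc    <- se_srswor(s, N)              # with fpc: close to se_sim
se_census <- se_srswor(pop, N)            # n = N: exactly zero

c(simulated = se_sim, naive = se_naive, with_fpc = se_fpc, census = se_census)
```

The helper takes the population size as an explicit argument because the sample alone carries no information about N.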

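For the parametric standard error, i.e., when the population variance \(\sigma^2\) (with divisor \(N\)) is known rather than estimated from the sample, the fpc becomes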


$\text{fpc}=\frac{N-n}{N-1}.$ (2)
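The two forms of the fpc describe the same quantity: equation 2 goes with the population variance (divisor N), equation 1 with the variance that uses divisor N - 1. For a very small population this can be verified exactly by enumerating every possible SRSwoR sample. The values in `pop` below are a hypothetical toy population, chosen only for illustration.

```r
pop <- c(12, 15, 21, 24, 30, 36)   # hypothetical toy population, N = 6
N   <- length(pop)
n   <- 3

# Means of all choose(6, 3) = 20 equally likely SRSwoR samples
all_means <- combn(pop, n, FUN = mean)
exact_var <- mean((all_means - mean(pop))^2)   # exact variance of the sample mean

sigma2 <- mean((pop - mean(pop))^2)            # population variance, divisor N
S2     <- var(pop)                             # variance with divisor N - 1

var_param <- sigma2 / n * (N - n) / (N - 1)    # fpc as in equation 2
var_S2    <- S2 / n * (1 - n / N)              # fpc as in equation 1

c(exact = exact_var, parametric = var_param, with_S2 = var_S2)
```

All three entries agree (13.6 for this toy population), so the choice between the two forms depends only on which variance is plugged in.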


As a rule of thumb, we apply the fpc when the sampling fraction


$f=\frac{n}{N}$ (3)


exceeds 0.05, i.e., 5 percent.
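To get a feeling for this threshold, the effect of the fpc on the standard error can be tabulated for a few sampling fractions. This is plain arithmetic based on equation 1; the chosen values of `f` are arbitrary.

```r
f <- c(0.01, 0.05, 0.10, 0.25, 0.50, 1.00)   # sampling fractions n/N

tab <- data.frame(
  f            = f,
  fpc          = 1 - f,                       # equation 1
  se_factor    = sqrt(1 - f),                 # multiplier on the standard error
  se_reduction = 1 - sqrt(1 - f)              # relative reduction of the standard error
)
tab
```

At a sampling fraction of 0.05 the standard error shrinks by only about 2.5 percent, so below that level ignoring the fpc changes little and errs on the conservative side.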

Related articles

* Previous article: Standard error and confidence intervals
* Next article: Required sample size determination