Resource assessment exercises: estimating a proportion
This section is still under construction! This article was last modified on 06/23/2014. If you have comments please use the Discussion page or contribute to the article!
As mentioned in the introduction, there are two tree species in the example population: beech and oak. Suppose we would like to estimate the proportion of beech trees, again from a sample of \(n=50\) trees.
s.species <- sample(trees$species, size = 50)
s.species
##  [1] 2 1 2 1 2 2 2 2 2 2 2 1 2 2 2 1 2 2 2 2 2 2 2 1 1 2 1 2 2 1 2 1 1 1 1 1 1 2
## [39] 2 2 2 2 2 2 2 2 2 1 1 2
The 1s are oak trees and the 2s are beech trees. The proportion of beech trees is estimated by
\( \hat{p}=\frac{\sum_{i\in s} y_i}{n}\quad\text{where}\quad y_i = \begin{cases} 1 & \quad \text{if tree } i \text{ is a beech tree}\\ 0 & \quad \text{otherwise} \end{cases} \)
In R:
n.beech <- length(s.species[s.species == 2])
n.beech
## [1] 34
p <- n.beech/n
p
## [1] 0.68
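The estimator can also be written directly in terms of the indicator variable \(y_i\) from the formula. The following is a minimal sketch, not part of the original exercise: it re-enters the 50 sampled species codes printed above by hand (so it runs without the `trees` data set), and the name `y` is introduced here for illustration only.

```r
# Sampled species codes, copied from the output above (1 = oak, 2 = beech)
s.species <- c(2, 1, 2, 1, 2, 2, 2, 2, 2, 2,
               2, 1, 2, 2, 2, 1, 2, 2, 2, 2,
               2, 2, 2, 1, 1, 2, 1, 2, 2, 1,
               2, 1, 1, 1, 1, 1, 1, 2, 2, 2,
               2, 2, 2, 2, 2, 2, 2, 1, 1, 2)
n <- length(s.species)

# Indicator variable: 1 if the tree is a beech, 0 otherwise
y <- as.integer(s.species == 2)

# Estimated proportion: sum of the indicators divided by the sample size
p <- sum(y) / n
p
## [1] 0.68

# The same value is the sample mean of the indicator
mean(y)
```

Because \(\hat{p}\) is simply the mean of a 0/1 variable, estimating a proportion is a special case of estimating a mean.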
We estimate that 68% of the trees in the population are beech trees. The estimated proportion of oak trees in the population is \(\hat{q}=1-\hat{p}\). The standard error of the estimated proportion \(\hat{p}\) is given by
\( s_{\hat{p}}=\sqrt{\frac{\hat{p}\times\hat{q}}{n-1}} \)
In R:
q <- 1 - p
sqrt((p * q)/(n - 1))
## [1] 0.06664
Confidence intervals can be constructed in the same way as for the mean estimator above.
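As a hedged sketch of what such an interval might look like: the block below assumes that the interval for the mean estimator was built from a quantile of the t-distribution with \(n-1\) degrees of freedom, which is not shown in this section. The names `se`, `t.val` and `ci` are introduced here for illustration, and `p` and `n` are re-entered from the results above.

```r
# Values taken from the example above
n <- 50
p <- 0.68

# Standard error of the estimated proportion
se <- sqrt(p * (1 - p) / (n - 1))

# Assumption: a two-sided 95% interval based on the t-distribution
t.val <- qt(0.975, df = n - 1)

ci <- c(lower = p - t.val * se, upper = p + t.val * se)
ci
```

Under these assumptions the interval runs from roughly 0.55 to 0.81. With small samples or proportions close to 0 or 1, an interval of this form can be a poor approximation.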
Since we know the species of each tree in the population, we can calculate the true proportion of beech trees:
nrow(trees[trees$species == 2, ])/nrow(trees)  # true proportion of beech trees
## [1] 0.6702
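A single sample gives only one estimate. To see how the estimates scatter around the true proportion, the sampling can be repeated many times. The sketch below does not use the `trees` data set. Instead it builds a hypothetical population of 10000 trees (an assumed size, chosen only so that the share of beech trees is exactly 0.6702), and the number of repetitions (2000) and the seed are likewise arbitrary choices.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical population: 3298 oaks (1) and 6702 beeches (2)
pop <- rep(c(1, 2), times = c(3298, 6702))
true.p <- mean(pop == 2)

# Draw 2000 samples of size n = 50 without replacement and
# estimate the proportion of beech trees in each of them
n <- 50
p.hat <- replicate(2000, mean(sample(pop, size = n) == 2))

mean(p.hat)  # average of the estimates, to compare with true.p
sd(p.hat)    # empirical spread, to compare with the standard error
```

If the estimator behaves as expected, the average of the estimates should lie close to the true proportion, and their standard deviation should be of a similar size to the standard error computed from a single sample.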