Resource assessment exercises: estimating a proportion

From AWF-Wiki
Latest revision as of 14:59, 23 June 2014

This article is part of the Resource assessment exercises. See the category page for a (chronological) table of contents.

As mentioned in the introduction, there are two tree species in the example population: beech trees and oak trees. Suppose we would like to estimate the proportion of beech trees by again drawing a sample of \(n=50\) trees.

s.species <- sample(trees$species, size = 50)
s.species

##  [1] 2 1 2 1 2 2 2 2 2 2 2 1 2 2 2 1 2 2 2 2 2 2 2 1 1 2 1 2 2 1 2 1 1 1 1 1 1 2
## [39] 2 2 2 2 2 2 2 2 2 1 1 2


The 1s are oak trees and the 2s are beech trees. The proportion of beech trees is estimated by


$\hat{p}=\frac{\sum_{i\in s} y_i}{n}\quad\text{where}\quad y_i = \left\{ \begin{array}{l l} 1 & \quad \text{if } i \text{ is a beech tree}\\ 0 & \quad \text{otherwise} \end{array} \right.$ (1)


In R:

    

n.beech <- length(s.species[s.species == 2])
n.beech

## [1] 34
  
n <- length(s.species) # sample size, here 50
p <- n.beech/n
p

## [1] 0.68
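
The estimator is the sample mean of the 0/1 indicator \(y_i\), so the same value can also be obtained from the indicator directly. The following is a minimal sketch: the sample values are copied from the output above so that the block runs on its own, and the name y is introduced here only for illustration.

```r
# Sample values copied from the printed output above (1 = oak, 2 = beech)
s.species <- c(2, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2,
               2, 2, 2, 2, 1, 1, 2, 1, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1, 2,
               2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2)

# Indicator variable: 1 if the tree is a beech, 0 otherwise
y <- as.numeric(s.species == 2)

# Estimated proportion: sum of the indicator divided by the sample size
p <- sum(y) / length(y)
p

# Equivalent shortcut: the mean of a logical vector is the share of TRUE values
mean(s.species == 2)
```

Both expressions give 0.68 for this sample, matching the result above.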


Indexing using logical operators
The double equal sign == tests each entry for equality and returns a vector of TRUE and FALSE values. Used inside square brackets, this vector selects the entries for which the condition is true. Above we select all entries of s.species for which s.species == 2 is true. Other operators include != (not equal to), < (less than), > (greater than), <= (less than or equal to), >= (greater than or equal to).
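
As a small illustration of logical indexing, the sketch below uses a short hypothetical vector x (not part of the exercise data) with the same species coding.

```r
# Hypothetical species codes for five trees (1 = oak, 2 = beech)
x <- c(2, 1, 2, 2, 1)

x == 2        # logical vector: TRUE FALSE TRUE TRUE FALSE
x[x == 2]     # keeps only the entries equal to 2: 2 2 2
x[x != 2]     # keeps only the entries not equal to 2: 1 1

# TRUE counts as 1 when summed, so this counts the beech trees directly
n.beech.x <- sum(x == 2)
n.beech.x     # 3
```

Counting with sum(x == 2) gives the same result as length(x[x == 2]); which one to use is a matter of taste.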

We estimate that 68% of the trees in the population are beech trees. The estimated proportion of oak trees in the population is \(\hat{q}=1-\hat{p}\), here 0.32. The standard error of the estimated proportion \(\hat{p}\) is given by


$s_{\hat{p}}=\sqrt{\frac{\hat{p}\times\hat{q}}{n-1}}$ (2)


In R:         

q <- 1 - p
sqrt((p * q)/(n - 1))

## [1] 0.06664

Confidence intervals can be constructed in the same way as for the mean estimator above.
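
A possible sketch of such an interval is given below. It assumes a 95% confidence level and a t-quantile with \(n-1\) degrees of freedom; whether this matches the construction used for the mean estimator is an assumption, and the names se, t.val and ci are chosen here for illustration.

```r
# Values taken from the sample in this article
n <- 50
p <- 34 / n                          # estimated proportion of beech trees
se <- sqrt(p * (1 - p) / (n - 1))    # standard error of the estimated proportion

# Assumption: 95% interval using the t-distribution with n - 1 degrees of freedom
t.val <- qt(0.975, df = n - 1)

# Lower and upper limit of the confidence interval
ci <- c(lower = p - t.val * se, upper = p + t.val * se)
ci
```

For this sample the interval runs from roughly 0.55 to 0.81.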

Since we know the species of each tree in the population, we can calculate the true proportion of beech trees and compare it with our estimate:

nrow(trees[trees$species == 2, ])/nrow(trees) # true proportion of beech trees

## [1] 0.6702
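
A single estimate will rarely hit the true value exactly, but over many samples the estimates scatter around it. The sketch below illustrates this with repeated sampling. It uses a hypothetical population of 1000 trees (330 oak, 670 beech), not the trees data set of the exercises, and an arbitrary seed; pop.species, p.true and p.hats are names introduced here.

```r
# Hypothetical population (NOT the exercise data): 330 oak (1), 670 beech (2)
set.seed(1)  # arbitrary seed, only to make the run reproducible
pop.species <- rep(c(1, 2), times = c(330, 670))
p.true <- mean(pop.species == 2)  # true proportion, 0.67 by construction

# Draw 1000 samples of size n = 50 without replacement
# and estimate the proportion of beech trees in each of them
n <- 50
p.hats <- replicate(1000, mean(sample(pop.species, size = n) == 2))

mean(p.hats)  # average of the estimates, expected to lie close to p.true
sd(p.hats)    # empirical spread of the estimates across samples
```

The spread of the estimates, sd(p.hats), can be compared with the standard error computed from a single sample.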

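Related articles

* Previous article: Required sample size determination
* Next article: Resource assessment exercises: basic statistics additional exercises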
