Estimating forest area
General observations
Area estimation is one of the principal results of large-area forest inventories. Estimates can be produced for different categories, such as forest and non-forest or different forest types. Forest area estimates are also a crucial input to any quantification of “deforestation”. In every case a clear definition of “forest” is required. There are at least two basic approaches to determining forest area:
- delineation (mapping) from remote sensing imagery, or
- sampling (either in the field or from remote sensing imagery).
Mapping can be done either automatically (where suitable algorithms are available) or manually. Various sources of error are associated with this exercise, but the result is a map that shows the spatial arrangement of the forest patches. Sampling is a different approach that produces an overall estimate, but not a map. Its advantage is that, if statistical sampling has been used, the standard error is easily quantified and gives information about the reliability of the estimate.
Results of area estimation are given either in absolute terms (\(km^2\) or \(ha\)) or in relative terms (%). Likewise, precision statements can be given either in absolute terms (standard error, confidence interval) or in relative terms (relative standard error, percent confidence interval). If a percent forest cover estimate is accompanied by a precision statement (for example 40% ± 5%), it is important to state clearly whether the precision is meant in absolute terms, that is in percentage points (in this case 35% ≤ cover ≤ 45%), or in relative terms, that is as a percentage of the estimate (38% ≤ cover ≤ 42%); otherwise, confusion may easily arise.
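The difference between the two readings of 40% ± 5% can be made explicit with a few lines of Python. This is only a minimal sketch; the function names `interval_absolute` and `interval_relative` are chosen here for illustration and do not come from the text.

```python
def interval_absolute(estimate, margin):
    """Margin is in the same units as the estimate (percentage points)."""
    return estimate - margin, estimate + margin


def interval_relative(estimate, margin_percent):
    """Margin is a percentage of the estimate itself."""
    half_width = estimate * margin_percent / 100.0
    return estimate - half_width, estimate + half_width


cover = 40.0   # estimated forest cover in percent
margin = 5.0   # the "± 5%" of the precision statement

abs_interval = interval_absolute(cover, margin)
rel_interval = interval_relative(cover, margin)

print(abs_interval)  # (35.0, 45.0)
print(rel_interval)  # (38.0, 42.0)
```

The same numbers lead to intervals of quite different width, which is why the type of precision statement should always be reported.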
Area estimation can be based on different plot designs that are discussed in what follows: points, lines and plots.
Area estimation by points
Points are dimensionless plots on which only a very limited set of observations can be made: for each area class the observation is either yes or no. Forest area estimation by sample points means that points are selected and for each point it is determined whether it falls in forest or not; this gives the two possible observations \(y=1\) and \(y=0\), respectively. With these observed values, estimation proceeds as for any other variable. Sample points may be selected at random, systematically, in clusters, etc.; in principle all sampling designs are possible. For area estimation, however, the most common approach is a dot grid, that is, a systematic sample. The grid is placed randomly (random starting point and random orientation) over the area of interest and the points that fall into forest are simply counted; if, out of the total sample size \(n\), there are \(n_f\) forest points, then the forest cover proportion is estimated from
\[\hat{p}=\frac{n_f}{n};\,\]
this, actually, derives from the standard estimator of the mean for simple random sampling, assigning the observation \(y=1\) to the \(n_f\) forest points and the observation \(y=0\) to the \(n_n\) non-forest points.
Then
\[\hat{p}=\frac{\sum_{i}y_i}{n}=\frac{1}{n}\left(n_f*1+n_n*0\right)=\frac{n_f}{n}.\,\]
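As a small illustration, the estimator can be written as the mean of the 0/1 observations. The observations and the total area below are made-up values, and `forest_proportion` is a name introduced only for this sketch.

```python
def forest_proportion(observations):
    """Estimate the forest cover proportion as the mean of 0/1 observations."""
    n = len(observations)
    if n == 0:
        raise ValueError("at least one sample point is required")
    return sum(observations) / n


# Hypothetical sample: 1 = point falls in forest, 0 = point falls in non-forest
obs = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]

p_hat = forest_proportion(obs)   # n_f / n = 4 / 10

# Assumed total area of the inventory region, for illustration only
total_area_ha = 2500.0
forest_area_ha = p_hat * total_area_ha

print(p_hat, forest_area_ha)
```

Multiplying the estimated proportion by the known total area converts the relative result into an absolute area estimate.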
As for systematic sampling in general, there is no unbiased estimator of the variance. Therefore, we resort to the estimator framework of simple random sampling. There, we know that we may use the binomial distribution to calculate variances. The binomial distribution describes the outcome of “experiments” or observations of variables that can take on only two values (0 or 1, yes or no, black or white). If \(p\) is the true (parametric) proportion of target elements in the population (in our case: the forest cover), then the parametric population variance is
\[\sigma^2=p*q=p*(1-p)\,\],
where \(q=(1-p)\). The variance, obviously, is a function of the mean p only!
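The identity \(\sigma^2=p*q\) can be checked numerically by applying the general population variance formula to a population of zeros and ones. The population size and the number of forest points below are arbitrary values assumed for this sketch.

```python
def population_variance(values):
    """Population variance: mean squared deviation from the mean (divisor N)."""
    N = len(values)
    mean = sum(values) / N
    return sum((y - mean) ** 2 for y in values) / N


# Hypothetical population of N points, N_f of which lie in forest
N = 1000
N_f = 370
population = [1] * N_f + [0] * (N - N_f)

p = N_f / N
q = 1 - p
sigma2 = population_variance(population)

# Both values agree up to floating point rounding
print(sigma2, p * q)
```

Because the variance depends on \(p\) alone, it is largest for \(p=0.5\) and goes to zero as \(p\) approaches 0 or 1.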
The variance \(\sigma^2=p*q\) derives directly from the formula used for calculating the population variance in general:
\[{\sigma_p}^2=\frac{\sum_{i}\left(y_i-\bar{y}\right)^2}{N},\,\]
where \(\bar{y}\) is the true proportion \(p\) and there are only two possible observations (here \(n_f\) and \(n_n\) denote the counts in the population of \(N\) points): \(y=1\) for forest points, which occur \(n_f\) times, where
\[p=\frac{n_f}{N};\,\]
and \(y=0\) for non-forest points, which occur \(n_n=(N-n_f)\) times, where
\[q=\frac{n_n}{N}=\frac{N-n_f}{N}.\,\]
We rewrite then
\[\sigma^2=\frac{\sum_{i}(y_i-\bar{y})^2}{N}=\frac{\sum_{i}(y_i-p)^2}{N}=\frac{n_f(1-p)^2+n_n(0-p)^2}{N}=\frac{n_f}{N}(1-p)^2+\frac{N-n_f}{N}(0-p)^2=p(1-p)^2+(1-p)p^2=p(1-p)\left[(1-p)+p\right]=p(1-p)=p*q,\,\]
which is the parametric variance. The sample-based estimate of that variance has \((n-1)\) degrees of freedom, so that the sum of squares needs to be divided by \((n-1)\). We may use the same rearrangement as before and account for the different denominator simply by introducing the factor \(n/(n-1)\):
\[s^2=\frac{\sum_{i}(y_i-\hat{p})^2}{n}*\frac{n}{n-1}=\hat{p}*\hat{q}*\frac{n}{n-1}.\,\]
The estimated error variance of \(\hat{p}\) derives as usual from
\[\hat{var}(\hat{p})=\frac{s^2}{n}\,\]
which in this case is
\[\hat{var}(\hat{p})=\frac{\hat{p}\hat{q}}{n-1}.\,\]
Observe that, in contrast to the familiar form \(s^2/n\) of the error variance estimator for simple random sampling, the denominator here is \((n-1)\) and not \(n\); after the derivation above this should no longer be confusing.
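The estimators above can be put together in a short Python sketch. The sample size and the number of forest points are assumed values, and `estimate_proportion` is a name introduced here for illustration. The cross-check against `statistics.variance`, which divides by \((n-1)\), shows that \(s^2\) is the ordinary sample variance of the 0/1 observations.

```python
import math
import statistics


def estimate_proportion(n_f, n):
    """Return p_hat, s^2, the estimated error variance and the standard error."""
    if n < 2:
        raise ValueError("at least two sample points are required")
    p_hat = n_f / n
    q_hat = 1.0 - p_hat
    s2 = p_hat * q_hat * n / (n - 1)   # sample variance of the 0/1 values
    var_p = s2 / n                     # equals p_hat * q_hat / (n - 1)
    se = math.sqrt(var_p)              # standard error of p_hat
    return p_hat, s2, var_p, se


# Hypothetical dot count: 160 of 400 grid points fall into forest
n, n_f = 400, 160
p_hat, s2, var_p, se = estimate_proportion(n_f, n)

# Cross-check: s^2 is the usual sample variance of the observations
observations = [1.0] * n_f + [0.0] * (n - n_f)
s2_check = statistics.variance(observations)

print(p_hat, s2, s2_check, var_p, se)
```

Note that this standard error rests on the simple random sampling framework; for a systematic dot grid it serves only as an approximation, as explained above.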
Area estimation by dimensionless points on a map or on an aerial photograph is usually done as systematic sampling using a dot grid. One may imagine a transparency sheet on which a dot grid is printed and which is placed randomly on the map of interest.
This setting can be used to illustrate the characteristics of systematic sampling. In Figure 1, the forest area is estimated with dot grids of different sizes. The map is shown at the top of Figure 1; it has a side length of 10000 units. The size of the dot grids used for area estimation is measured in these units.
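The effect of the grid size can be explored with a small simulation. The map of Figure 1 is not reproduced here, so the sketch below assumes a purely hypothetical forest (a disc of radius 3000 units in the centre of a square map with side length 10000 units). For simplicity the grid receives only a random starting point; the random orientation mentioned earlier is left out. All names are introduced for this illustration.

```python
import math
import random

MAP_SIDE = 10000.0  # side length of the square map, in map units


def is_forest(x, y):
    """Hypothetical forest: a disc of radius 3000 around the map centre."""
    return math.hypot(x - 5000.0, y - 5000.0) <= 3000.0


def dot_grid_estimate(spacing, rng):
    """Place a square dot grid with a random start and return (p_hat, n)."""
    x0 = rng.random() * spacing   # random offset of the grid in x
    y0 = rng.random() * spacing   # random offset of the grid in y
    n = 0
    n_f = 0
    x = x0
    while x < MAP_SIDE:
        y = y0
        while y < MAP_SIDE:
            n += 1
            if is_forest(x, y):
                n_f += 1
            y += spacing
        x += spacing
    return n_f / n, n


rng = random.Random(42)  # fixed seed so that the run is repeatable
true_p = math.pi * 3000.0 ** 2 / MAP_SIDE ** 2

results = {}
for spacing in (2000, 1000, 500, 250):
    results[spacing] = dot_grid_estimate(spacing, rng)
    p_hat, n = results[spacing]
    print(spacing, n, round(p_hat, 4), round(true_p, 4))
```

Narrower grids yield more sample points, and their estimates tend to lie closer to the true proportion of the assumed forest; repeating the run with other seeds shows how much the estimate varies with the random placement of the grid.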