Approaches to populations of sample plots
Figure 1. Illustration of approach 1 for plot populations: subdivision of the forest area into sample plots of identical shape and size, here square and hexagonal sample plots. Such a subdivision is also possible for rectangles and some types of triangles (Kleinn and Vilčko 2005).
Figure 2. An identical "forest area" subdivided in two different ways into square sample plots of the same basic size. Right: plot fragments occur along the border line. The total "number of stems" is obviously identical in both cases, but this is not the case for the per-plot parametric mean and variance (Kleinn and Vilčko 2005).
Figure 3. Illustration of approaches for plot populations for the same example population. Left: discrete population of square sample plots with defined positions. Right: each point in the area is a sampling element whose value is determined by the surrounding trees (sample plot). Each point has a value, indicated here by the cloud of points; in addition, a trend surface is given (Kleinn and Vilčko 2005).
Latest revision as of 12:26, 14 June 2023
In forest inventory field sampling, the sampling elements that we select and observe are sample plots. Consequently, the population from which we sample is a population of sample plots.
Approach 1
An intuitive approach is to imagine this population as the set of sample plots that covers the area of interest completely. This is depicted in Figure 1 for the two examples of square and hexagonal fixed area sample plots. The population consists of a discrete number of plots (see Figure 3, left), and the sampling process consists of selecting some of these plots. However, this approach poses a series of interpretation and analysis problems: it works only for fixed area plots and, within that class, only for some plot shapes. For the most frequently used circular plot, for example, this concept cannot be applied, because circles do not tile an area without gaps or overlaps.
Also, once the forest area of interest is defined (the area sampling frame), it is still not clearly defined where the plots come to lie. Figure 2 shows one and the same population overlaid with two different grids of plots. In one case the square plots fit perfectly into the square area sampling frame; in the other case there are various border plots of smaller size. The per-plot mean and variance differ between the two subdivisions. It is certainly an undesirable property of a population concept that its parameters are not uniquely defined. We conclude that this simple concept of populations of sample plots is not well suited as a model for forest inventory sampling.
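The dependence of the per-plot parameters on the grid position can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the square frame of side 100, the 400 randomly placed trees, the cell size of 25 and the function name `per_plot_counts` are all hypothetical choices made for illustration and are not taken from Kleinn and Vilčko (2005).

```python
import math
import random
import statistics


def per_plot_counts(trees, side, cell, offset):
    """Count trees per square grid cell; cells cut by the frame border count as plots too."""
    # Range of cell indices that overlap the frame [0, side) along each axis.
    lo = math.floor(-offset / cell)
    hi = math.ceil((side - offset) / cell) - 1
    counts = {(i, j): 0 for i in range(lo, hi + 1) for j in range(lo, hi + 1)}
    for x, y in trees:
        key = (math.floor((x - offset) / cell), math.floor((y - offset) / cell))
        counts[key] += 1
    return list(counts.values())


random.seed(1)
side = 100.0  # hypothetical square frame
trees = [(random.random() * side, random.random() * side) for _ in range(400)]

# Same trees, same basic plot size, two different grid positions.
aligned = per_plot_counts(trees, side, cell=25.0, offset=0.0)   # plots fit the frame exactly
shifted = per_plot_counts(trees, side, cell=25.0, offset=12.5)  # fragments along the border

for name, counts in (("aligned", aligned), ("shifted", shifted)):
    print(name, len(counts), sum(counts),
          statistics.mean(counts), statistics.pvariance(counts))
```

In this sketch both subdivisions contain the same total number of stems, but the number of plots differs (16 full plots versus 25 plots including fragments), so the per-plot mean and variance differ as well.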
Approach 2
The inclusion zone concept is based on the idea of selecting points from the area sampling frame, around which the sample trees are then selected according to a defined plot design. This is a completely different view of the population of sample plots. It is sometimes called "the infinite population approach" because the population consists of the infinite number of points within the areal sampling frame. Each point possesses a characteristic which is observed. However, that value is not observed at the point itself; it derives from the sample trees that are tallied around the sample point according to the rules defined by the plot design.
The inclusion zone concept and the infinite population approach belong together. Each tree k has its inclusion zone \(E_k\). While the size of this inclusion zone can be taken as a measure of the selection probability of that particular tree k, we may also follow a straightforward geometric interpretation that helps in understanding the estimation, which is, in fact, based on unequal probability sampling (see also sampling with unequal selection probabilities).
If the tree k has an attribute value \(y_k\) (for example basal area in \(\mathrm{m}^2\), or simply number of trees, for which \(y_k=1\) for each tree), we imagine this value distributed evenly over the inclusion zone. Geometrically, this is a flat solid (a disk, for a circular inclusion zone) with the inclusion zone of area \(a_k\) as its base and a height \(d_k\) chosen so that its volume equals \(y_k\):
\[d_k=\frac{y_k}{a_k}\,\]
The value of \(d_k\) is constant over the entire inclusion zone and may be interpreted as a density value. For the variable number of stems,
\[d_k=\frac{1}{a_k}\,\]
because \(y_k=1\).
The value of one sample element in the infinite population of sample points results from the sum of all such density values d which are present at the particular position \(x1_i\), \(x2_i\):
\[d(x1_i,x2_i)=\sum_{k:\,(x1_i,x2_i)\in E_k}d_k=\sum_{k:\,(x1_i,x2_i)\in E_k}\frac{y_k}{a_k}\,\]
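The value at a single point can be sketched in a few lines of code. The example assumes fixed area circular plots of radius `radius`, so that the inclusion zone of every tree is a circle of that radius around the tree and \(a_k\) is the same for all trees; the function name `density_at` and the tree coordinates are hypothetical.

```python
import math


def density_at(point, trees, radius):
    """Sum y_k / a_k over all trees whose circular inclusion zone contains the point."""
    a_k = math.pi * radius ** 2  # inclusion zone area, identical for all trees (assumption)
    px, py = point
    return sum(
        y_k / a_k
        for tx, ty, y_k in trees
        if math.hypot(tx - px, ty - py) <= radius  # point lies inside E_k
    )


# Hypothetical trees as (x1, x2, y_k); y_k = 1 means "number of stems".
trees = [(10.0, 10.0, 1.0), (12.0, 10.0, 1.0), (30.0, 30.0, 1.0)]

inside_two_zones = density_at((11.0, 10.0), trees, radius=5.0)
outside_all_zones = density_at((50.0, 50.0), trees, radius=5.0)
print(inside_two_zones, outside_all_zones)
```

For other plot designs, such as relascope sampling, the inclusion zone area would differ from tree to tree and the membership test would have to be adapted accordingly.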
This finally defines the infinite population of points, which is indicated in Figure 3, right, as a cloud of points. The population total \(\tau\) is then the integral over the entire area of all inclusion zones:
\[\tau=\iint d(x1,x2)\,dx1\,dx2\,\]
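As a quick check of this definition, the integral can be evaluated term by term. The sketch assumes that there are N trees in total and writes \(I_k(x1,x2)\) for an indicator that equals 1 inside \(E_k\) and 0 elsewhere; both symbols are introduced here only for illustration.
\[\tau=\iint\sum_{k=1}^{N}\frac{y_k}{a_k}\,I_k(x1,x2)\,dx1\,dx2=\sum_{k=1}^{N}\frac{y_k}{a_k}\iint I_k(x1,x2)\,dx1\,dx2=\sum_{k=1}^{N}\frac{y_k}{a_k}\,a_k=\sum_{k=1}^{N}y_k\,\]
Under these assumptions, the integral of the density surface reproduces the ordinary total of the tree attribute.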
Note that this implies integrating also outside the areal sampling frame of sample points, wherever inclusion zones of border trees extend beyond the defined inventory area. This leads to the border correction issue, which is dealt with in Fixed area plots at the stand boundary.
From that infinite population, a sample of size n is selected. For a sample point i with the grid coordinates \(x1_i\), \(x2_i\), the estimated total \(\hat\tau_i\) for the forest area A is obtained from the Hansen-Hurwitz estimator as
\[\hat\tau_i=\frac{d(x1_i,x2_i)}{f(x1_i,x2_i)}\,\]
where \(f()\) is the selection probability. Assuming independent random sampling,
\[f(x1,x2)=\frac{1}{A}\,\]
for all sample points. It follows that
\[\hat\tau_i=A\sum_{k:\,(x1_i,x2_i)\in E_k}\frac{y_k}{a_k}\,\]
One may also interpret this estimator such that each observation \(y_k\) is expanded with the plot-specific expansion factor
\[EF_k=\frac{A}{a_k}\,\]
From a sample of n randomly selected sample points, the total is finally estimated as
\[\hat\tau=\frac{1}{n}\sum_{i=1}^{n}\hat\tau_i\,\]
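The estimator can be tried out in a small simulation. The following is a minimal sketch, not a reference implementation. It assumes fixed area circular plots of radius 10 in a hypothetical square frame of side 100, and it places all 200 hypothetical trees at least one radius away from the frame border so that no border correction is needed. The function name `estimate_total` is likewise an illustrative choice.

```python
import math
import random


def estimate_total(trees, side, radius, n, rng):
    """Mean of n single-point Hansen-Hurwitz estimates under uniform random point selection."""
    area = side * side           # A, area of the square frame
    a_k = math.pi * radius ** 2  # inclusion zone area of every tree (fixed area circular plot)
    point_estimates = []
    for _ in range(n):
        px, py = rng.random() * side, rng.random() * side
        # d(x1_i, x2_i): sum of y_k / a_k over trees whose inclusion zone covers the point.
        d = sum(y_k / a_k for tx, ty, y_k in trees
                if math.hypot(tx - px, ty - py) <= radius)
        point_estimates.append(area * d)  # tau_hat_i = d / f with f = 1 / A
    return sum(point_estimates) / n


rng = random.Random(42)
side, radius = 100.0, 10.0
# Assumption: trees keep a distance of one radius from the border (zones fully inside the frame).
trees = [(rng.uniform(radius, side - radius), rng.uniform(radius, side - radius), 1.0)
         for _ in range(200)]

estimate = estimate_total(trees, side, radius, n=5000, rng=rng)
print("true total:", len(trees), "estimated total:", estimate)
```

Each tallied tree enters a single-point estimate with the weight \(A/a_k\), which is the expansion factor given above. If trees stood closer to the border than one radius, part of their inclusion zones would fall outside the frame and the uncorrected estimate would tend to be too low.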
With this infinite population approach, most properties of plots and other issues in forest inventory sampling can be much better described (including the mirage technique for border plot correction) than with the discrete plot population approach.
References
- Kleinn C. and F. Vilčko. 2005. Ein Vergleich von zwei methodischen Konzepten für die Grundgesamtheit von Probeflächen bei Waldinventuren. AFJZ 176(4):68-74.