Adaptive cluster sampling
Introduction
Some target objects in forest inventory are rare, such as rare tree or shrub species. They are sparsely distributed over the population of interest, and with the usual sampling designs most plots will be empty while only a few contain the objects of interest. Here, one of the problematic features (in the opinion of the senior author) becomes obvious: in design-based statistical sampling we observe and record only what is on the sample plot. Nothing else. Outside the observation plot we pretend to be blind, and we do the same on the sometimes long way to the sample plot! It is hypothesized that this is a waste of resources and that there must be sound ways of including such additional observations and information in a design-based estimation.
Sampling for rare events is a science in itself, and whole textbooks are devoted to that specific topic. Often, the simplest solution is to increase the sample size in order to increase the probability of encountering the rare objects; however, this also increases travel and labor costs.
If it is known or suspected that the rare species forms clusters or groups within the population of interest, then one may want to establish additional plots around a sampled plot once that plot contains a rare element, because one would then expect more elements around it. The sampling strategy that enlarges the plot once a target element is found on the initial plot is called Adaptive Cluster Sampling (ACS). In general, adaptive sampling strategies are strategies that adapt to specific situations. That means that the final design implemented in the field is not completely predictable but also depends on what is found there. This conditional adaptation of the design makes estimation difficult, because the selection probability is then a conditional probability, and the selection probability of a specific element also depends on the proximity of other elements.
Like cluster sampling, adaptive cluster sampling is not, strictly speaking, a sampling design of its own; it is a variation of the response design in which the plot size adapts to the specific situation found in the field. For this adaptation, however, clear rules need to be defined.
Adaptive cluster sampling was developed and introduced by Thompson (1992); he is also an author of a textbook on the more general approach of adaptive sampling (Thompson and Seber 1996). Our presentation of adaptive cluster sampling here follows his publications. Adaptive cluster sampling is applied relatively frequently in research studies; its application in “production forest inventories”, however, is very rare, if it occurs at all.
General procedure
In Step 1, a random sample of n plots is selected; this is sometimes called the initial sample or the seed sample. In Step 2, for each initial plot, we determine whether the target element is present or not; or, in general terms, whether the specified condition is fulfilled or not. If the condition is fulfilled (for example, there is at least one individual of the target species on that initial plot), then all its neighboring plots are also observed. The neighbors of these new plots (which are now actually sub-plots) are in turn observed if the new plots fulfill the specified condition. This procedure is continued until no more plots that fulfill the condition are found at the periphery of the cluster of sub-plots. In this way, the plot design adapts to the situation encountered in the field. The procedure generates clusters of sub-plots that are irregular in shape and unequal in size. It is the occurrence of the objects of interest that defines the final shape and size of the clusters. The number of clusters, however, is determined by the initial sample (unless neighboring clusters grow together).
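The expansion in Step 2 can be illustrated with a minimal Python sketch. Several things here are assumptions made only for illustration and are not prescribed by the text: the population is a regular grid of equal-sized plots, the neighborhood of a plot consists of its four adjacent plots (up, down, left, right), the condition is "at least one target object on the plot", and the function names and the 5 x 5 example population are hypothetical.

```python
from collections import deque


def neighbors(cell, n_rows, n_cols):
    """Yield the up/down/left/right neighbors of a plot (assumed neighborhood rule)."""
    r, c = cell
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < n_rows and 0 <= cc < n_cols:
            yield (rr, cc)


def adaptive_cluster_sample(counts, initial_plots, condition=lambda y: y >= 1):
    """Return the set of all plots observed after adaptive expansion.

    counts        -- 2D list with the number of target objects per plot
    initial_plots -- (row, col) tuples of the initial (seed) sample
    condition     -- rule deciding whether a plot triggers expansion
    """
    n_rows, n_cols = len(counts), len(counts[0])
    observed = set(initial_plots)
    queue = deque(initial_plots)
    while queue:
        cell = queue.popleft()
        r, c = cell
        # Only plots that fulfill the condition cause their neighbors to be observed.
        if condition(counts[r][c]):
            for nb in neighbors(cell, n_rows, n_cols):
                if nb not in observed:
                    observed.add(nb)
                    queue.append(nb)
    return observed


# Hypothetical population: number of target objects on each plot.
counts = [
    [0, 0, 0, 0, 0],
    [0, 2, 1, 0, 0],
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 4],
]
initial_plots = [(1, 1), (3, 0)]  # in practice drawn at random
observed = adaptive_cluster_sample(counts, initial_plots)
print(sorted(observed))
```

In this example the initial plot (1, 1) fulfills the condition and grows into a cluster, while the empty initial plot (3, 0) stays a single plot. The occupied plot (4, 4) is never observed, because no initial plot leads to it.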
The above picture shows an application of standard adaptive cluster sampling as developed by Thompson. The left graph shows the population of clustered and relatively rare events, that is, the population of interest. The right graph depicts the adaptive clustering process: the red squares are the n randomly selected initial plots. The green plots depict the clusters that are eventually expanded according to a specific rule, while the blue plots are the sets of plots surrounding the initially sampled plot and satisfying the specific rule.
Terminology
Some terms need to be defined in the context of adaptive cluster sampling:
- Cluster: a set of plots around the sampled plot, which is the final result of the selection along the defined adaptation procedure.
- Network: a subset of all plots within a cluster such that if any plot of the network is selected, all other plots of this network will enter into the sample.
- Edge: neighboring plots of a network. Selecting an edge plot does not make an additional plot enter the sample. However, if a network is selected to be in the sample, its edge plots will enter the sample as a ring around the network.
- If there is a plot that does not fulfill the specific condition, it is defined as a network of size 1. This implies that the population is composed of networks, and we can specify the selection probability for each network.
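The terms above can be made concrete with a small Python sketch. It rests on assumptions that are not part of the text: a regular grid of plots, a four-neighbor rule, the condition "at least one target object", hypothetical function names, and a hypothetical 5 x 5 population. The last function additionally assumes that the initial sample is a simple random sample without replacement of n out of N plots; under that assumption, the probability that at least one plot of a network of m plots falls into the initial sample is 1 - C(N-m, n) / C(N, n).

```python
from math import comb


def neighbors(cell, n_rows, n_cols):
    """Yield the up/down/left/right neighbors of a plot (assumed neighborhood rule)."""
    r, c = cell
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < n_rows and 0 <= cc < n_cols:
            yield (rr, cc)


def partition_into_networks(counts, condition=lambda y: y >= 1):
    """Split the whole population into networks.

    Connected plots that fulfill the condition form one network;
    every plot that does not fulfill it is a network of size 1.
    """
    n_rows, n_cols = len(counts), len(counts[0])
    seen, result = set(), []
    for r in range(n_rows):
        for c in range(n_cols):
            if (r, c) in seen:
                continue
            seen.add((r, c))
            network = {(r, c)}
            if condition(counts[r][c]):
                stack = [(r, c)]
                while stack:  # flood fill over neighbors that fulfill the condition
                    cell = stack.pop()
                    for nb in neighbors(cell, n_rows, n_cols):
                        if nb not in seen and condition(counts[nb[0]][nb[1]]):
                            seen.add(nb)
                            network.add(nb)
                            stack.append(nb)
            result.append(network)
    return result


def edge_units(network, counts, condition=lambda y: y >= 1):
    """Return the ring of plots around a network that fulfills the condition."""
    n_rows, n_cols = len(counts), len(counts[0])
    if not all(condition(counts[r][c]) for r, c in network):
        return set()  # a size-1 network of an empty plot triggers no expansion
    return {nb for cell in network
            for nb in neighbors(cell, n_rows, n_cols) if nb not in network}


def intersection_probability(m, N, n):
    """P(at least one of the m plots of a network is in the initial sample),
    assuming simple random sampling without replacement of n out of N plots."""
    return 1 - comb(N - m, n) / comb(N, n)


# Hypothetical population: number of target objects on each plot.
counts = [
    [0, 0, 0, 0, 0],
    [0, 2, 1, 0, 0],
    [0, 0, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 4],
]
nets = partition_into_networks(counts)
largest = max(nets, key=len)
print(len(nets), sorted(largest), sorted(edge_units(largest, counts)))
print(intersection_probability(len(largest), N=25, n=2))
```

Because the networks are disjoint and together cover every plot of the population, a probability of this kind can be written down for each network, which is the point made in the last list item.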
This section is still under construction. This article was last modified on 12/10/2010.