Resource assessment exercises: loading data

From AWF-Wiki
This article is part of the Resource assessment exercises. See the category page for a (chronological) table of contents.

In this section we provide a brief recap of the basics of sample survey statistics. Our focus is on surveys conducted in forests. Forests are made up of trees. However, when we conduct a sample survey in a forest, we usually do not sample individual trees, but areas. For the time being we will nevertheless assume that each tree represents one sampling unit, which simplifies our derivations. In later sections we will relax this assumption and consider situations more common in natural resource assessments (see the category on response designs).

Loading the data

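We start by loading the trees dataset into the R workspace. The data are available as a comma-separated values (CSV) file, which can be read into R using the read.csv() or read.table() function. The path below is relative to the current working directory. Note that on Microsoft Windows the path to the file trees.csv may look different (for example, it may start with a drive letter); R accepts forward slashes in paths on Windows as well, whereas backslashes have to be doubled.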

trees <- read.csv(file = "./data/trees.csv")
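The relation between read.csv() and read.table() mentioned above can be illustrated with a minimal sketch. It does not use trees.csv; instead it writes a small, made-up CSV file to a temporary location, so the column names and values are assumptions for illustration only.

```r
# Write a tiny CSV file to a temporary location (hypothetical example data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("dbh,height", "12,8.5", "19,11.2", "14,9.1"), tmp)

# read.csv() assumes a header line and a comma as separator by default
a <- read.csv(file = tmp)

# read.table() needs both to be stated explicitly
b <- read.table(file = tmp, header = TRUE, sep = ",")

# Compare the two results; for a simple file like this they should agree
all.equal(a, b)

# file.path() assembles a path from its parts using forward slashes
p <- file.path(".", "data", "trees.csv")
```

For plain CSV files read.csv() is usually the more convenient choice, since its defaults already match the format.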

R can read many more file types. The package foreign provides further facilities for data stored by other statistical software (e.g., SPSS, Stata, or dBase files), and add-on packages such as readxl can read Microsoft Excel *.xls(x) files into R directly. However, we recommend exporting data in file types that can be used by other software packages. CSV is usually a good choice, and Excel can also export sheets into this format.

The function str() (structure) provides a compact overview of the data.

str(trees) 

## 'data.frame':    30000 obs. of  5 variables:
##  $ dbh    : int  52 10 14 12 15 39 12 12 10 35 ...
##  $ stratum: int  2 1 1 2 1 2 1 1 1 2 ...
##  $ species: int  2 2 2 1 2 2 2 2 2 2 ...
##  $ height : num  20.38 7.64 11.29 8.83 10.6 ...
##  $ ab     : num  1.1505 0.015 0.0442 0.0252 0.0477 ...


What the function str() does
The function str(data.object) simply prints the structure of a data object, e.g., a data.frame. It provides the number of observations (rows) and variables (columns), as well as the mode of each variable, e.g., int for integer, factor for factor levels, etc. See help(str).
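As a small sketch of how str() reports different variable types, the following builds a made-up data.frame (the object demo and its values are hypothetical and not part of the trees dataset) and inspects it.

```r
# Hypothetical mini data set with three different variable types
demo <- data.frame(
  dbh     = c(12L, 19L, 14L),                      # integer
  height  = c(8.5, 11.2, 9.1),                     # numeric (double)
  species = factor(c("beech", "spruce", "beech"))  # factor
)

str(demo)            # compact overview, one line per variable
dim(demo)            # number of rows and number of columns
sapply(demo, class)  # class of each column
```

The functions dim() and sapply(..., class) return the same information as objects, which can be handy when it is needed in further computations rather than only printed.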

The data.frame trees consists of 30,000 observations (rows) and 5 variables (columns). To begin with, we will look at only ten trees. Here is a list of their diameters at breast height (DBH).

\(12,19,14,23,29,16,44,48,27,33\)

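# Example population of ten trees, indexed from 1 to 10
trees10 <- data.frame(idx = 1:10)
# Diameter at breast height (DBH) of each tree
trees10$dbh <- c(12, 19, 14, 23, 29, 16, 44, 48, 27, 33)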


What the function data.frame() does
The function data.frame() simply creates a data.frame. The code data.frame(1:10), for example, creates a data.frame with 10 rows and one column. Typing 1:10 would result in a vector.
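The difference between a vector and a data.frame can be checked directly. The object names v, d, and d2 in this minimal sketch are chosen for illustration only.

```r
v <- 1:10              # an integer vector
d <- data.frame(1:10)  # a data.frame with that vector as its only column

is.vector(v)      # check that v is a vector
is.data.frame(d)  # check that d is a data.frame
dim(d)            # number of rows and number of columns
names(d)          # R derives a column name automatically

# Naming the column explicitly gives a more readable result
d2 <- data.frame(idx = 1:10)
names(d2)
```

Supplying a name, as in data.frame(idx = 1:10), avoids the automatically generated column name and makes later code easier to read.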

For the time being, these ten trees will serve as our example population. We will return to the simulated forest later on.
