Resource assessment exercises: loading data
Revision as of 10:33, 10 May 2014
- This article is part of the Resource assessment exercises. See the category page for a (chronological) table of contents.
In this section we provide a brief recap of the basics of sample survey statistics. Our focus is on surveys conducted in forests. Forests are made up of trees; however, when we conduct a sample survey in a forest, we usually do not sample individual trees but areas. For the time being we will nevertheless assume that each tree represents one sampling unit, which simplifies our derivations. In later sections we will relax this assumption and consider situations more common in natural resource assessments (see the category on response designs).
Loading the data
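We start by loading the trees dataset into the R workspace. The data are available as a comma-separated values (CSV) file, which can be read into R using the read.csv() or read.table() function. Note that the path to the file trees.csv depends on your system: on Microsoft Windows it typically starts with a drive letter, and in R it has to be written with forward slashes or doubled backslashes.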
trees <- read.csv(file = "./data/trees.csv")
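The relationship between the two reading functions can be sketched as follows. Because trees.csv itself is not reproduced here, the example writes a small stand-in file with made-up values to a temporary directory; the file name trees_demo.csv and its contents are assumptions for illustration only. The path is built with file.path(), which also works on Microsoft Windows.

```r
# Hypothetical stand-in for trees.csv: three made-up rows in a temporary file.
tmp <- file.path(tempdir(), "trees_demo.csv")
writeLines(c("dbh,stratum,species",
             "52,2,2",
             "10,1,2",
             "14,1,2"), tmp)

# read.csv() is read.table() with defaults suited to CSV files
# (header line present, comma as field separator).
a <- read.csv(file = tmp)
b <- read.table(file = tmp, header = TRUE, sep = ",")

# Both calls should give the same data.frame for this simple file.
all.equal(a, b)
```

For files that use a semicolon as separator and a decimal comma, read.csv2() or the sep and dec arguments of read.table() can be used instead.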
R can read many more file types. The package foreign provides further facilities, e.g., for reading SPSS, Stata or dBase files; Microsoft Excel *.xls(x) files can be read directly with add-on packages such as readxl. However, we recommend exporting data to file types that can be used by other software packages. CSV is usually a good choice, and Excel can export sheets to this format.
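The recommendation to exchange data as CSV also applies in the other direction. As a minimal sketch, the following writes a small made-up data.frame (not the trees dataset) to a CSV file and reads it back; the object and file names are chosen for illustration only.

```r
# Made-up example data (not the trees dataset).
demo <- data.frame(dbh = c(12L, 19L, 14L),
                   height = c(8.5, 11.2, 9.7))

# row.names = FALSE avoids an extra column of row numbers in the file.
out <- file.path(tempdir(), "demo_export.csv")
write.csv(demo, file = out, row.names = FALSE)

# Reading the file back should recover the same table.
back <- read.csv(out)
all.equal(demo, back)
```

The resulting file is plain text and can be opened in a spreadsheet program or any text editor.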
The function str() (structure) provides a compact overview of the data.
str(trees)
## 'data.frame': 30000 obs. of 5 variables:
##  $ dbh    : int 52 10 14 12 15 39 12 12 10 35 ...
##  $ stratum: int 2 1 1 2 1 2 1 1 1 2 ...
##  $ species: int 2 2 2 1 2 2 2 2 2 2 ...
##  $ height : num 20.38 7.64 11.29 8.83 10.6 ...
##  $ ab     : num 1.1505 0.015 0.0442 0.0252 0.0477 ...
- What the function str() does
- The function str(data.object) simply prints the structure of a data object, e.g., a data.frame. It provides the number of observations (rows) and variables (columns), as well as the type of each variable, e.g., int for integer, num for numeric, Factor for factors, etc. See help(str).
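To see how str() reports different variable types, here is a small sketch on a made-up data.frame. The column values and species names are invented for illustration and are not taken from the trees dataset.

```r
# Made-up data.frame with an integer, a numeric and a factor column.
demo <- data.frame(dbh     = c(12L, 19L, 14L),
                   height  = c(8.5, 11.2, 9.7),
                   species = factor(c("beech", "spruce", "beech")))

str(demo)            # one line per variable: name, type, first values
nrow(demo)           # number of observations (rows)
ncol(demo)           # number of variables (columns)
sapply(demo, class)  # class of each column
```

The functions nrow(), ncol() and class() return individual pieces of the information that str() prints in one overview.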
The data.frame trees consists of 30,000 observations (rows) and 5 variables (columns). To begin with, we will look at only ten trees. Here is a list of their diameters at breast height (DBH).
\(12,19,14,23,29,16,44,48,27,33\)
trees10 <- data.frame(idx = 1:10)
trees10$dbh <- c(12, 19, 14, 23, 29, 16, 44, 48, 27, 33)
- What the function data.frame() does
- The function data.frame() simply creates a data.frame. The code data.frame(1:10), for example, creates a data.frame with 10 rows and one column. Typing 1:10 alone would result in a vector.
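The difference between a vector and a data.frame can be checked directly in R. The following is a minimal sketch; the object names v and df are chosen for illustration only.

```r
v  <- 1:10                  # an integer vector
df <- data.frame(idx = v)   # a data.frame with one column named idx

is.vector(v)        # is v a vector?
is.data.frame(df)   # is df a data.frame?
dim(df)             # number of rows and columns of df

# A column extracted with $ is again a plain vector.
identical(df$idx, v)
```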
For the time being, these ten trees will serve as our example population. We will return to the simulated forest later on.
Related articles
- Previous article: Introduction to resource assessment exercises
- Next article: mean, variance and standard deviation