Resource assessment exercises: loading data

Revision as of 10:37, 10 May 2014

This section is still under construction! This article was last modified on 05/10/2014. If you have comments please use the Discussion page or contribute to the article!

This article is part of the Resource assessment exercises. See the category page for a (chronological) table of contents.

In this section we provide a brief recap of the basics of sample survey statistics. Our focus is on surveys conducted in forests. Forests are, of course, made up of trees. However, when we conduct a sample survey in a forest, we usually do not sample individual trees but areas. For the time being we will nevertheless assume that each tree represents one sampling unit, which simplifies our derivations. In later sections we will relax this assumption and consider situations more common in natural resource assessments (see Section [sec:rd]).

Loading the data

We start by loading the trees dataset into the workspace. The data are available as a comma-separated values (CSV) file, which can be read into R using the read.csv() or read.table() function. Note that on Microsoft Windows the path to the file trees.csv looks different.
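
The reading step can be sketched as follows. This is only an illustration under stated assumptions: the real location of trees.csv is not given here, so the sketch writes a small stand-in file to a temporary directory (three rows copied from the data overview shown further below) and reads that instead. The names csv_path and trees_alt are hypothetical. Building the path with file.path() avoids hard-coding a separator; on Windows, a path typed as a string needs forward slashes or doubled backslashes.

```r
# Assumption: a small stand-in for trees.csv, written to a temporary
# directory so that the example runs without the original file.
csv_path <- file.path(tempdir(), "trees.csv")
writeLines(c("dbh,stratum,species,height,ab",
             "52,2,2,20.38,1.1505",
             "10,1,2,7.64,0.015",
             "14,1,2,11.29,0.0442"),
           csv_path)

# read.csv() expects a header row and comma-separated fields by default.
trees <- read.csv(csv_path)

# read.table() reads the same file once header and separator are stated.
trees_alt <- read.table(csv_path, header = TRUE, sep = ",")

# Both calls should yield the same data.frame.
all.equal(trees, trees_alt)
```

To read the actual dataset, replace csv_path with the path to trees.csv on your own machine.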

The function read.table() can read many other delimited text formats. Add-on packages provide further facilities: the package foreign reads files written by other statistical software (e.g., SPSS or Stata) directly into R, and Microsoft Excel *.xls(x) files can also be read directly with additional packages. However, we recommend exporting data to file types that other software packages can use. CSV is usually a good choice, and Excel can export sheets to this format.

The function str() (structure) provides a compact overview of the data.

## 'data.frame':    30000 obs. of  5 variables:
##  $ dbh    : int  52 10 14 12 15 39 12 12 10 35 ...
##  $ stratum: int  2 1 1 2 1 2 1 1 1 2 ...
##  $ species: int  2 2 2 1 2 2 2 2 2 2 ...
##  $ height : num  20.38 7.64 11.29 8.83 10.6 ...
##  $ ab     : num  1.1505 0.015 0.0442 0.0252 0.0477 ...
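
The overview above is the kind of output a call such as str(trees) produces. A minimal sketch, assuming a stand-in data.frame built only from the first three values of each variable visible in that output (the name trees_head is hypothetical and the full dataset is not reproduced):

```r
# Stand-in for the full dataset: the first three values of each variable,
# copied from the str() output above. The L suffix marks integer values.
trees_head <- data.frame(
  dbh     = c(52L, 10L, 14L),
  stratum = c(2L, 1L, 1L),
  species = c(2L, 2L, 2L),
  height  = c(20.38, 7.64, 11.29),
  ab      = c(1.1505, 0.015, 0.0442)
)

str(trees_head)  # one line per variable: name, type, first values
dim(trees_head)  # number of rows, then number of columns
```

In the str() output, int and num denote integer and double-precision numeric columns, respectively.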

The data.frame trees consists of 30,000 observations (rows) and 5 variables (columns). We will first look at only ten trees. Here are their diameters at breast height (DBH).

\(12,19,14,23,29,16,44,48,27,33\)

# Ten example trees: an index column and the DBH of each tree
trees10 <- data.frame(idx = 1:10)
trees10$dbh <- c(12, 19, 14, 23, 29, 16, 44, 48, 27, 33)

For the time being, these ten trees will serve as our example population. We will return to the simulated forest later on.
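
Because the ten trees are treated as a complete population, quantities such as the population size and the mean DBH can be computed directly from the listed values. A minimal sketch, with the hypothetical names dbh_values, N and mu chosen for illustration only:

```r
# The ten DBH values listed above, held in a plain numeric vector.
dbh_values <- c(12, 19, 14, 23, 29, 16, 44, 48, 27, 33)

N  <- length(dbh_values)  # population size: 10
mu <- mean(dbh_values)    # population mean DBH: 265 / 10 = 26.5

N
mu
```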
