Basics in R programming

From AWF-Wiki

Latest revision as of 09:54, 23 April 2015

Basics in R programming

The R programming language is introduced below with some basic examples. Some basic and very frequent routines in R are grouped by topic. Some other functions are also introduced in Starting in R. Other advanced and more specific functions and routines will be used in the following labs.

Assignments and basic operations in R

An assignment stores an object under a given name. Assignments in R are made with the "arrow" (<-) or the equals sign (=). The arrow is recommended, because = is also used to pass named arguments to functions. Assignments can be written in either direction (<- or ->). Some examples are shown below.

a <- 2
b <- 3
4 -> c
a
 
## [1] 2
 
b
 
## [1] 3
 
c
 
## [1] 4
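The reason for preferring the arrow can be illustrated with a small sketch. It assumes a fresh session in which no object named x exists yet; the name x and the values are chosen only for this illustration.

```r
# Inside a function call, = names an argument; it does not create an object.
mean(x = c(1, 2, 3))
x.after.equal <- exists("x", inherits = FALSE)   # FALSE in a fresh session

# The arrow inside a call creates x in the workspace and then passes it on.
mean(x <- c(1, 2, 3))
x.after.arrow <- exists("x", inherits = FALSE)   # TRUE

x.after.equal
x.after.arrow
```

Both calls return the same mean, but only the second one leaves an object named x behind.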

Operating on objects in R is straightforward. Below is an example with single values:

(a+b)/c*5
 
## [1] 6.25

and one with a vector (the vector v must already exist in the workspace):

v.1 <- v*exp((a+b)/c*5)
v.1
 
## [1] 518 1036 1554 2072 2590 3108 3626 4144 4662

Note that in the first case the result was computed and displayed in the console, but it was not assigned to any object. In the second case, the result was assigned to a new object named v.1. Keep in mind that an object can be overwritten as many times as we want, so the content of the original object is lost if care is not taken.
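The following sketch repeats both cases in a self-contained form. The definition v <- 1:9 is an assumption (v is not defined in this section), and v.backup is a hypothetical name used to show how a copy protects against overwriting.

```r
a <- 2; b <- 3; c <- 4
v <- 1:9                            # assumed definition of v

v * 2                               # computed and printed, but not stored
v.1 <- v * exp((a + b) / c * 5)     # stored under the name v.1
round(v.1)                          # rounded values for display

v.backup <- v.1                     # keep a copy before overwriting
v.1 <- v.1 / 1000                   # v.1 now holds only the new values
```

With this definition of v, the rounded values agree with the output printed above. After the last line the original content of v.1 is available only through v.backup.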

Assignments can also be made to positions in an existing object. In the following example, the values in the third column of the data frame d are replaced by the character strings low, medium and high:

d$C <- c("low", "medium", "high")
d
 
##   A b      C
## 1 1 4    low
## 2 2 5 medium
## 3 3 6   high

An assignment can also create a column that did not previously exist in the data frame. Here the third column of the matrix m becomes the new column E:

d$E <- m[,3]
d
 
##   A b      C E
## 1 1 4    low 7
## 2 2 5 medium 8
## 3 3 6   high 9
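Assignments to positions are not limited to whole columns. The sketch below is self-contained; the starting objects m and d are assumptions, built with hypothetical values that resemble the example above.

```r
# Assumed starting objects (not defined in this section)
m <- matrix(1:9, nrow = 3)
d <- data.frame(A = 1:3, b = 4:6, C = 7:9)

d$C <- c("low", "medium", "high")   # replace a whole column
d[2, "b"] <- 50                     # replace a single cell (row 2, column b)
d$E <- m[, 3]                       # add a new column
d$E[d$C == "low"] <- NA             # replace values selected by a condition
d
```

Each of these lines overwrites part of d in place, so the earlier values are no longer available afterwards.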

Logical tests can also be done in R.

e <- c < a
e
 
## [1] FALSE
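Logical tests also work element by element on vectors. A minimal sketch, assuming an example vector v <- 1:9 (the names v and test are chosen only for illustration):

```r
v <- 1:9                 # assumed example vector
test <- v > 5            # element-wise comparison, returns a logical vector
which(test)              # positions where the test is TRUE
v[test]                  # the values at those positions

a <- 2; c <- 4
(c < a) | (c == 4)       # logical OR of two tests
(c < a) & (c == 4)       # logical AND of two tests
```

The logical vector test can be used directly as an index, which is a very common way of selecting data in R.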

Operations with data modes and object types in R

Checking and changing data modes is straightforward. The examples below check the data mode of the object a and convert it to a factor.

is.numeric(a)
 
## [1] TRUE
 
is.factor(a)
 
## [1] FALSE
 
f <- as.factor(a)
mode(f)
 
## [1] "numeric"
 
mode(e)
 
## [1] "logical"
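It may look surprising that mode(f) returns "numeric" for a factor. A short sketch of the difference between mode() and class(), using the same object a (the name f.codes is hypothetical):

```r
a <- 2
f <- as.factor(a)

class(f)                                # "factor": the object type
mode(f)                                 # "numeric": factors store integer level codes
f.codes <- as.numeric(f)                # 1, the code of the first level, not 2
f.value <- as.numeric(as.character(f))  # 2, the original value
f.codes
f.value
```

To recover the original numbers from a factor, convert it to character first and then to numeric, as in the last conversion.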

An example of converting a matrix to a data frame was already shown in Types of Objects in R. Another way of doing the same is shown below, followed by the structure of the d object.

d.1 <- as.data.frame(m)
class(d.1)
 
## [1] "data.frame"
 
str(d)
 
## 'data.frame': 3 obs. of 4 variables:
##  $ A: num  1 2 3
##  $ B: num  4 5 6
##  $ C: chr  "low" "medium" "high"
##  $ E: num  7 8 9
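The conversion can be tried out with the following self-contained sketch. The matrix m is an assumption (it is not defined in this section), and the new column names are hypothetical.

```r
m <- matrix(1:9, nrow = 3)          # assumed example matrix
d.1 <- as.data.frame(m)

names(d.1)                          # default column names: V1, V2, V3
names(d.1) <- c("A", "B", "E")      # hypothetical new names
class(d.1)
dim(d.1)                            # number of rows and columns
str(d.1)                            # compact overview of the structure
```

The functions class(), dim() and str() are a quick way to check that an object has the type and shape that the following steps expect.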

Importing and exporting data in R

R can import data from many different source formats. The most commonly used import functions are the read functions. The following code imports a .csv table named "forest-data.csv" from the folder "my-working-folder". The data is stored in an object named data.

data <- read.csv("C:/Program Files/... /my-working-folder/forest-data.csv")

Note that file paths in R are written with / or \\ rather than a single \, because \ is the escape character inside R strings.
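A small sketch of how paths behave as R strings. The folder and file names are taken from the example above; the object names p1, p2 and p3 are hypothetical.

```r
p1 <- "my-working-folder/forest-data.csv"
p2 <- "my-working-folder\\forest-data.csv"

nchar(p1) == nchar(p2)      # TRUE: "\\" is stored as one backslash character

# file.path() builds a path from its parts, using / as separator
p3 <- file.path("my-working-folder", "forest-data.csv")
identical(p1, p3)

basename(p1)                # the file name without the folder
```

Building paths with file.path() avoids typing separators by hand.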

A more convenient way of working is to first define the working directory from which the data will be imported and in which the results will be stored. The current working directory can be checked with getwd(), and the files it contains can be listed with dir(). The working directory is set as shown below.

setwd("D:/Data/... /R-course/")

Once the working directory is defined, the file can be imported without specifying the file path:

data <- read.csv("forest-data.csv")

All objects created in or imported to R are temporarily stored in the R workspace. Objects in the workspace can be removed with the rm() function. All objects in the workspace are usually lost when the R session ends. The easiest way to store an object permanently in the working directory is to use the write functions, such as write.csv(). The code to save the object d as a .csv file is shown below.

write.csv(d, "Data-saved.csv")
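The whole cycle of exporting, removing and re-importing an object can be sketched as follows. This is only an illustration: the data frame d is built here with assumed values, the session's temporary folder stands in for a real working directory, and the names old.wd and d.back are hypothetical.

```r
d <- data.frame(A = 1:3, C = c("low", "medium", "high"))  # assumed example data

old.wd <- setwd(tempdir())          # setwd() returns the previous directory
getwd()

write.csv(d, "Data-saved.csv", row.names = FALSE)  # do not write row names
"Data-saved.csv" %in% dir()         # the file is now in the working directory

rm(d)                               # remove d from the workspace
d.back <- read.csv("Data-saved.csv")
d.back

setwd(old.wd)                       # restore the original working directory
```

The argument row.names = FALSE keeps write.csv() from adding an extra column with the row numbers, which would otherwise be read back as a data column.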

Functions in R

Function objects were already introduced in this article and in Starting in R, and some basic functions were applied. Examples are the c() function, which concatenates several elements into a vector, the matrix() function, which transforms a vector into a matrix, and the which() function, which reports the positions of the elements for which a given logical test is TRUE.

Functions in R are easy to apply because the syntax is the same in all cases: type the name of the function, immediately followed (without a space) by parentheses. The arguments of the function are specified inside the parentheses. An imaginary function named ResourcesAssessment() could be applied with the following code:

ResourcesAssessment(Arg_1, Arg_2, ..., Arg_i, ..., Arg_n)

where Arg_i are the n arguments which define the function options.

Not all arguments must be specified to apply a function, as some arguments have default settings that are used when the argument is not specified. Some arguments require a single element, but vectors, matrices, data frames or other types of objects can also be arguments of a function, as shown above. Arguments can be matched by position, but in most cases it is clearer to enter them by name. For instance, in some graphical functions the argument add=TRUE controls whether the current plot is produced in a new chart or added to the previous one.
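Default values and named arguments can be illustrated with a few base R functions. The object names used to store the results are hypothetical.

```r
r.default <- round(3.14159)               # digits defaults to 0
r.two     <- round(3.14159, digits = 2)   # default overridden by name

x <- c(1, NA, 3)
m.default <- mean(x)                      # NA, because na.rm defaults to FALSE
m.clean   <- mean(x, na.rm = TRUE)        # missing value ignored

s.1 <- seq(from = 1, to = 9, by = 2)
s.2 <- seq(by = 2, to = 9, from = 1)      # named arguments can come in any order

r.default; r.two; m.default; m.clean
identical(s.1, s.2)
```

The help page of each function (for example ?round) lists its arguments together with their default values.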

A question that may arise at this point is which functions are available in R, or even how many there are. This is difficult to answer, as the number of functions is continuously increasing. A set of functions is included with the basic R software (core), and more functions can be added by installing new packages. New packages can be downloaded from the R website, or installed directly from RStudio in the "Packages" tab of the Graphical output window. A package only needs to be installed once, but packages not included in the core must be activated with the library() function in every new R session in which they are used.
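The difference between installing and activating a package can be sketched as follows. The package name in the commented line is a placeholder, not a real package; the tools package is used for the runnable part because it is shipped with R but is not activated by default.

```r
# Installing is done once per computer and needs an internet connection.
# "somePackage" is a hypothetical placeholder name:
# install.packages("somePackage")

# Activating is done in every session in which the package is used.
library(tools)

"package:tools" %in% search()     # TRUE once the package is attached
file_ext("forest-data.csv")       # a function provided by the tools package
```

The function search() lists the packages that are currently activated in the session.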

Even though the number of available functions is huge, we may still want to create a new one. This can be done easily in R, but it is far beyond the scope of this introductory lab.
