Basics in R programming
This section introduces the R programming language through basic examples. Basic and frequently used routines are grouped by topic. Some other functions are also introduced in Starting in R, and more advanced and specific functions and routines will be used in the following labs.
Assignments and basic operations in R
An assignment stores an object under a given name. Assignments in R are made with the "arrow" (<-) or the equal (=) symbol; the arrow is generally recommended because the equal symbol is also used to pass arguments inside function calls. The arrow can point in either direction (<- or ->). Some examples are shown below.
a <- 2
b <- 3
4 -> c
a
- [1] 2
b
- [1] 3
c
- [1] 4
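The difference between the two symbols becomes visible inside function calls. The following is a minimal sketch; the object names n1, n2, n3, v1, v2 and upper are illustrative and not part of the lab.

```r
n1 <- 10     # arrow assignment
n2 = 10      # equal-sign assignment; equivalent to the arrow at the top level
20 -> n3     # right-pointing arrow: the value is written first

identical(n1, n2)   # both names hold the same value

# Inside a call, '=' matches a value to a named argument and creates no object
v1 <- seq(from = 1, to = 5)

# Inside a call, '<-' creates the object 'upper' and then passes its value
v2 <- seq(1, upper <- 5)
upper
```

Using the arrow for assignments and the equal symbol only for arguments keeps the two roles visually distinct.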
Operating with objects in R is straightforward. Some examples with single values are shown below:
(a+b)/c*5
- [1] 6.25
and with vectors (v is a numeric vector created previously).
v.1 <- v*exp((a+b)/c*5)
v.1
- [1] 518 1036 1554 2072 2590 3108 3626 4144 4662
Note that in the first case the calculation was computed and displayed in the console, but the result was not assigned to any object. In the second case, the result of the calculation was assigned to a new object named v.1. Keep in mind that objects can be overwritten as many times as we want, so the information in the original object can be lost if the required attention is not paid.
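The difference between displaying, storing and overwriting a result can be sketched as follows. The vector v is not defined in this section, so the sketch assumes it holds the integers 1 to 9, which is consistent with the output displayed above; v.1.rounded is an illustrative name.

```r
a <- 2
b <- 3
c <- 4
v <- 1:9                            # assumption: v is not defined in this section

(a + b) / c * 5                     # displayed only; nothing is stored

v.1 <- v * exp((a + b) / c * 5)     # stored under the name v.1
v.1.rounded <- round(v.1)           # whole numbers, stored under a separate name
v.1.rounded

v.1 <- "overwritten"                # the numeric results in v.1 are now lost
v.1
```

Storing a modified result under a new name, as done with v.1.rounded, is a simple way to avoid losing the original object.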
Assignments can also be made to positions in an existing object. In the following example, the values in the third column of the data frame d are replaced by the character strings low, medium and high:
d$C <- c("low", "medium", "high")
d
- A b C
- 1 1 4 low
- 2 2 5 medium
- 3 3 6 high
An assignment can also create a previously non-existent column in the data frame.
d$E <- m[,3]
d
- A b C E
- 1 1 4 low 7
- 2 2 5 medium 8
- 3 3 6 high 9
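The same idea extends to single cells. The sketch below is self-contained and therefore rebuilds the objects first; it assumes that m is a 3 x 3 matrix filled by column with the integers 1 to 9 and that the columns are named A, B and C, which matches the values printed above but is not stated in this section.

```r
m <- matrix(1:9, nrow = 3)            # assumption: 3 x 3 matrix filled by column
d <- data.frame(A = m[, 1], B = m[, 2], C = m[, 3])

d$C <- c("low", "medium", "high")     # replace a whole column
d[2, "B"] <- 50                       # replace one cell: row 2, column B
d$E <- m[, 3]                         # add a new column
d
```

Row and column positions can be given by number or by name inside the square brackets, in the order row, column.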
Logical tests can also be done in R.
e <- c < a
e
- [1] FALSE
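Logical tests also work element by element on vectors, and their results can be used to select values. A minimal sketch, assuming v holds the integers 1 to 9 (big is an illustrative name):

```r
a <- 2
c <- 4
v <- 1:9            # assumption: v is not defined in this section

c < a               # FALSE
c >= a              # TRUE
a == 2 & c != 4     # FALSE: both conditions must hold

big <- v > 5        # element-wise test, one TRUE/FALSE per element
which(big)          # positions where the test is TRUE
v[big]              # the values at those positions
```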
Operations with data modes and object types in R
Checking and changing data modes is straightforward. The examples below check the data mode of the object a and convert it to a factor.
is.numeric(a)
- [1] TRUE
is.factor(a)
- [1] FALSE
f <- as.factor(a)
mode(f)
- [1] "numeric"
mode(e)
- [1] "logical"
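The mode of a factor is "numeric" because a factor stores integer codes that point to its levels. This matters when converting a factor back to numbers, as the following sketch with illustrative names (f.demo, codes, vals) suggests:

```r
f.demo <- as.factor(c(10, 20, 10))

class(f.demo)                             # "factor"
mode(f.demo)                              # "numeric": the internal codes
levels(f.demo)                            # "10" "20"

codes <- as.numeric(f.demo)               # 1 2 1: the codes, not the values
vals  <- as.numeric(as.character(f.demo)) # 10 20 10: the original values
codes
vals
```

Converting through as.character() first recovers the original values rather than the level codes.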
An example of converting a matrix to a data frame was already shown in Types of Objects in R. Another way to do the same is shown below, together with the class of the new object and the structure of the d object.
d.1 <- as.data.frame(m)
class(d.1)
- [1] "data.frame"
str(d)
- 'data.frame': 3 obs. of 4 variables:
- $ A: num 1 2 3
- $ B: num 4 5 6
- $ C: chr "low" "medium" "high"
- $ E: num 7 8 9
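When a matrix without column names is converted, R assigns default names to the new columns. The sketch below assumes that m is a 3 x 3 matrix of the integers 1 to 9; default.names is an illustrative name.

```r
m <- matrix(1:9, nrow = 3)        # assumption: m is not defined in this section
d.1 <- as.data.frame(m)

default.names <- names(d.1)       # "V1" "V2" "V3"
default.names

names(d.1) <- c("A", "B", "C")    # rename the columns
sapply(d.1, class)                # class of each column
```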
Importing and exporting data in R
R can import data from many different source formats. The most commonly used import functions are the read functions. The following code imports a .csv table named "forest-data.csv" from the folder "my-working-folder". The data is stored in an object named data.
data <- read.csv("C:/Program Files/... /my-working-folder/forest-data.csv")
Note that R uses / or \\ instead of \ to separate folders in a file path.
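The reason is that the backslash is the escape character inside R strings, so a literal backslash has to be written twice. A short sketch with a hypothetical path (not the elided one used above):

```r
p1 <- "C:/my-working-folder/forest-data.csv"       # hypothetical path, forward slashes
p2 <- "C:\\my-working-folder\\forest-data.csv"     # same path, escaped backslashes

nchar("\\")                                        # 1: two typed characters, one stored

# file.path() joins the pieces with forward slashes
p3 <- file.path("C:", "my-working-folder", "forest-data.csv")
identical(p1, p3)

# Backslashes can be converted to forward slashes if needed
gsub("\\", "/", p2, fixed = TRUE)
```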
A more convenient way of working is to first define the working directory from which the data will be imported and where the results will be stored. The current working directory can be checked with getwd(), and the files it contains can be listed with dir(). The working directory can be defined as shown below.
setwd("D:/Data/... /R-course/")
Once the working directory has been defined, the file can be imported without specifying the file path:
data <- read.csv("forest-data.csv")
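The whole sequence can be tried without touching any real folder by using R's temporary directory. In this sketch the file name forest-data-demo.csv and the columns plot and dbh are invented for illustration.

```r
old <- getwd()                      # remember the current working directory
setwd(tempdir())                    # use R's temporary folder as working directory

# Create a small demonstration file (invented values)
write.csv(data.frame(plot = 1:3, dbh = c(21.5, 30.2, 18.7)),
          "forest-data-demo.csv", row.names = FALSE)

has.file <- "forest-data-demo.csv" %in% dir()   # is the file listed?
has.file

demo <- read.csv("forest-data-demo.csv")        # no path needed
demo

setwd(old)                          # restore the original working directory
```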
All objects created in or imported to R are temporarily stored in the R workspace. Objects can be removed from the workspace with the rm() function. Objects in the workspace are usually lost when the R session ends. The easiest way to store an object permanently in the working directory is to use one of the write functions, such as write.csv(). The code to save the object d as a .csv file is shown below.
write.csv(d, "Data-saved.csv")
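By default, write.csv() also writes the row names as an extra first column, which reappears under the name X when the file is read back. The sketch below uses temporary files and an illustrative data frame d.demo to show the effect of the row.names argument.

```r
d.demo <- data.frame(A = 1:3, C = c("low", "medium", "high"))

f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")

write.csv(d.demo, f1)                      # default: row names are written
with.rows <- names(read.csv(f1))           # "X" "A" "C"
with.rows

write.csv(d.demo, f2, row.names = FALSE)   # row names are left out
without.rows <- names(read.csv(f2))        # "A" "C"
without.rows
```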
Functions in R
Function objects were already introduced in this lab, and some basic functions were applied. Examples are the c() function, which concatenates several elements into a vector; the matrix() function, which transforms a vector into a matrix; and the which() function, which reports the positions of the elements for which a given logical test is TRUE.
Functions in R are easy to apply because the syntax is the same in all cases: type the name of the function immediately followed (without a space) by round brackets. The arguments of the function are specified inside the brackets. An imaginary function named ResourcesAssessment() could be applied by using the following code:
ResourcesAssessment(Arg_1, Arg_2, ..., Arg_i, ..., Arg_n)
where Arg_1 to Arg_n are the n arguments that define the function options.
Not all arguments must be specified to apply a function, as some arguments have default settings that are applied when the argument is not specified. Some arguments require a single element, but vectors, matrices, data frames or other types of objects can also be arguments of a function, as shown above. In most cases an argument should be entered by using its name. For instance, in some graphical functions the argument add=TRUE controls whether the current plot is produced in a new chart or added to the previous one.
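Default settings and named arguments can be illustrated with functions that ship with R. The following sketch uses mean() and seq(); the object names x, m.default, m.clean, s1 and s2 are illustrative.

```r
x <- c(2, 4, NA, 8)

m.default <- mean(x)                 # default na.rm = FALSE: the result is NA
m.clean   <- mean(x, na.rm = TRUE)   # named argument overrides the default
m.default
m.clean

# Named arguments can be given in any order
s1 <- seq(from = 2, to = 10, by = 4)
s2 <- seq(by = 4, to = 10, from = 2)
identical(s1, s2)
s1
```

The help page of a function, opened with ?mean for example, lists its arguments and their default values.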
A question that may arise at this point is which functions are available in R, or how many there are in total. This is difficult to answer, as the number of functions is continuously increasing. A set of functions is included with the basic R software (core), and further functions become available by installing new packages. New packages can be downloaded from the R website or installed directly from RStudio in the "Packages" tab of the graphical output window. Although a package needs to be installed only once, packages not included in the core must be loaded in each new R session with the library() function.
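The install-once, load-every-session pattern can be sketched as follows. To keep the example runnable without an internet connection it uses the stats package, which ships with R; for an add-on package, the name stored in pkg would be replaced accordingly.

```r
pkg <- "stats"   # illustration only: stats ships with R, so nothing is downloaded

# Install only if the package is not yet available (needed once per installation)
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package (needed in every new R session)
library(pkg, character.only = TRUE)

search()         # lists the packages currently attached
```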
Even though the number of available functions is huge, we may still be interested in creating a new one. This can be done easily in R, but it is beyond the scope of this introductory lab.