| which contains the nine integers from 1 to 9. This vector can be created with the ''c()'' function, whose arguments are the numbers to be concatenated into a vector. The "arrow" ''<-'' assigns |
| the vector to the object named ''v''. |
− |
| |
− | <span style="color:#CC2EFA">v <- c(1,2,3,4,5,6,7,8,9) </span>
| |
− | v
| |
− | ## [1] 1 2 3 4 5 6 7 8 9
| |
− |
| |
− | The third element of the vector can be accessed by indicating its position inside square brackets after the name of the object as follows:
| |
− |
| |
− | <span style="color:#CC2EFA">v[3] </span>
| |
− | ## [1] 3
| |
− |
| |
− | An interval of consecutive elements, for instance the elements from the
| |
− | second to the fourth, can be accessed by:
| |
− | <<>>=
| |
− | v[2:4]
| |
− | @
| |
− | Non-consecutive elements, such as the second and the seventh, can be
| |
− | accessed by:
| |
− | <<>>=
| |
− | v[c(2,7)]
| |
− | @
| |
− | All elements with a value greater than or equal to 6 can be accessed by
| |
− | using the {\tt which()} function inside the square brackets as follows:
| |
− | <<>>=
| |
− | v[which(v>=6)]
| |
− | @
| |
− |
| |
− |
| |
− | \item \textbf{\textit{Matrix.}} It is an object with two dimensions
| |
− | (rows and columns). All elements of a matrix must have the same data mode
| |
− | (numeric, character, logical, etc.). To demonstrate how to access
| |
− | the data, let's create a matrix named {\tt m} from the vector {\tt
| |
− | v} with the {\tt matrix()} function, where, in addition to the
| |
− | vector, the number of rows and columns ({\tt nrow= , ncol=}) of the
| |
− | final matrix must be specified as arguments.
| |
− | <<>>=
| |
− | m <- matrix(v, nrow=3, ncol=3)
| |
− | m
| |
− | @
| |
− | The element in the second row and third column of the matrix can be
| |
− | accessed by:
| |
− | <<>>=
| |
− | m[2,3]
| |
− | @
| |
− | All elements in the second column can be accessed by:
| |
− | <<>>=
| |
− | m[,2]
| |
− | @
| |
− | and those in the third row by:
| |
− | <<>>=
| |
− | m[3,]
| |
− | @
| |
− | As with vectors, both consecutive and
| |
− | non-consecutive elements can be accessed. For instance, the following
| |
− | code accesses the elements in the second and third rows and in the
| |
− | first and third columns:
| |
− | <<>>=
| |
− | m[2:3,c(1,3)]
| |
− | @
| |
− | The resulting object can be a vector or a matrix,
| |
− | depending on whether the data is accessed by row, column, or both.
| |
− |
| |
− | \item \textbf{\textit{Array.}} An array is similar to a matrix,
| |
− | but can have more than two dimensions. The data are accessed in
| |
− | the same way as for matrices, but with as many indices inside the square
| |
− | brackets as the array has dimensions.
| |
− |
| |
− | \item \textbf{\textit{Dataframe.}} Dataframes have two dimensions,
| |
− | like matrices, but allow more flexibility because the elements in different
| |
− | columns can have different data modes. Dataframes are the equivalent in
| |
− | \textsf{R} of SAS or SPSS datasets. Elements in dataframes can be
| |
− | accessed by using the corresponding position (row and column) in the
| |
− | same way as for matrices, or by using the name of the column. To show the
| |
− | differences, let's transform the matrix {\tt m} into a dataframe {\tt d} as
| |
− | follows:
| |
− | <<>>=
| |
− | d <- data.frame(m)
| |
− | d
| |
− | @
| |
− |
| |
− | Now {\tt d} is a dataframe with the same elements as the matrix {\tt
| |
− | m}, but it can be seen that the names of the columns have changed. The
| |
− | names of the columns are now:
| |
− | <<>>=
| |
− | colnames(d)
| |
− | @
| |
− | The second and third elements of the first column can be accessed in
| |
− | either of the following ways:
| |
− | <<>>=
| |
− | d[2:3,1]
| |
− | d$X1[2:3]
| |
− | @
| |
− | The names of the columns can be modified as follows:
| |
− | <<>>=
| |
− | colnames(d) <- c("A", "b", "C")
| |
− | d
| |
− | @
| |
− | \textsf{R} distinguishes between uppercase and lowercase letters, which is why
| |
− | the second column in the {\tt d} dataframe must be accessed by:
| |
− | <<>>=
| |
− | d$b
| |
− | @
| |
− | instead of:
| |
− | <<>>=
| |
− | d$B
| |
− | @
| |
− |
| |
− | \item \textbf{\textit{Lists.}} Lists in \textsf{R} are ordered
| |
− | collections of objects. They allow a variety of different types of
| |
− | objects to be combined in a single object. As an example, the list
| |
− | object named {\tt l} created below holds the vector {\tt v} in the
| |
− | first position, the matrix {\tt m} in the second position and the
| |
− | dataframe {\tt d} in the third position.
| |
− | <<>>=
| |
− | l <- list(first=v, second=m, third=d)
| |
− | l
| |
− | @
| |
− | The information stored in the second position of the list can be
| |
− | accessed in the following ways:
| |
− | <<>>=
| |
− | l$second
| |
− | l[[2]]
| |
− | @
| |
− | The rules described above can also be used to access an element inside
| |
− | an object contained in a given position in a list.
| |
− |
| |
− | \item \textbf{\textit{Functions.}} Functions are the objects in
| |
− | \textsf{R} in which algorithms are stored and executed. The basic
| |
− | components of a function are its body (the algorithm itself)
| |
− | and its arguments. \textsf{R} has many functions
| |
− | implemented both in the core program and in the extensions, but custom
| |
− | functions can also be created. An example of a function object is the
| |
− | {\tt mean()} function, which calculates the arithmetic mean of a
| |
− | collection of values, and where the basic argument is the vector of
| |
− | values. An example of the application of the {\tt mean()} function is
| |
− | shown below, where the mean of all values of the vector {\tt v} is
| |
− | calculated.
| |
− | <<>>=
| |
− | mean(v)
| |
− | @
| |
− |
| |
− |
| |
− | \end{itemize}
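
The access rules listed above can be combined. The following is a minimal sketch, assuming the objects {\tt v}, {\tt m}, {\tt d} and {\tt l} are built as in the text; the array {\tt a} and the other new names are hypothetical and introduced only for illustration.

```r
# Rebuild the objects used in the text so the sketch runs on its own.
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
m <- matrix(v, nrow = 3, ncol = 3)
d <- data.frame(m)
colnames(d) <- c("A", "b", "C")
l <- list(first = v, second = m, third = d)

# Logical indexing: gives the same result as v[which(v >= 6)]
# when v contains no missing values.
big <- v[v >= 6]

# Selecting one column of a matrix returns a plain vector by default;
# drop = FALSE keeps the result as a one-column matrix.
col_vec <- m[, 2]
col_mat <- m[, 2, drop = FALSE]

# A hypothetical 2 x 3 x 4 array: one index per dimension.
a <- array(1:24, dim = c(2, 3, 4))
a_elem <- a[2, 3, 4]

# Column names are case sensitive: there is no column "B", so this is NULL.
missing_col <- d$B

# Chaining the rules to reach inside objects stored in a list.
m_elem <- l$second[2, 3]   # element in row 2, column 3 of the matrix
d_col <- l[["third"]]$b    # column "b" of the dataframe
```

Here {\tt drop = FALSE} is the usual way to keep a matrix result when a single row or column is selected.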
| |
− |
| |
− | There are many other types of objects; those listed above are only a short
| |
− | list of the most frequent. As the way to access the information
| |
− | depends on the type of object, it is necessary to always keep in mind
| |
− | the type of object that we are working with. The type of object can be
| |
− | checked with the {\tt class()} function. For other objects not listed
| |
− | in this document, the {\tt str()} function provides information about
| |
− | how to access the elements inside the object.
| |
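
As an illustration of {\tt class()} and {\tt str()}, the sketch below inspects the objects created earlier. It assumes they are built as in the text; the names ending in {\tt \_class} are hypothetical helpers.

```r
# Rebuild the objects used in the text so the sketch runs on its own.
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
m <- matrix(v, nrow = 3, ncol = 3)
d <- data.frame(m)
l <- list(first = v, second = m, third = d)

# class() reports the type of each object.
v_class <- class(v)   # "numeric"
d_class <- class(d)   # "data.frame"
l_class <- class(l)   # "list"
f_class <- class(mean)  # "function"

# For a matrix the exact value of class() depends on the R version,
# so is.matrix() is a more portable check.
m_is_matrix <- is.matrix(m)

# str() prints a compact summary of the structure, including the
# names that can be used with $ to reach each component.
str(l)
```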
− |
| |
− | \subsection{Types of data modes in \textsf{R}}
| |
− |
| |
− | Data modes in \textsf{R} refer to the type of elements stored in each
| |
− | position of the objects.
| |
− |
| |
− |
| |
− | \begin{itemize}
| |
− | \item \textbf{\textit{Numerical.}} Real or integer numbers. There is a
| |
− | different data mode for complex numbers (\textbf{\textit{complex}}).
| |
− | \item \textbf{\textit{Character.}} Strings of text. They are always
| |
− | displayed inside quotes, and must be entered in that way.
| |
− | \item \textbf{\textit{Factor.}} Variables that take a limited
| |
− | number of different values. In statistics, they are usually referred to as
| |
− | ``categorical variables''. Some statistical procedures require certain
| |
− | variables to be factors. In the console they are displayed
| |
− | like character values but without quotes, followed by their levels.
| |
− | \item \textbf{\textit{Logical.}} {\tt TRUE/FALSE}
| |
− | \end{itemize}
| |
− | The data mode can be checked with the {\tt mode()} function.
| |
− | <<>>=
| |
− | mode(v)
| |
− | mode(colnames(d))
| |
− | @
| |
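
The data modes listed above can be compared with a small sketch. The vectors below are hypothetical examples, not objects from the text. Note that {\tt mode()} reports a factor as numeric, because factors are stored internally as integer codes; {\tt class()} is what identifies them as factors.

```r
# Hypothetical example vectors, one per data mode discussed above.
x_num <- c(1.5, 2, 3)
x_chr <- c("low", "high", "low")
x_fac <- factor(x_chr)
x_lgl <- c(TRUE, FALSE, TRUE)

mode_num <- mode(x_num)   # "numeric"
mode_chr <- mode(x_chr)   # "character"
mode_lgl <- mode(x_lgl)   # "logical"

# A factor is stored as integer codes plus a set of levels.
mode_fac <- mode(x_fac)    # "numeric"
class_fac <- class(x_fac)  # "factor"
fac_levels <- levels(x_fac)  # levels are sorted alphabetically by default
```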
It is an object with only one dimension. To show how to access the data, let's create an object named v
which contains the nine integers from 1 to 9. This vector can be created with the c() function, whose arguments are the numbers to be concatenated into a vector. The "arrow" <- assigns
the vector to the object named v.
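
As a side note on the creation step, the following minimal sketch compares c() with the colon operator. The name w is hypothetical. Both objects hold the same values, but numbers typed as 1, 2, 3 are stored as doubles, whereas 1:9 produces integers.

```r
# The vector from the text, built by concatenating nine numbers.
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# A shorter way to build the same sequence (w is a hypothetical name).
w <- 1:9

# The values are equal element by element...
same_values <- all(v == w)

# ...but the storage types differ, so identical() reports a difference.
type_v <- typeof(v)   # "double"
type_w <- typeof(w)   # "integer"
same_object <- identical(v, w)
```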