Introduction to R

From AWF-Wiki
Revision as of 17:34, 3 December 2014 by Aknop (Talk | contribs)


About this course

In this lab we introduce R and explain the basics needed to start working with it. The lab is not meant to be an exhaustive course in R, but some basic knowledge of the software and the programming language is required to understand the exercises in the labs. The objectives of the labs are to practice the concepts learned in the theoretical sessions for a better understanding, and to become familiar with forest monitoring data management and analysis.

About R

R is an integrated set of software tools for data manipulation, calculation and graphics generation. Among other features, R has:

  i) an effective system for data manipulation and storage,
  ii) efficient operators for indexed variables (arrays), matrices in particular,
  iii) powerful graphical possibilities for data analysis,
  iv) a well-developed, simple and effective programming language.
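
The first two features can be sketched in a few lines of base R. This is only an illustration: the variable names (dbh, basal_area, large, m) and all values are invented here and do not come from the course data.

```r
# Hypothetical tree diameters at breast height (cm); values invented for illustration
dbh <- c(12.5, 30.1, 22.4, 18.0)

# Vectorised arithmetic: basal area of each tree in square metres
# (dbh / 200 converts a diameter in cm to a radius in m)
basal_area <- pi * (dbh / 200)^2

# Indexing: keep only the trees thicker than 20 cm
large <- dbh[dbh > 20]

# A 2 x 2 matrix filled column by column, and its transpose
m <- matrix(1:4, nrow = 2)
t(m)

# Matrix product versus element-wise product
m %*% m
m * m
```

Note that the same arithmetic operators work on single numbers, whole vectors and matrices, so no explicit loop is needed in this sketch.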

To illustrate the wide range of things that can be done in R, the documentation you are reading now was produced with it, including the text editing with LaTeX. R is widely used among statisticians and data miners for data analysis.

Advantages of R

R can be described as an open source implementation of the programming language S, which was developed by Rick Becker, John Chambers and Allan Wilks under an AT&T Software Sales license. Although a lot of software is available for statistical analysis and data management, R has the following advantages over other statistical packages:

- It is free. This point is especially relevant because licenses for most statistical packages are very expensive.

- It is universal. R runs on several platforms: Windows, Linux and Mac.

- It is continuously improving. R is the most widespread option among statisticians and programmers developing new statistical methods. New R libraries appear continuously, and it is also possible to write our own.

- Help. R has an exceptional help system, both built into the corresponding libraries (through the help() function and the ? and ?? operators) and in a large collection of books, some of them free. There is also a huge number of active users sharing their expertise in forums and blogs.

- Graphical output. R produces high-quality, highly customizable graphics that are suitable for publication.

- Command-line interface. It facilitates learning statistics, as one must know exactly what one is doing. This system is ideal for reproducing an analysis with other data, sharing it with colleagues or publishing it.

- Productivity. R is a full programming language. Compared with window-based (point-and-click) statistical packages, R allows a very efficient way of working on iterative and parallel tasks.

- Scaling. It is easy to create functions and libraries from R scripts, for our own use or for sharing. This greatly facilitates reuse.

- Open source. It is possible to access the code and learn how it works, or even modify it. It is not necessary to learn other programming languages to do so, as most R functions are programmed in R.
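
Some of the points above (the help system, writing our own functions, and reading the source of existing ones) can be tried directly in an R session. The following is a minimal sketch using only base R; the function name cv and the numbers passed to it are hypothetical and chosen only for illustration.

```r
# Help pages are meant for interactive sessions, so they are only opened there
if (interactive()) {
  help(mean)                # open the help page of a function
  ?mean                     # shorthand for the same help page
  ??"standard deviation"    # search the installed documentation for a phrase
}

# A small user-defined function; the name `cv` is hypothetical
cv <- function(x) {
  # coefficient of variation in percent
  100 * sd(x) / mean(x)
}

# Apply the new function to some invented values
cv(c(10, 12, 14))

# Typing a function name without parentheses prints its source code
cv
sd
```

Printing sd shows that this function is itself written in R, which is what makes it possible to study or adapt such code without learning another language.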


R is usually operated through R scripts, which contain the R code describing and executing the calculations to be done. A script is a plain text file that can be read by any text editor.
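
As a rough illustration, a very small script could look like the sketch below. The file name first_script.R, the variable names and the tree heights are assumptions made up for this example, not part of the course material.

```r
# first_script.R (hypothetical file name)

# Hypothetical tree heights in metres; values invented for illustration
height <- c(18.2, 21.5, 19.8, 25.1, 23.0)

# Summary statistics
mean_height <- mean(height)
sd_height <- sd(height)

# Print the results to the console
cat("Number of trees:", length(height), "\n")
cat("Mean height (m):", round(mean_height, 2), "\n")
cat("Standard deviation (m):", round(sd_height, 2), "\n")
```

Saved under that name, the file could be run from the R console with source("first_script.R"), or from a system terminal with Rscript first_script.R.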
