Stanford School of Medicine

LaneConnex
Lane Medical Library & Knowledge Management Center

a division of IRT

How can I get up to speed with R for processing statistical data?

What is it?

R is a computer language and environment for statistical computing and graphics. It is widely used in biostatistics, including microarray and proteomics data analysis.

R provides a wide variety of statistical techniques:
  • linear and nonlinear modelling
  • classical statistical tests
  • time-series analysis
  • classification
  • clustering
R also provides graphical techniques, although native R is not strong on graphics [example 1] [example 2]. It is also highly extensible, meaning you can modify existing code or write completely new functions.
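As a rough illustration of the points above, the following minimal sketch fits a linear model, runs a classical test, and defines a new function. It assumes only a standard R installation with its built-in `cars` and `sleep` datasets; the function name `coef_var` is hypothetical and chosen for this example only.

```r
# Linear modelling: stopping distance versus speed in the built-in 'cars' data.
fit <- lm(dist ~ speed, data = cars)
print(coef(fit))          # intercept and slope of the fitted line

# Classical statistical test: Welch two-sample t-test on the built-in 'sleep' data.
tt <- t.test(extra ~ group, data = sleep)
print(tt$p.value)

# Extensibility: a user-defined function (hypothetical name, for illustration only)
# that returns the coefficient of variation of a numeric vector.
coef_var <- function(x) sd(x) / mean(x)
cv_speed <- coef_var(cars$speed)
print(cv_speed)
```

Typing `summary(fit)` at the R prompt afterwards prints a fuller report of the fitted model.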

However, R is not for everyone. If you have only basic statistical needs, or are not at least somewhat familiar with programming, R is not for you. In that case, a program such as GraphPad's InStat or Excel's statistical functions may be preferable.

S+, a commercial equivalent of R, is by far the better solution if you require significant graphing capabilities; this language is essentially identical to R and requires purchase of a moderately priced license.

What is it for?

Here are selected examples applicable to life sciences; details are available here:
  • Microarray data analysis, particularly using BioConductor
  • Bayesian Inference
  • Cluster Analysis & Finite Mixture Models
  • Analysis of ecological and environmental data
  • Statistical Genetics
  • Machine Learning & Statistical Learning
  • Multivariate Statistics
  • Analysis of Spatial Data
  • Graphical models in R
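
To give a feel for one of the areas listed above, here is a minimal cluster analysis sketch. It assumes only base R and the built-in `iris` dataset; the choice of average linkage and of three clusters is an assumption made for illustration, not a recommendation.

```r
# Cluster analysis: hierarchical clustering of the four numeric
# flower measurements in the built-in 'iris' data.
measurements <- scale(iris[, 1:4])        # standardize each column
tree <- hclust(dist(measurements), method = "average")

# Cut the tree into three groups and compare them with the known species.
groups <- cutree(tree, k = 3)
print(table(groups, iris$Species))
```

Calling `plot(tree)` draws the corresponding dendrogram.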

Obtaining R

R is free and can be obtained here. It compiles and runs on:

IMPORTANT: If you don't know what "compiling" means, R may not be suited to you:
  • If you know some programming, you may want to consult Appendix E of Using R for Introductory Statistics (Verzani 2004), available from Lane. It will provide you with the essentials of the language in very concise form.
  • If you don't know at least some programming, the learning curve for R will likely be very significant, and it might be preferable to use another tool until you are comfortable with at least one programming language.
You can also use R on computers in the M202 Computer Laboratory: all PCs in the M202 Teaching Lab have R already installed, usable by anyone with a SUNetID.

R training

  1. Hung Chen's succinct overview of R programming
  2. David Metz and Brad Hunting's excellent R tutorial:
    Other selected training documents:

Key references

Source

Lane Librarian

Record created 9/21/2006; updated 1/12/2007.

ypouliot, September 16, 2009
