====== R Cheat Sheet ======
[[http://www.r-project.org/ | R]] is a free software environment for statistical computing and graphics. These notes summarize the [[http://tryr.codeschool.com/ | free R CodeSchool tutorial]].
===== Basics =====
* ''R'' is the command-line interpreter
* ''install.packages("ggplot2")'' installs an additional package; ''library(ggplot2)'' then loads it
* Expressions are evaluated and displayed, e.g. ''1'', ''1+1'', ''"Hello World"''
* Booleans are e.g. ''1==1'', ''3>4'', ''TRUE'', ''T'', ''FALSE'', ''F''
* For variable assignment ''x=1'' or ''x<-1''
* For help on a function use ''help(sum)'' , ''help(package='ggplot2')'' or ''example(sqrt)''
* Operations are ''+ - * / = <-''
* ''NA'' is used to express a missing or unknown data value. Most expressions involving ''NA'' return ''NA''.
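A minimal sketch of the basics above. The variable names are made up for illustration, and ''na.rm'' is an argument of ''sum'' that is not covered in the notes.

```r
# Assignment with <- and a comparison with == (a single = would assign)
total <- 1 + 1
is_equal <- (total == 2)

# NA marks a missing value and propagates through arithmetic
missing_sum <- NA + 1

# Many summary functions accept na.rm = TRUE to skip missing values
clean_sum <- sum(c(1, NA, 3), na.rm = TRUE)

print(total)
print(is_equal)
print(is.na(missing_sum))
print(clean_sum)
```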
===== Vectors =====
* To create a vector, use the combine command ''c(4,7,9)''
* All elements of a vector must have the same type; mixed values are coerced to a common type (e.g. to strings).
* ''a:b'' creates a vector of integers from a to b.
* ''seq(a,b,s)'' creates a vector of numbers from a to b in increments of s
* ''myseq[3]'' accesses the third element, i.e. vectors are indexed starting at 1.
* Use a vector as an index to access multiple elements e.g. ''myseq[c(1,3)]''
* The ''names'' function can be used to assign names to vector elements. Once names are assigned, they can be used as indices e.g.
names(myseq)=c('one','two','three')
myseq['two']
* ''myseq + 1'' adds one to all elements of the myseq vector.
* Scalar operations and functions applied to vectors typically produce other vectors, e.g. ''+'', ''-'', ''=='', ''sin(myseq)''
* ''head(myvec)'' , ''tail(myvec)'' to show start or end of vector
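The vector operations above can be tried together as follows. This is a sketch with made-up values; ''mixed'' is a hypothetical name used only to show coercion.

```r
myseq <- seq(5, 15, 5)                     # 5 10 15
names(myseq) <- c("one", "two", "three")   # label the elements

second <- myseq["two"]              # access by name
first_and_third <- myseq[c(1, 3)]   # access by a vector of positions
shifted <- myseq + 1                # arithmetic is applied to every element

mixed <- c(1, TRUE, "three")        # mixed types are coerced to character

print(second)
print(first_and_third)
print(shifted)
print(mixed)
```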
===== Plotting =====
* ''barplot(myseq)'' creates a bar plot of the ''myseq'' vector. ''abline(h=y)'' plots a horizontal line at height y.
* ''plot(x,y)'' plots x vs y e.g.
x=seq(0,20,.1)
y=sin(x)
plot(x,y)
* ''contour(mymat)'' plots a contour map of a matrix.
* ''persp(mymat)'' plots the matrix as a 3D surface in perspective.
* ''image(mymat)'' generates a heat map of a matrix, e.g. ''image(volcano)'' for the built-in ''volcano'' data set.
* ''qplot(weights, prices, color=types)'' gives more attractive plots using the ggplot2 package.
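A short sketch combining the base plotting calls above. It uses only base R and the built-in ''volcano'' matrix; ''type = "l"'' and ''add = TRUE'' are standard arguments that the notes do not mention.

```r
# Sine curve drawn as a line, with a horizontal reference line at 0
x <- seq(0, 20, 0.1)
y <- sin(x)
plot(x, y, type = "l")
abline(h = 0)

# Heat map of the built-in volcano matrix with contour lines on top
image(volcano)
contour(volcano, add = TRUE)

# The same matrix as a 3D surface
persp(volcano)
```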
===== Matrices =====
* ''matrix(0,3,4)'' creates a 3x4 matrix with all elements 0.
* ''matrix(1:12,3,4)'' creates a 3x4 matrix with numbers 1-12.
* ''dim(mymatrix)'' returns the dimensions of a matrix; assigning to it, e.g. ''dim(myvec)=c(3,4)'', reshapes a 12-element vector into a 3x4 matrix.
* ''mymatrix[3,4]'' returns an element of the matrix (row,column).
* ''mymatrix[,2]'' returns entire second column.
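A minimal sketch of creating, indexing and reshaping a matrix. Note that ''matrix'' fills column by column by default; the variable names are illustrative.

```r
mymatrix <- matrix(1:12, 3, 4)   # 3 rows, 4 columns, filled column by column
element <- mymatrix[3, 4]        # row 3, column 4
second_col <- mymatrix[, 2]      # entire second column

# Reshape a plain vector into a 3x4 matrix by assigning its dimensions
myvec <- 1:12
dim(myvec) <- c(3, 4)

print(mymatrix)
print(element)
print(second_col)
```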
===== Data Sets =====
* ''factor'' is a collection type for categorized values - ''myfac=factor(myvec)''
* ''factor''s group unique string values as ''level''s e.g. levels(myfac) shows unique levels.
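* ''as.integer(myfac)'' returns the level of each value as an integer code, which can be used to set the plot symbol, e.g. ''plot(weights, prices, pch=as.integer(types))''
* ''legend("topright", levels(types), pch=1:length(levels(types)))'' adds a matching legend to the plot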
* A data frame collects sets of related values (i.e. sets of columns with values in the same order) e.g. ''mydf=data.frame(weights,prices,types)''
* To extract a column, use double square brackets with the column index or name e.g. ''mydf%%[['weights']]%%'' or just a dollar sign e.g. ''mydf$prices''
* ''merge'' merges data sets by joining on shared column names
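The following sketch ties factors, data frames and ''merge'' together. All values, and the ''origins'' data frame with its ''origin'' column, are hypothetical and exist only for illustration.

```r
weights <- c(300, 200, 100)
prices <- c(9000, 5000, 12000)
types <- factor(c("gold", "silver", "gold"))

type_levels <- levels(types)     # unique categories
type_codes <- as.integer(types)  # integer code of each value

# Collect the related columns into a data frame
mydf <- data.frame(weights, prices, types)
w <- mydf[["weights"]]   # extract a column by name
p <- mydf$prices         # same idea with the dollar sign

# Use the integer codes as plot symbols and add a legend
plot(weights, prices, pch = type_codes)
legend("topright", type_levels, pch = 1:length(type_levels))

# merge joins on the shared column name, here "types"
origins <- data.frame(types = c("gold", "silver"),
                      origin = c("mine A", "mine B"))
merged <- merge(mydf, origins)
print(merged)
```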
===== Statistics =====
* ''mean(myvec) median(myvec) sd(myvec)''
* ''cor.test(cola, colb)'' tests for correlation between two vectors (Pearson's product-moment by default)
* ''line = lm(cola ~ colb)'' calculates a linear model between cola and colb that can be plotted with ''abline(line)''
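A sketch of the statistics functions above on made-up data. The model is stored as ''fit'' rather than ''line'' here, since ''line'' is also the name of a function in the stats package; that naming choice is an assumption of this example, not part of the notes.

```r
colb <- c(1, 2, 3, 4, 5)
cola <- c(2.1, 3.9, 6.2, 7.8, 10.1)

avg <- mean(cola)
mid <- median(cola)
spread <- sd(cola)

# Pearson correlation test between the two columns
result <- cor.test(cola, colb)

# Linear model of cola as a function of colb, drawn over a scatter plot
fit <- lm(cola ~ colb)
plot(colb, cola)
abline(fit)

print(avg)
print(mid)
print(result$estimate)
print(coef(fit))
```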
===== File Handling =====
* ''list.files()'' to list files in the current directory
* ''source("file.R")'' to load file of code
* ''read.csv('mydat.csv')'' to load a csv file
* ''read.table'' to read text data with other separators
* ''con<-url("http://google.com","r")'' opens a read connection to a webpage
* ''x<-readLines(con)'' reads the connection into a character vector of lines; ''close(con)'' closes it afterwards
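A self-contained sketch of reading data from a file. To avoid depending on a network connection or an existing ''mydat.csv'', it first writes a temporary CSV file with made-up values, and it uses ''file'' instead of ''url'' for the connection; ''readLines'' and ''close'' work the same way on both.

```r
# Write a small CSV file to a temporary location (hypothetical data)
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(weights = c(300, 200), prices = c(9000, 5000)),
          csv_path, row.names = FALSE)

# read.csv assumes a header row and comma separators
mydat <- read.csv(csv_path)

# read.table needs the header and separator spelled out
mytab <- read.table(csv_path, header = TRUE, sep = ",")

# Read the raw lines through a connection, then close it
con <- file(csv_path, "r")
raw_lines <- readLines(con)
close(con)

print(mydat)
print(raw_lines)
```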