NSCI 580A3

Instructors
Tai Montgomery
Erin Nishimura


# Intro to R Tutorial

Open RStudio for a brief introduction to R.

#### Variables

Variables can be character, integer, numeric, logical (TRUE or FALSE), factor, etc.
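As a quick illustration, the sketch below stores one value of each type and asks R what it is with `class()`. The variable names and values are made up for this example.

```r
sampleName <- "guts1"                           # character
replicate  <- 3L                                # integer (the L suffix marks an integer)
fraction   <- 0.96                              # numeric
isMapped   <- TRUE                              # logical
tissue     <- factor(c("guts", "N2", "guts"))   # factor with two levels

class(sampleName)   # "character"
class(replicate)    # "integer"
class(fraction)     # "numeric"
class(isMapped)     # "logical"
class(tissue)       # "factor"
```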

#### Objects

Save a new object with

newObject <- 1

Return new object

newObject

Determine type of variables (in a vector) or the type of object:

class(newObject)

Types of basic objects:

| Dimensions | Homogeneous variables | Heterogeneous variables |
|------------|-----------------------|-------------------------|
| 1          | vector                | list                    |
| 2          | matrix (numeric)      | data frame              |
| n          |                       | complex custom objects  |
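To make these object types concrete, here is a minimal sketch that builds one of each of the four basic types. The object names and contents are arbitrary.

```r
v <- c(1, 2, 3)                                  # vector: one dimension, one type
l <- list(1, "a", TRUE)                          # list: one dimension, mixed types
m <- matrix(1:6, nrow = 2)                       # matrix: two dimensions, one type
d <- data.frame(id = c("a", "b"), n = c(1, 2))   # data frame: columns can differ in type

is.vector(v)       # TRUE
is.list(l)         # TRUE
is.matrix(m)       # TRUE
is.data.frame(d)   # TRUE
dim(m)             # 2 3  (rows, columns)
```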

#### Vectors

newVector <- c(1,2,3)      # to create vector
newVector                  # to return vector values
newVector[1]               # to access the first element of the vector

Certain functions can be called on vectors:

length(newVector)          # report the length of the vector
help(length)               # look up the instructions for the length function
max(newVector)
mean(newVector)

newVector + 2              # Mathematical operations can also be performed on all elements of a vector
rm(newVector)              # remove vector from the workspace

Typical syntax for executing R functions is:

functionName(object, option = <value>, option2 = <value>, option3 = <value>)

Documentation pages can show you the types of objects that can be taken as input, options available, and default settings.

To save the output of an R function, capture it in an object:

newObjectName <- functionName(object, option = <value>, option2 = <value>, option3 = <value>)
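As a concrete instance of this pattern, the sketch below calls two base R functions, `mean()` and `round()`, on a made-up vector. `na.rm` and `digits` are the named options, and the results are captured in new objects.

```r
values <- c(4.25, 8.5, NA, 2.75)       # example data with one missing value

mean(values)                           # NA, because one value is missing
avg <- mean(values, na.rm = TRUE)      # na.rm = TRUE drops missing values first
avg                                    # 5.166667
rounded <- round(avg, digits = 1)      # digits sets the number of decimal places
rounded                                # 5.2
```

Running `help(mean)` or `help(round)` lists these options and their defaults.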

#### Data frames

x = c('a', 'b', 'c')
y = c(1,2,3)
newDF <- data.frame(x, y)     # to create a data frame
newDF                         # to report data frame values
newDF$x          # To subset columns as vectors
newDF$y
newDF[,1]        # To subset columns as vectors
newDF[,2]
newDF[1,]        # To subset rows (each row comes back as a one-row data frame)
newDF[2,]

Functions that work on data frames:

dim(newDF)
colnames(newDF)
rownames(newDF)
rownames(newDF) <- c("gene1", "gene2", "gene3")  # Overwrite rownames
newDF
newDF[1,2]
newDF[1,2] <- 5                                  # Overwrite a single element (row 1, column 2)
newDF
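Besides numeric positions, rows and columns can also be selected by name or by a logical condition. The sketch below rebuilds the example data frame so that it runs on its own; `bigRows` is a name chosen just for this illustration.

```r
newDF <- data.frame(x = c("a", "b", "c"), y = c(1, 2, 3))
rownames(newDF) <- c("gene1", "gene2", "gene3")

newDF["gene2", "y"]               # select by row name and column name: 2
bigRows <- newDF[newDF$y > 1, ]   # keep only the rows where y is greater than 1
bigRows
nrow(newDF)                       # number of rows: 3
ncol(newDF)                       # number of columns: 2
str(newDF)                        # compact summary of each column's type
```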

#### Getting data into R

First, make sure you're in the right directory:

getwd()                           # Get the working directory
setwd('/Users/Erin/mydirectory')  # Set a new working directory

Quick tip: You can also set the working directory under the Files Tab (lower right panel). Under More there is the option to Set As Working Directory. You can save the resulting command in your code.

Now, you can upload a file into R:

df <- read.table("file.txt", header = TRUE, sep = "\t") # Input a tab-separated file named file.txt; use header = FALSE if the file has no header row
help(read.table)

Common pitfalls: column headers should not contain spaces or special characters, and the file should have no extra trailing rows or columns.
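To see what happens to a header that contains a space, the sketch below reads a made-up two-line table from a string instead of a file, using the `text` argument of `read.table`. With the default `check.names = TRUE`, R rewrites the header into a syntactically valid name.

```r
# Hypothetical tab-separated input with a space in the second header
rawText <- "Sample\tInput reads\nguts1\t36636027"

demo <- read.table(text = rawText, header = TRUE, sep = "\t")
colnames(demo)    # "Sample" "Input.reads"  (the space became a dot)

# check.names = FALSE keeps the header as written, but the column
# then has to be accessed with backticks or quotes
demoRaw <- read.table(text = rawText, header = TRUE, sep = "\t", check.names = FALSE)
colnames(demoRaw) # "Sample" "Input reads"
```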

Exercise: Make a file in a directory on the server containing the following information and name it RNAseq_stats.txt.

Sample	Input	Mapped
guts1	36636027	35201820
guts2	24131701	23305661
guts3	18635372	17951602
N21	21315252	20365046
N22	23326031	22573853
N23	31648497	30711043

Exercise: Import the data into R using read.table and save the information as an object called RNAstats.

#### Getting data out of R

write.table(newDF, "file.txt")  # Saves a data frame or matrix in a .txt file.
write(newVector, "file.txt")    # Saves a vector in a .txt file.
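By default, `write.table` separates fields with spaces, puts quotes around text, and adds a column of row names. A minimal sketch of writing a plain tab-separated file and reading it back is shown below; it writes to a temporary file (`outFile`, a name chosen for this example) so that it does not touch your working directory.

```r
newDF <- data.frame(x = c("a", "b", "c"), y = c(1, 2, 3))
outFile <- tempfile(fileext = ".txt")    # temporary path used only for this sketch

# Tab-separated, no quotes around text, no row-name column
write.table(newDF, outFile, sep = "\t", quote = FALSE, row.names = FALSE)

readLines(outFile)                       # "x\ty" "a\t1" "b\t2" "c\t3"
roundTrip <- read.table(outFile, header = TRUE, sep = "\t")
roundTrip                                # same contents as newDF
```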

#### Plotting

x <- c(1,2,3,4,5)
y <- c(10,22,15,2,20)

plot(x, y, col = "red")
help(plot)
help(par)
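A few of the options described in `help(plot)` and `help(par)` are sketched below. The axis labels and title are placeholders made up for this example.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(10, 22, 15, 2, 20)

plot(x, y,
     col  = "red",
     pch  = 19,               # solid circles instead of open ones
     type = "b",              # draw both points and connecting lines
     xlab = "Sample index",   # placeholder axis labels
     ylab = "Value",
     main = "Example plot")   # placeholder title
```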

Exercise: Use the plot function to try to plot either the input or the mapped read counts in your RNAstats object.

#### Getting plots out of R

pdf("filename.pdf")      # Open a PDF file as the output destination
plot(x, y, col = "red")  # Draw the plot into the PDF
dev.off()                # Close the PDF device so the file is written

help(pdf)                # Look up more options like setting the dimensions, resolution, etc.
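For example, the `width` and `height` arguments of `pdf()` set the page size in inches. The sketch below writes to R's temporary directory under a made-up file name and then checks that the file exists.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(10, 22, 15, 2, 20)
outPdf <- file.path(tempdir(), "example_plot.pdf")   # hypothetical file name

pdf(outPdf, width = 5, height = 4)   # page size in inches
plot(x, y, col = "red")
dev.off()                            # the file is complete only after dev.off()

file.exists(outPdf)                  # TRUE once the device has been closed
```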

Quick tip: You can also save plots by locating the Export pull down menu in the Plots panel (lower right).

Here's the code from the tutorial we did in class:

newObject <- 1

newObject

class(newObject)

newVector <- c(1,2,3,4)
newVector
newVector[1]

# Functions
length(newVector)
help(length)
max(newVector)
mean(newVector)

newVector + 2
rm(newVector)
newVector          # Error: object 'newVector' not found, because rm() removed it
help(mean)

# Data frames
x <- c('a', 'b', 'c')
y <- c(1,2,3)
newDF <- data.frame(x,y)
newDF

newDF$x
newDF[1,2]
newDF[,1]
newDF[1,]

dim(newDF)
colnames(newDF)
rownames(newDF)
newDF

rownames(newDF) <- c("gene1", "gene2", "gene3")
newDF
newDF[1,2] <- 5

# import export
getwd()
setwd("~/Documents/RNA-seq_part2")
getwd()

x <- c(1,2,3,4,5)

y <- c(10,20,22,15,30)
plot(x,y, col = "red")

help(plot)
help(par)

pdf("plot.pdf")
plot(x,y, col = "red")
dev.off()