In Section 1.5, we discussed the different types of R objects. For example, a vector has a certain data type and a set number of elements. Here we construct a vector called x that increases from -5 to 5 by one unit:
(x <- -5:5)
[1] -5 -4 -3 -2 -1 0 1 2 3 4 5
The vector x has 11 elements. If we want to know the 6th element of x, we can index the vector. To do this, we place [] square brackets after x with the position inside. For example, we index the 6th element of x:
x[6]
[1] 0
Whenever we use [] next to an R object, R returns the elements at the positions given inside the square brackets. We can index an R object with multiple values:
x[1:3]
[1] -5 -4 -3
x[c(3, 9)]
[1] -3 3
Notice how the second line uses c(). This is necessary when we want to specify non-contiguous elements. Now let’s see how we can index a matrix.
A matrix can be indexed the same way as a vector using the [] brackets. However, since a matrix is a 2-dimensional object, we need to include a comma to separate the dimensions: [,]. The first position indexes the rows and the second position indexes the columns. To begin, we create the following \(4 \times 3\) matrix:
(x <- matrix(1:12, nrow = 4, ncol = 3))
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
Now to index the element at row 2 and column 3, use x[2, 3]:
x[2, 3]
[1] 10
We can also index an entire row or an entire column by leaving the other position empty:
x[2, ]
[1] 2 6 10
x[, 3]
[1] 9 10 11 12
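Row and column positions can also be vectors, just as with a one-dimensional vector. The sketch below rebuilds the same \(4 \times 3\) matrix; the names `m` and `sub` are illustrative and not part of the original example.

```r
# Rebuild the 4 x 3 matrix under a different (illustrative) name
m <- matrix(1:12, nrow = 4, ncol = 3)

# Rows 1 and 3, columns 2 through 3: the result is a 2 x 2 matrix
sub <- m[c(1, 3), 2:3]
print(sub)
```

Because more than one row and more than one column are requested, the result stays a matrix rather than collapsing to a vector.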
There are several ways to index a data frame. Since it has a matrix-like format, you can index it the same way as a matrix. Here are a couple of examples using the mtcars data frame.
mtcars[, 2]
[1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
mtcars[2, ]
              mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4
However, a data frame has labeled components (its variables), so we can also index the data frame with a variable name inside the brackets:
mtcars[, "cyl"]
[1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
Lastly, a specific variable of a data frame can be accessed using the $ notation, as described in Section 1.5.5.
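As a brief reminder of the `$` notation, the sketch below extracts the `cyl` variable from `mtcars` and checks that it matches the bracket-based result. The names `cyl_dollar` and `cyl_bracket` are illustrative.

```r
# Extract the cyl column with $ and with bracket indexing
cyl_dollar  <- mtcars$cyl
cyl_bracket <- mtcars[, "cyl"]

# Compare the two results element by element and in type
same_result <- identical(cyl_dollar, cyl_bracket)
print(same_result)
```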
As described in Section 1.5.6, lists contain elements holding different R objects. To index a specific element of a list, you use [[]] double brackets. Below is a toy list:
toy_list <- list(mtcars = mtcars,
                 vector = rep(0, 4),
                 identity = diag(rep(1, 3)))
To access the second element, the vector element, you can type toy_list[[2]]:
toy_list[[2]]
[1] 0 0 0 0
Since the elements are labeled within the list, you can place the label in quotes inside [[]]:
toy_list[["vector"]]
[1] 0 0 0 0
The element can also be accessed using the $ notation with a list:
toy_list$vector
[1] 0 0 0 0
Lastly, you can further index the list if needed. For example, we can access the mpg variable in mtcars from toy_list in several ways:
toy_list$mtcars$mpg
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
toy_list[["mtcars"]]$mpg
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
toy_list$mtcars[, 'mpg']
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
In R, there are control flow functions that dictate how a program is executed. The first set of functions we will talk about are if and else statements. First, the if statement evaluates a condition. If the condition is satisfied (yields TRUE), R carries out a certain task; if it fails (yields FALSE), the else statement directs R to a different task. Below is a general format:
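As a minimal sketch of that format, the block below uses a placeholder variable `value` and placeholder messages stored in `result`; these names are chosen for illustration only.

```r
value <- 3  # placeholder input

if (value > 0) {
  # runs when the condition yields TRUE
  result <- "condition satisfied"
} else {
  # runs when the condition yields FALSE
  result <- "condition failed"
}
print(result)
```

Note that `else` sits on the same line as the closing brace of the `if` block; at the top level of a script, R would otherwise treat the `if` statement as already complete.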
Below is an example where we generate x from a standard normal distribution and print ‘Positive’ or ‘Non-Positive’ based on the condition x > 0.
x <- rnorm(1)

## if statements
if (x > 0){
  print("Positive")
} else {
  print("Non-Positive")
}
[1] "Non-Positive"
What if we also want to print ‘Negative’ when the value is negative? We then need to add another if statement after the else statement, since x > 0 only tells us whether the value is positive.
x <- rnorm(1)

if (x > 0){
  print("Positive")
} else if (x < 0) {
  print("Negative")
}
[1] "Negative"
Above, we add an if statement with the condition (x < 0) to check whether the number is negative. Lastly, if x is ever \(0\), we want R to let us know it is \(0\). We can achieve this by adding one last else statement:
x <- rnorm(1)

if (x > 0){
  print("Positive")
} else if (x < 0) {
  print("Negative")
} else {
  print("Zero")
}
[1] "Positive"
for loops
A for loop is a way to repeat a task a certain number of times. Every time a loop repeats the task, we call it an iteration of the loop. In each iteration, we may change the inputs in some way, for example by taking the next value from an indexed vector, and then repeat the task. The general anatomy of a loop looks like:
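A minimal sketch of that anatomy is shown here. The contents of `vectorr` and the task inside the braces are placeholders chosen for illustration.

```r
vectorr <- c(2, 4, 6)  # placeholder values that i will take on

for (i in vectorr) {
  # the task to repeat goes here; this sketch simply prints i
  print(i)
}
```

The loop runs three times because `vectorr` has three elements, and `i` holds a different element on each pass.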
The for statement indicates that you will repeat the task inside the curly brackets. The i in the parentheses controls how the task will be completed. The in statement tells R where i can look for its values, and vectorr is a vector R object that contains the values i can take. The length of the vector also controls how many times the task is repeated.
Learning about loops can be quite challenging. My recommendation is to read the section below and experiment with the example code, even breaking it, so you can understand how a for loop works.
for loop
Let’s say we want R to print one to five separately. We can achieve this by repeating print() 5 times.
print(1); print(2); print(3); print(4); print(5)
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
However, this takes quite a while to type up. Let’s try to achieve the same task using a for loop.
for (i in 1:5){
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
Here, i takes on each value from the vector 1:5 in turn. Then, R prints out the value of i.
Now, let’s try another example with letters. To begin, create a new vector called letters_10 containing the first 10 letters of the alphabet. Use the vector letters to construct the new vector.
letters_10 <- letters[1:10]
Now, we will use a loop to print out the first 10 letters:
for (i in 1:10) {
  print(letters_10[i])
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[1] "f"
[1] "g"
[1] "h"
[1] "i"
[1] "j"
Here, we have i take on the values 1 through 10. Using those values, we index the vector letters_10 by i. The resulting letter is then printed. This task is repeated 10 times.
Lastly, we can replace 1:10 by letters_10 instead:
for (i in letters_10){
  print(i)
}
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[1] "f"
[1] "g"
[1] "h"
[1] "i"
[1] "j"
This works because letters_10 contains the values that we want to print, and i takes on each value of letters_10 in turn.
for loops
A nested for loop is a loop that contains another loop within it. Below is a general example of 3 for loops nested within each other:
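One way to sketch three nested loops is shown below. The index names `i`, `j`, and `k`, the ranges, and the counter `n_runs` are illustrative assumptions.

```r
n_runs <- 0  # counts how many times the innermost task runs

for (i in 1:2) {        # outer loop
  for (j in 1:3) {      # middle loop, restarted for every i
    for (k in 1:4) {    # inner loop, restarted for every (i, j) pair
      n_runs <- n_runs + 1
    }
  }
}
print(n_runs)  # 2 * 3 * 4 = 24
```

The innermost task runs once for every combination of the three indices, so the total number of runs is the product of the three vector lengths.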
As an example, we will use the greekLetters package and its greek_vector vector to obtain Greek letters in R. Lastly, create a vector called greek_10.
library(greekLetters)
greek_10 <- greek_vector[1:10]
For this example, we want R to print “a” and “\(\alpha\)” together as demonstrated below:
print(paste0(letters_10[1], greek_10[1]))
[1] "aα"
Now let’s repeat this process to print all possible combinations of the first 3 letters and the first 3 Greek letters:
for (i in 1:3){
  for (ii in 1:3){
    print(paste0(letters_10[i], greek_10[ii]))
  }
}
[1] "aα"
[1] "aβ"
[1] "aγ"
[1] "bα"
[1] "bβ"
[1] "bγ"
[1] "cα"
[1] "cβ"
[1] "cγ"
break
A break statement is used to stop a loop midway if a certain condition is met. A general setup of a break statement is as follows:
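A minimal sketch of such a setup follows. The data in `values`, the condition `v < 0`, and the vector `seen` are placeholders chosen for illustration.

```r
values <- c(4, 8, -1, 5)  # placeholder data
seen <- c()               # values visited before the loop stops

for (v in values) {
  if (v < 0) {
    break  # leave the loop entirely once the condition is met
  }
  seen <- c(seen, v)
}
print(seen)
```

The loop stops at the third value, so the final 5 is never visited.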
As you can see, there is an if statement in the loop. This tells R when to break the loop. If the if statement were not there, the loop would break during its first iteration.
To demonstrate the break statement, we will simulate from a \(N(1,1)\) distribution until we have 30 positive numbers or we simulate a negative number.
x <- rep(NA, length = 30)

for (i in seq_along(x)){
  y <- rnorm(1, 1)
  if (y < 0) {
    break
  } else {
    x[i] <- y
  }
}
print(x)
[1] 0.9773483 1.7917295 1.2907964 2.6436599 0.8576497 2.0094081 2.2106120
[8] 1.5479097 NA NA NA NA NA NA
[15] NA NA NA NA NA NA NA
[22] NA NA NA NA NA NA NA
[29] NA NA
print(y)
[1] -0.1423665
Notice that the vector does not get filled all the way. That is because the loop breaks as soon as a negative number is simulated.
next
Similar to the break statement, the next statement is used inside loops. It tells R to move on to the next iteration if a certain condition is met.
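A minimal sketch of a loop with a next statement follows. The data in `values`, the condition `v < 0`, and the vector `kept` are placeholders chosen for illustration.

```r
values <- c(4, 8, -1, 5)  # placeholder data
kept <- c()               # values that were not skipped

for (v in values) {
  if (v < 0) {
    next  # skip the rest of this iteration and move on
  }
  kept <- c(kept, v)
}
print(kept)
```

Only the negative value is skipped; the loop still reaches the final 5.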
The main difference here is that a next
statement is used instead of a break
statement.
Going back to simulating positive numbers, we will use the same setup but change it to a next
statement.
x <- rep(NA, length = 30)

for (i in seq_along(x)){
  y <- rnorm(1, 1)
  if (y < 0) {
    next
  } else {
    x[i] <- y
  }
}
print(x)
[1] 0.91270459 3.08197264 0.72089222 0.18219414 0.23886831 1.36808335
[7] 3.40865796 0.27035560 NA NA 1.56910622 1.27601839
[13] 1.93305823 0.77069335 1.41109492 1.97015699 1.51544717 1.24035773
[19] NA NA 0.05664593 1.96861889 1.14983838 1.10942886
[25] 1.16644878 0.29784533 0.48227478 1.51119269 1.30830747 1.39608537
As you can see, the vector contains missing values. These correspond to the iterations in which a negative number was simulated.
while loop
The last loop that we will discuss is a while loop. A while loop keeps a task running as long as a certain condition holds. To construct a while loop, we use the while statement with a condition attached to it. In general, a while loop will have the following format:
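A minimal sketch of that format follows. The variable `count` and the cutoff of 3 are placeholders chosen for illustration.

```r
count <- 0  # placeholder state that the condition depends on

while (count < 3) {
  # the task goes here; then update whatever the condition checks
  count <- count + 1
}
print(count)
```

The condition is checked before every iteration, so the loop ends as soon as `count` reaches 3.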
Above, we see that the while statement is followed by a condition. The loop then completes its task and updates the condition. If the condition yields a FALSE value, the loop stops. Otherwise, it continues.
while loops
To implement a basic while loop, we will revisit the previous example of simulating positive numbers. We want to keep simulating from \(N(0,1)\) until we have 30 positive values. Here, the loop should continue while we have fewer than 30 numbers. Therefore we can use the following code to simulate the values:
x <- c()
size <- 0
while (size < 30){
  y <- rnorm(1)
  if (y > 0) {
    x <- c(x, y)
  }
  size <- length(x)
}
print(size)
[1] 30
print(x)
[1] 0.005245146 0.280625589 0.526876606 0.822030249 2.246205824 1.952491935
[7] 0.670100699 2.311234135 0.688772123 0.320199750 1.397227140 0.830938390
[13] 0.178526953 0.478543903 0.329451783 0.170460111 0.838914598 0.532007459
[19] 1.308559454 1.807544365 0.102020257 0.556702144 0.914246544 1.661145724
[25] 0.128944880 0.479945924 0.034947857 0.153277439 0.011630151 0.856472034
Notice that we do not use an else
statement. This is because we do not need R to complete a task if the condition fails.
while loops
With while loops, we must be wary of potential infinite loops. These occur when the condition never yields a FALSE value. Therefore, R never stops the loop because it is never told to.
For example, let’s say we are interested in whether \(y = \sin(x)\) converges to a certain value as x increases. As you know, it does not converge; however, we can construct a while loop to check:
x <- 1
diff <- 1
while (diff > 1e-20) {
  old_x <- x
  x <- x + 1
  diff <- abs(sin(x) - sin(old_x))
}
print(x)
print(diff)
The loop above continues as long as the absolute difference between sequential values is larger than \(10^{-20}\). As you may know, the absolute difference will never become that small. Therefore, the loop will continue without stopping.
To prevent an infinite while loop, we can add a counter to the condition. The counter condition must also be true for the loop to continue, so we can arbitrarily stop the loop after a certain number of iterations. We just need to make sure to add one to the counter on every iteration. Below is the code that adds a counter to the while loop:
x <- 1
counter <- 0
diff <- 1
while (diff > 1e-20 && counter < 10^3) {
  old_x <- x
  x <- x + 1
  diff <- abs(sin(x) - sin(old_x))
  counter <- counter + 1
}
print(x)
[1] 1001
print(diff)
[1] 0.09311106
print(counter)
[1] 1000