Lists and Advanced Indexing
Overview
Class Date: 9/10/2024 -- In Class
Teaching: 90 min
Exercises: 30 min
Questions
What are lists and what is their relationship to vectors and data frames?
How can we leverage indexing for more advanced data extraction?
Objectives
Understand the structure and properties of lists.
Be able to apply several techniques for extracting targeted data from data frames.
Combine different methods for accessing data with the assignment operator to update subsets of data.
In Class
Lists
Lists in R act as generalized containers. A list is a special type of vector, but unlike atomic vectors, the contents of a list are not restricted to a single mode and can encompass any mixture of data types. Lists are sometimes called “generic vectors”, because the elements of a list can be any type of R object, even lists containing further lists. This property makes them fundamentally different from atomic vectors.
Create lists using list():
x <- list(1, "a", TRUE, 1+4i)
x
[[1]]
[1] 1
[[2]]
[1] "a"
[[3]]
[1] TRUE
[[4]]
[1] 1+4i
A list does not print to the console like a vector. Instead, each element of the list starts on a new line. The reason is that any object can be placed as an element in a list, including larger objects like data frames. Placing each element on a separate line allows room for these larger objects to be displayed.
Note that an empty list of the required length can also be created using vector().
x <- vector("list", length = 5) # empty list
x
[[1]]
NULL
[[2]]
NULL
[[3]]
NULL
[[4]]
NULL
[[5]]
NULL
length(x)
[1] 5
Coerce other objects (like vectors) to lists using as.list():
x <- 1:10
x <- as.list(x)
class(x)
[1] "list"
length(x)
[1] 10
Indexing lists
Indexing works a bit differently for lists. The content of an element of a list can be retrieved by using double square brackets [[n]], as opposed to the single square brackets [n] used for vectors and matrices.
x[[1]]
[1] 1
Single [] indexing still works, but returns a list with the indexed elements. Double [[]] indexing returns the object inside the requested element itself. With that in mind, consider the following exercise:
Examining Lists
- What is the class of x[1]?
- What is the class of x[[1]]?
Solution
class(x[1])
[1] "list"
class(x[[1]])
[1] "integer"
Another consequence of the difference between [] and [[]] indexing is that you cannot request that R return more than one element from a list using [[]]:
x[[c(1,2)]]
Error in x[[c(1, 2)]]: subscript out of bounds
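Because [[]] returns the single object inside the requested list element, it cannot return several separate objects (possibly with different classes) at once. When given a vector of indexes, [[]] instead indexes recursively: x[[c(1, 2)]] is interpreted as x[[1]][[2]], the second element of the first element, which does not exist here. If you want to return a list that is a subset of the current list, use []: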
y <- x[c(1,2)]
y
[[1]]
[1] 1
[[2]]
[1] 2
class(y)
[1] "list"
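To make the difference between the two bracket types concrete, here is a small sketch using a hypothetical nested list (nested is an illustrative name, not part of the lesson data). It also shows what a vector of indexes inside [[]] does: R applies the indexes one after another, drilling down into the nested structure.

```r
# A hypothetical list whose second element is itself a list
nested <- list(1:3, list("a", "b"))

# Single brackets keep the list wrapper; double brackets unwrap the element
class(nested[1])      # "list"
class(nested[[1]])    # "integer"

# A vector inside [[ ]] is applied recursively, one level at a time
nested[[c(2, 1)]]                                # "a"
identical(nested[[c(2, 1)]], nested[[2]][[1]])   # TRUE
```

In other words, nested[[c(2, 1)]] is shorthand for nested[[2]][[1]], not a request for elements 2 and 1.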
Elements of a list can be named (i.e. lists can have the names attribute):
xlist <- list(a = "Karthik Ram", b = 1:10, data = head(iris))
xlist
$a
[1] "Karthik Ram"
$b
[1] 1 2 3 4 5 6 7 8 9 10
$data
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
names(xlist)
[1] "a" "b" "data"
attributes(xlist)
$names
[1] "a" "b" "data"
You can use the $ operator to directly refer to list elements using their name. These are equivalent requests:
xlist[[1]]
[1] "Karthik Ram"
xlist$a
[1] "Karthik Ram"
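Names can also be used inside double brackets, which is useful when the name is stored in a variable. The sketch below assumes a small illustrative list (person and key are hypothetical names, not part of the lesson data). Note that $ takes the name literally, so it cannot look up a name held in a variable.

```r
# Hypothetical named list for illustration
person <- list(name = "Ada", scores = c(90, 85))

person[["name"]]   # same result as person$name
key <- "scores"
person[[key]]      # looks up the element named "scores"
person$key         # NULL: there is no element literally named "key"
```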
Examining Named Lists
- What is the length of the xlist object?
- What is its structure?
Solution
length(xlist)
[1] 3
str(xlist)
List of 3
$ a : chr "Karthik Ram"
$ b : int [1:10] 1 2 3 4 5 6 7 8 9 10
$ data:'data.frame': 6 obs. of 5 variables:
..$ Sepal.Length: num [1:6] 5.1 4.9 4.7 4.6 5 5.4
..$ Sepal.Width : num [1:6] 3.5 3 3.2 3.1 3.6 3.9
..$ Petal.Length: num [1:6] 1.4 1.4 1.3 1.5 1.4 1.7
..$ Petal.Width : num [1:6] 0.2 0.2 0.2 0.2 0.2 0.4
..$ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1
Lists of lists!
An element of a list can itself be a vector (or even another list!):
x <- 1:10
y <- c(T, F, T, T) # "T" can be used in place of "TRUE"; "F" can be used in place of "FALSE"
z <- list(1, "a", TRUE, 1+4i)
my.list <- list(x, y, z)
my.list
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10
[[2]]
[1] TRUE FALSE TRUE TRUE
[[3]]
[[3]][[1]]
[1] 1
[[3]][[2]]
[1] "a"
[[3]][[3]]
[1] TRUE
[[3]][[4]]
[1] 1+4i
my.list[[1]]
[1] 1 2 3 4 5 6 7 8 9 10
class(my.list[[1]])
[1] "integer"
class(my.list[[2]])
[1] "logical"
my.list <- list(x = x, y = y, z = z) # use the `=` to name your list elements
my.list$x
[1] 1 2 3 4 5 6 7 8 9 10
Lists and functions
Lists can be extremely useful when used in functions. One property of functions in R is that they are able to return only a single object. To get around this restriction, you can “staple” together lots of different types of information into a single list object that a function can return.
A basic example is the function t.test(), which performs a Student’s t-test between two data samples (more on this in coming lessons). For now, let’s run a basic t-test between petal length and width in the iris data set in order to examine the structure of the output.
l.vs.w <- t.test(iris$Petal.Length, iris$Petal.Width)
l.vs.w
Welch Two Sample t-test
data: iris$Petal.Length and iris$Petal.Width
t = 16.297, df = 202.69, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
2.249107 2.868227
sample estimates:
mean of x mean of y
3.758000 1.199333
The t-test output returns more than just a p-value; it returns a complex set of information about the parameters and statistics of the performed test, for instance:
- test alternatives: two-sided (character)
- specific type of t-test: Welch Two Sample t-test (character)
- the variables compared: “iris$Petal.Length and iris$Petal.Width” (character)
- t statistic value (numeric)
- number of degrees of freedom (df; numeric)
- confidence interval (numeric vector)
R can return all of this information in a single output because each value is contained as a separate element of a list:
typeof(l.vs.w)
[1] "list"
str(l.vs.w)
List of 10
$ statistic : Named num 16.3
..- attr(*, "names")= chr "t"
$ parameter : Named num 203
..- attr(*, "names")= chr "df"
$ p.value : num 1.03e-38
$ conf.int : num [1:2] 2.25 2.87
..- attr(*, "conf.level")= num 0.95
$ estimate : Named num [1:2] 3.76 1.2
..- attr(*, "names")= chr [1:2] "mean of x" "mean of y"
$ null.value : Named num 0
..- attr(*, "names")= chr "difference in means"
$ stderr : num 0.157
$ alternative: chr "two.sided"
$ method : chr "Welch Two Sample t-test"
$ data.name : chr "iris$Petal.Length and iris$Petal.Width"
- attr(*, "class")= chr "htest"
Because this data is formatted as a list, we can extract specific components for further use (e.g. to print onto a chart, or to separate significant comparisons from a computed set of t-tests) using either [[]] or $. Let’s pull the p-value for instance:
l.vs.w$p.value
[1] 1.026755e-38
l.vs.w[[3]]
[1] 1.026755e-38
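The same approach works for your own functions. Below is a minimal sketch of a function that returns several values at once by bundling them in a named list; summarize_values is a hypothetical helper written for illustration, not a function from base R.

```r
# Hypothetical helper: return several summaries in one named list
summarize_values <- function(v) {
  list(
    n     = length(v),   # number of observations
    mean  = mean(v),     # arithmetic mean
    range = range(v)     # smallest and largest value
  )
}

petal.summary <- summarize_values(iris$Petal.Length)
petal.summary$n            # extract one component by name
petal.summary[["range"]]   # equivalent double-bracket form
```

Each component keeps its own type and length, which is what lets a single return value carry a count, a mean, and a two-element range together.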
Data frames are specialized lists
At its heart, the data frame is a special type of list in which every element is a vector with the same length as every other element. In other words, a data frame is a “rectangular” or “two-dimensional” list.
dat <- data.frame(id = letters[1:10], x = 1:10, y = 11:20)
dat
id x y
1 a 1 11
2 b 2 12
3 c 3 13
4 d 4 14
5 e 5 15
6 f 6 16
7 g 7 17
8 h 8 18
9 i 9 19
10 j 10 20
We can see that R considers a data frame a list using the is.list() function, and by examining the underlying data type:
is.list(dat)
[1] TRUE
is.data.frame(dat) # "data.frame" is a sub-class of "list"
[1] TRUE
class(dat)
[1] "data.frame"
typeof(dat)
[1] "list"
The formal definition of a data frame as a list is why we can use [[]] and $ to rapidly interact with data in a data frame.
dat$id
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
dat[[2]]
[1] 1 2 3 4 5 6 7 8 9 10
Meanwhile, the restriction that all elements (aka columns) must have the same length allows us to treat data frames as two-dimensional structures and use the [x,y] indexing format similar to matrices:
dat[3,2]
[1] 3
dat[2:8,2:3]
x y
2 2 12
3 3 13
4 4 14
5 5 15
6 6 16
7 7 17
8 8 18
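Because a data frame is a list of columns, functions that operate on lists treat each column as one element. A brief sketch, rebuilding the dat data frame from above so that the block runs on its own:

```r
dat <- data.frame(id = letters[1:10], x = 1:10, y = 11:20)

length(dat)          # 3: the number of list elements, i.e. columns
names(dat)           # "id" "x" "y"
sapply(dat, class)   # class of each column, one list element at a time

dat[["x"]][3]        # list-style access, then vector indexing
dat[3, "x"]          # matrix-style access to the same value
```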
Advanced indexing
During the In Class component of this lesson, we examined the three primary modes of indexing data frames and other R objects:
- By index
- By name
- By logical vector
Here we will expand on these basic concepts to demonstrate different ways to extract useful information from datasets represented in data frame objects.
Logical indexing – additional detail
We noted In Class that logical indexing is one of the more powerful ways to index data. There are many operators that expand our logical toolbox.
Relational operators
The most common application of logical indexing is to use the information contained within one or more variables (i.e. columns) of a data frame to extract a subset of data that has a desired set of relational properties. To do so, we first need to generate a logical vector based on the information in the variable(s) of interest, which we will then use to index the data frame.
Logical vectors can be created using relational operators:
- < = less than
- > = greater than
- <= = less than or equal to
- >= = greater than or equal to
- == = exactly equal to
- != = not equal to
- %in% = is present in (used to ask if the value(s) on the left is present in the vector/matrix on the right)
A few single variable examples:
1 == 1
[1] TRUE
1 == 2
[1] FALSE
1 != 1
[1] FALSE
4 > 7
[1] FALSE
18 %in% 1:10
[1] FALSE
18 %in% 15:25
[1] TRUE
We can use these operators to query entire vectors and generate logical vectors:
# creating logical vectors from numeric data
x <- c(1, 2, 3, 11, 12, 13)
x < 10
[1] TRUE TRUE TRUE FALSE FALSE FALSE
x %in% 1:10
[1] TRUE TRUE TRUE FALSE FALSE FALSE
Each comparison generates a logical vector as output with the same number of elements as the vector on the left side of the relational operator, evaluating each input element relative to the right side of the operator. We can use logical vectors to select data from a data frame.
index <- iris$Species == 'setosa'
index
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[13] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[25] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[37] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[49] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[61] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[85] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[97] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[121] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[145] FALSE FALSE FALSE FALSE FALSE FALSE
iris[index,]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5.0 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
11 5.4 3.7 1.5 0.2 setosa
12 4.8 3.4 1.6 0.2 setosa
13 4.8 3.0 1.4 0.1 setosa
14 4.3 3.0 1.1 0.1 setosa
15 5.8 4.0 1.2 0.2 setosa
16 5.7 4.4 1.5 0.4 setosa
17 5.4 3.9 1.3 0.4 setosa
18 5.1 3.5 1.4 0.3 setosa
19 5.7 3.8 1.7 0.3 setosa
20 5.1 3.8 1.5 0.3 setosa
21 5.4 3.4 1.7 0.2 setosa
22 5.1 3.7 1.5 0.4 setosa
23 4.6 3.6 1.0 0.2 setosa
24 5.1 3.3 1.7 0.5 setosa
25 4.8 3.4 1.9 0.2 setosa
26 5.0 3.0 1.6 0.2 setosa
27 5.0 3.4 1.6 0.4 setosa
28 5.2 3.5 1.5 0.2 setosa
29 5.2 3.4 1.4 0.2 setosa
30 4.7 3.2 1.6 0.2 setosa
31 4.8 3.1 1.6 0.2 setosa
32 5.4 3.4 1.5 0.4 setosa
33 5.2 4.1 1.5 0.1 setosa
34 5.5 4.2 1.4 0.2 setosa
35 4.9 3.1 1.5 0.2 setosa
36 5.0 3.2 1.2 0.2 setosa
37 5.5 3.5 1.3 0.2 setosa
38 4.9 3.6 1.4 0.1 setosa
39 4.4 3.0 1.3 0.2 setosa
40 5.1 3.4 1.5 0.2 setosa
41 5.0 3.5 1.3 0.3 setosa
42 4.5 2.3 1.3 0.3 setosa
43 4.4 3.2 1.3 0.2 setosa
44 5.0 3.5 1.6 0.6 setosa
45 5.1 3.8 1.9 0.4 setosa
46 4.8 3.0 1.4 0.3 setosa
47 5.1 3.8 1.6 0.2 setosa
48 4.6 3.2 1.4 0.2 setosa
49 5.3 3.7 1.5 0.2 setosa
50 5.0 3.3 1.4 0.2 setosa
Often this operation is written as one line of code:
iris[iris$Species == 'setosa', ]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5.0 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
11 5.4 3.7 1.5 0.2 setosa
12 4.8 3.4 1.6 0.2 setosa
13 4.8 3.0 1.4 0.1 setosa
14 4.3 3.0 1.1 0.1 setosa
15 5.8 4.0 1.2 0.2 setosa
16 5.7 4.4 1.5 0.4 setosa
17 5.4 3.9 1.3 0.4 setosa
18 5.1 3.5 1.4 0.3 setosa
19 5.7 3.8 1.7 0.3 setosa
20 5.1 3.8 1.5 0.3 setosa
21 5.4 3.4 1.7 0.2 setosa
22 5.1 3.7 1.5 0.4 setosa
23 4.6 3.6 1.0 0.2 setosa
24 5.1 3.3 1.7 0.5 setosa
25 4.8 3.4 1.9 0.2 setosa
26 5.0 3.0 1.6 0.2 setosa
27 5.0 3.4 1.6 0.4 setosa
28 5.2 3.5 1.5 0.2 setosa
29 5.2 3.4 1.4 0.2 setosa
30 4.7 3.2 1.6 0.2 setosa
31 4.8 3.1 1.6 0.2 setosa
32 5.4 3.4 1.5 0.4 setosa
33 5.2 4.1 1.5 0.1 setosa
34 5.5 4.2 1.4 0.2 setosa
35 4.9 3.1 1.5 0.2 setosa
36 5.0 3.2 1.2 0.2 setosa
37 5.5 3.5 1.3 0.2 setosa
38 4.9 3.6 1.4 0.1 setosa
39 4.4 3.0 1.3 0.2 setosa
40 5.1 3.4 1.5 0.2 setosa
41 5.0 3.5 1.3 0.3 setosa
42 4.5 2.3 1.3 0.3 setosa
43 4.4 3.2 1.3 0.2 setosa
44 5.0 3.5 1.6 0.6 setosa
45 5.1 3.8 1.9 0.4 setosa
46 4.8 3.0 1.4 0.3 setosa
47 5.1 3.8 1.6 0.2 setosa
48 4.6 3.2 1.4 0.2 setosa
49 5.3 3.7 1.5 0.2 setosa
50 5.0 3.3 1.4 0.2 setosa
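The same pattern works with any relational operator. As a sketch, %in% selects rows matching any of several values, and sum() counts the TRUE values in a logical vector (keep and two.species are illustrative names):

```r
# TRUE for rows whose species is one of the two listed
keep <- iris$Species %in% c("setosa", "virginica")

sum(keep)                     # number of TRUE values: 100
two.species <- iris[keep, ]   # keep only the matching rows
nrow(two.species)             # 100 rows selected
```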
Using logical indices
Create a new data frame that is the subset of iris with sepal length greater than or equal to 5.0.
Solution
iris.new <- iris[iris$Sepal.Length >= 5,]
Logical operators
In addition to the relational comparisons, there is a set of logical operators that combine logical values and output a new logical value:
- ! = NOT (changes TRUE to FALSE and vice versa)
- & = element-wise AND (both are true; outputs vector for vector input comparing elements)
- && = single-value AND (both are true; expects inputs of length one, and since R 4.3.0 raises an error for longer vectors, where earlier versions used only the first element)
- | = element-wise OR (one or both are true; outputs vector for vector input comparing elements)
- || = single-value OR (one or both are true; same length-one restriction as &&)
- xor(x,y) = element-wise exclusive OR (either is true, but not both; outputs vector for vector input comparing elements)
truth <- c(TRUE, FALSE, TRUE, TRUE)
lie <- !truth
truth
[1] TRUE FALSE TRUE TRUE
lie
[1] FALSE TRUE FALSE FALSE
T & T
[1] TRUE
T & F
[1] FALSE
T | F
[1] TRUE
F | F
[1] FALSE
c(T,F,F) & c(T,T,F)
[1] TRUE FALSE FALSE
c(T,F,F) && c(T,T,F)
Error in c(T, F, F) && c(T, T, F): 'length = 3' in coercion to 'logical(1)'
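In practice, && and || are mostly used in if statements, where the condition must be a single TRUE or FALSE. When you start from a logical vector, any() and all() reduce it to one value first. A minimal sketch (checks is an illustrative name):

```r
checks <- c(TRUE, FALSE, FALSE)

any(checks)   # TRUE: at least one element is TRUE
all(checks)   # FALSE: not every element is TRUE

# && is appropriate here because both sides are single values
if (any(checks) && length(checks) == 3) {
  message("at least one of the three checks passed")
}
```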
Logical operators allow us to combine multiple relational operators to extract subsets of data contained within a data frame with multiple selection criteria, for example the iris entries with extreme values for sepal length:
extremes <- iris[(iris$Sepal.Length < 4.6) | (iris$Sepal.Length > 7.3), ]
extremes
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
9 4.4 2.9 1.4 0.2 setosa
14 4.3 3.0 1.1 0.1 setosa
39 4.4 3.0 1.3 0.2 setosa
42 4.5 2.3 1.3 0.3 setosa
43 4.4 3.2 1.3 0.2 setosa
106 7.6 3.0 6.6 2.1 virginica
118 7.7 3.8 6.7 2.2 virginica
119 7.7 2.6 6.9 2.3 virginica
123 7.7 2.8 6.7 2.0 virginica
131 7.4 2.8 6.1 1.9 virginica
132 7.9 3.8 6.4 2.0 virginica
136 7.7 3.0 6.1 2.3 virginica
Note the use of parentheses () to break up each logical operation. These are not always strictly necessary for the code to run properly, but they are generally a good idea because they make the intended order of operations explicit in complex statements.
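To see why the parentheses matter, recall that R evaluates & before |, much as multiplication is evaluated before addition. The sketch below shows two expressions that differ only in their grouping:

```r
# & binds more tightly than |, so this is read as TRUE | (FALSE & FALSE)
TRUE | FALSE & FALSE      # TRUE

# Explicit grouping changes the result
(TRUE | FALSE) & FALSE    # FALSE
```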
Combined indexing
One way to leverage indexing is to combine different indexing categories. For instance, column names can be combined with direct indexing to return a data frame subset only containing specific columns:
length.only <- iris[,c("Petal.Length", "Sepal.Length", "Species")]
head(length.only)
Petal.Length Sepal.Length Species
1 1.4 5.1 setosa
2 1.4 4.9 setosa
3 1.3 4.7 setosa
4 1.5 4.6 setosa
5 1.4 5.0 setosa
6 1.7 5.4 setosa
On occasion, it is useful to convert a logical vector into a set of numbered indexes, which we can accomplish using the which() function:
x <- c(T, T, F, F, T, F)
which(x)
[1] 1 2 5
This simple example illustrates that which() returns a vector containing the numbered index of each TRUE in the vector x. We can apply this to our iris dataset:
index <- which(iris$Sepal.Length > 5.0)
index
[1] 1 6 11 15 16 17 18 19 20 21 22 24 28 29 32 33 34 37
[19] 40 45 47 49 51 52 53 54 55 56 57 59 60 62 63 64 65 66
[37] 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84
[55] 85 86 87 88 89 90 91 92 93 95 96 97 98 99 100 101 102 103
[73] 104 105 106 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122
[91] 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140
[109] 141 142 143 144 145 146 147 148 149 150
We now have a vector of the index positions of rows in the iris data frame for the flowers with sepal lengths greater than 5.0. We can now use this information to, say, extract all information about these flowers:
long.sepal <- iris[index,]
head(long.sepal)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
11 5.4 3.7 1.5 0.2 setosa
15 5.8 4.0 1.2 0.2 setosa
16 5.7 4.4 1.5 0.4 setosa
17 5.4 3.9 1.3 0.4 setosa
Or just information on a specific variable:
iris$Sepal.Width[index]
[1] 3.5 3.9 3.7 4.0 4.4 3.9 3.5 3.8 3.8 3.4 3.7 3.3 3.5 3.4 3.4 4.1 4.2 3.5
[19] 3.4 3.8 3.8 3.7 3.2 3.2 3.1 2.3 2.8 2.8 3.3 2.9 2.7 3.0 2.2 2.9 2.9 3.1
[37] 3.0 2.7 2.2 2.5 3.2 2.8 2.5 2.8 2.9 3.0 2.8 3.0 2.9 2.6 2.4 2.4 2.7 2.7
[55] 3.0 3.4 3.1 2.3 3.0 2.5 2.6 3.0 2.6 2.7 3.0 2.9 2.9 2.5 2.8 3.3 2.7 3.0
[73] 2.9 3.0 3.0 2.9 2.5 3.6 3.2 2.7 3.0 2.5 2.8 3.2 3.0 3.8 2.6 2.2 3.2 2.8
[91] 2.8 2.7 3.3 3.2 2.8 3.0 2.8 3.0 2.8 3.8 2.8 2.8 2.6 3.0 3.4 3.1 3.0 3.1
[109] 3.1 3.1 2.7 3.2 3.3 3.0 2.5 3.0 3.4 3.0
Using which() in this way is an efficient way to store information on a relevant data subset that you will be interacting with more than once, without creating a whole separate data frame.
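One practical difference between logical and numbered indexes appears when the data contain missing values. A comparison with NA yields NA, and indexing with a logical NA returns an NA element, whereas which() reports only the positions that are TRUE. A small sketch with a made-up vector v:

```r
v <- c(3, NA, 8)

v > 5             # FALSE NA TRUE
v[v > 5]          # NA 8: the NA in the index produces an NA in the result
which(v > 5)      # 3: only the position that is TRUE
v[which(v > 5)]   # 8
```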
Updating subsets
We can use the assignment operator <- to directly update a subset of a vector or data frame.
x <- 1:20
x
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
x[x > 15] <- 100
x
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 100 100 100 100
[20] 100
Say you discover that the values collected for the petal length of the setosa species of iris were incorrectly recorded, and you want to replace these values with NA to ensure that they are not used in future work.
# copy the data frame (never modify your raw data!)
iris.corrected <- iris
iris.corrected$Petal.Length[iris.corrected$Species == "setosa"] <- NA
iris.corrected
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 NA 0.2 setosa
2 4.9 3.0 NA 0.2 setosa
3 4.7 3.2 NA 0.2 setosa
4 4.6 3.1 NA 0.2 setosa
5 5.0 3.6 NA 0.2 setosa
6 5.4 3.9 NA 0.4 setosa
7 4.6 3.4 NA 0.3 setosa
8 5.0 3.4 NA 0.2 setosa
9 4.4 2.9 NA 0.2 setosa
10 4.9 3.1 NA 0.1 setosa
11 5.4 3.7 NA 0.2 setosa
12 4.8 3.4 NA 0.2 setosa
13 4.8 3.0 NA 0.1 setosa
14 4.3 3.0 NA 0.1 setosa
15 5.8 4.0 NA 0.2 setosa
16 5.7 4.4 NA 0.4 setosa
17 5.4 3.9 NA 0.4 setosa
18 5.1 3.5 NA 0.3 setosa
19 5.7 3.8 NA 0.3 setosa
20 5.1 3.8 NA 0.3 setosa
21 5.4 3.4 NA 0.2 setosa
22 5.1 3.7 NA 0.4 setosa
23 4.6 3.6 NA 0.2 setosa
24 5.1 3.3 NA 0.5 setosa
25 4.8 3.4 NA 0.2 setosa
26 5.0 3.0 NA 0.2 setosa
27 5.0 3.4 NA 0.4 setosa
28 5.2 3.5 NA 0.2 setosa
29 5.2 3.4 NA 0.2 setosa
30 4.7 3.2 NA 0.2 setosa
31 4.8 3.1 NA 0.2 setosa
32 5.4 3.4 NA 0.4 setosa
33 5.2 4.1 NA 0.1 setosa
34 5.5 4.2 NA 0.2 setosa
35 4.9 3.1 NA 0.2 setosa
36 5.0 3.2 NA 0.2 setosa
37 5.5 3.5 NA 0.2 setosa
38 4.9 3.6 NA 0.1 setosa
39 4.4 3.0 NA 0.2 setosa
40 5.1 3.4 NA 0.2 setosa
41 5.0 3.5 NA 0.3 setosa
42 4.5 2.3 NA 0.3 setosa
43 4.4 3.2 NA 0.2 setosa
44 5.0 3.5 NA 0.6 setosa
45 5.1 3.8 NA 0.4 setosa
46 4.8 3.0 NA 0.3 setosa
47 5.1 3.8 NA 0.2 setosa
48 4.6 3.2 NA 0.2 setosa
49 5.3 3.7 NA 0.2 setosa
50 5.0 3.3 NA 0.2 setosa
51 7.0 3.2 4.7 1.4 versicolor
52 6.4 3.2 4.5 1.5 versicolor
53 6.9 3.1 4.9 1.5 versicolor
54 5.5 2.3 4.0 1.3 versicolor
55 6.5 2.8 4.6 1.5 versicolor
56 5.7 2.8 4.5 1.3 versicolor
57 6.3 3.3 4.7 1.6 versicolor
58 4.9 2.4 3.3 1.0 versicolor
59 6.6 2.9 4.6 1.3 versicolor
60 5.2 2.7 3.9 1.4 versicolor
61 5.0 2.0 3.5 1.0 versicolor
62 5.9 3.0 4.2 1.5 versicolor
63 6.0 2.2 4.0 1.0 versicolor
64 6.1 2.9 4.7 1.4 versicolor
65 5.6 2.9 3.6 1.3 versicolor
66 6.7 3.1 4.4 1.4 versicolor
67 5.6 3.0 4.5 1.5 versicolor
68 5.8 2.7 4.1 1.0 versicolor
69 6.2 2.2 4.5 1.5 versicolor
70 5.6 2.5 3.9 1.1 versicolor
71 5.9 3.2 4.8 1.8 versicolor
72 6.1 2.8 4.0 1.3 versicolor
73 6.3 2.5 4.9 1.5 versicolor
74 6.1 2.8 4.7 1.2 versicolor
75 6.4 2.9 4.3 1.3 versicolor
76 6.6 3.0 4.4 1.4 versicolor
77 6.8 2.8 4.8 1.4 versicolor
78 6.7 3.0 5.0 1.7 versicolor
79 6.0 2.9 4.5 1.5 versicolor
80 5.7 2.6 3.5 1.0 versicolor
81 5.5 2.4 3.8 1.1 versicolor
82 5.5 2.4 3.7 1.0 versicolor
83 5.8 2.7 3.9 1.2 versicolor
84 6.0 2.7 5.1 1.6 versicolor
85 5.4 3.0 4.5 1.5 versicolor
86 6.0 3.4 4.5 1.6 versicolor
87 6.7 3.1 4.7 1.5 versicolor
88 6.3 2.3 4.4 1.3 versicolor
89 5.6 3.0 4.1 1.3 versicolor
90 5.5 2.5 4.0 1.3 versicolor
91 5.5 2.6 4.4 1.2 versicolor
92 6.1 3.0 4.6 1.4 versicolor
93 5.8 2.6 4.0 1.2 versicolor
94 5.0 2.3 3.3 1.0 versicolor
95 5.6 2.7 4.2 1.3 versicolor
96 5.7 3.0 4.2 1.2 versicolor
97 5.7 2.9 4.2 1.3 versicolor
98 6.2 2.9 4.3 1.3 versicolor
99 5.1 2.5 3.0 1.1 versicolor
100 5.7 2.8 4.1 1.3 versicolor
101 6.3 3.3 6.0 2.5 virginica
102 5.8 2.7 5.1 1.9 virginica
103 7.1 3.0 5.9 2.1 virginica
104 6.3 2.9 5.6 1.8 virginica
105 6.5 3.0 5.8 2.2 virginica
106 7.6 3.0 6.6 2.1 virginica
107 4.9 2.5 4.5 1.7 virginica
108 7.3 2.9 6.3 1.8 virginica
109 6.7 2.5 5.8 1.8 virginica
110 7.2 3.6 6.1 2.5 virginica
111 6.5 3.2 5.1 2.0 virginica
112 6.4 2.7 5.3 1.9 virginica
113 6.8 3.0 5.5 2.1 virginica
114 5.7 2.5 5.0 2.0 virginica
115 5.8 2.8 5.1 2.4 virginica
116 6.4 3.2 5.3 2.3 virginica
117 6.5 3.0 5.5 1.8 virginica
118 7.7 3.8 6.7 2.2 virginica
119 7.7 2.6 6.9 2.3 virginica
120 6.0 2.2 5.0 1.5 virginica
121 6.9 3.2 5.7 2.3 virginica
122 5.6 2.8 4.9 2.0 virginica
123 7.7 2.8 6.7 2.0 virginica
124 6.3 2.7 4.9 1.8 virginica
125 6.7 3.3 5.7 2.1 virginica
126 7.2 3.2 6.0 1.8 virginica
127 6.2 2.8 4.8 1.8 virginica
128 6.1 3.0 4.9 1.8 virginica
129 6.4 2.8 5.6 2.1 virginica
130 7.2 3.0 5.8 1.6 virginica
131 7.4 2.8 6.1 1.9 virginica
132 7.9 3.8 6.4 2.0 virginica
133 6.4 2.8 5.6 2.2 virginica
134 6.3 2.8 5.1 1.5 virginica
135 6.1 2.6 5.6 1.4 virginica
136 7.7 3.0 6.1 2.3 virginica
137 6.3 3.4 5.6 2.4 virginica
138 6.4 3.1 5.5 1.8 virginica
139 6.0 3.0 4.8 1.8 virginica
140 6.9 3.1 5.4 2.1 virginica
141 6.7 3.1 5.6 2.4 virginica
142 6.9 3.1 5.1 2.3 virginica
143 5.8 2.7 5.1 1.9 virginica
144 6.8 3.2 5.9 2.3 virginica
145 6.7 3.3 5.7 2.5 virginica
146 6.7 3.0 5.2 2.3 virginica
147 6.3 2.5 5.0 1.9 virginica
148 6.5 3.0 5.2 2.0 virginica
149 6.2 3.4 5.4 2.3 virginica
150 5.9 3.0 5.1 1.8 virginica
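After an update like this, it is worth confirming that only the intended values changed. A minimal sketch, repeating the replacement so that the block runs on its own:

```r
iris.corrected <- iris
iris.corrected$Petal.Length[iris.corrected$Species == "setosa"] <- NA

# Count the missing petal lengths: one per setosa row
sum(is.na(iris.corrected$Petal.Length))   # 50

# The raw data are untouched
sum(is.na(iris$Petal.Length))             # 0

# Later calculations must skip the NA values explicitly
mean(iris.corrected$Petal.Length, na.rm = TRUE)
```

Without na.rm = TRUE, mean() would return NA, which is a useful reminder that the flagged values are being excluded on purpose.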
Exercises
Combining logical and relational operators
Create a new data frame that is the subset of iris with sepal length greater than or equal to 5.0 for the setosa species.
Solution
iris.new <- iris[iris$Sepal.Length >= 5 & iris$Species == "setosa",]
iris.new
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
8 5.0 3.4 1.5 0.2 setosa
11 5.4 3.7 1.5 0.2 setosa
15 5.8 4.0 1.2 0.2 setosa
16 5.7 4.4 1.5 0.4 setosa
17 5.4 3.9 1.3 0.4 setosa
18 5.1 3.5 1.4 0.3 setosa
19 5.7 3.8 1.7 0.3 setosa
20 5.1 3.8 1.5 0.3 setosa
21 5.4 3.4 1.7 0.2 setosa
22 5.1 3.7 1.5 0.4 setosa
24 5.1 3.3 1.7 0.5 setosa
26 5.0 3.0 1.6 0.2 setosa
27 5.0 3.4 1.6 0.4 setosa
28 5.2 3.5 1.5 0.2 setosa
29 5.2 3.4 1.4 0.2 setosa
32 5.4 3.4 1.5 0.4 setosa
33 5.2 4.1 1.5 0.1 setosa
34 5.5 4.2 1.4 0.2 setosa
36 5.0 3.2 1.2 0.2 setosa
37 5.5 3.5 1.3 0.2 setosa
40 5.1 3.4 1.5 0.2 setosa
41 5.0 3.5 1.3 0.3 setosa
44 5.0 3.5 1.6 0.6 setosa
45 5.1 3.8 1.9 0.4 setosa
47 5.1 3.8 1.6 0.2 setosa
49 5.3 3.7 1.5 0.2 setosa
50 5.0 3.3 1.4 0.2 setosa
Subsetting using a vector or name
Use the colon operator to index the first five observations of just the sepal length and species from iris.
Solution
Two options:
iris[1:5, c(1,5)]
Sepal.Length Species
1 5.1 setosa
2 4.9 setosa
3 4.7 setosa
4 4.6 setosa
5 5.0 setosa
iris[1:5,c("Sepal.Length","Species")]
Sepal.Length Species
1 5.1 setosa
2 4.9 setosa
3 4.7 setosa
4 4.6 setosa
5 5.0 setosa
Subsetting with sequences
Use the colon operator to index just the data on sepal size from iris.
Solution
iris[, 1:2]
Sepal.Length Sepal.Width
1 5.1 3.5
2 4.9 3.0
3 4.7 3.2
4 4.6 3.1
5 5.0 3.6
6 5.4 3.9
7 4.6 3.4
8 5.0 3.4
9 4.4 2.9
10 4.9 3.1
11 5.4 3.7
12 4.8 3.4
13 4.8 3.0
14 4.3 3.0
15 5.8 4.0
16 5.7 4.4
17 5.4 3.9
18 5.1 3.5
19 5.7 3.8
20 5.1 3.8
21 5.4 3.4
22 5.1 3.7
23 4.6 3.6
24 5.1 3.3
25 4.8 3.4
26 5.0 3.0
27 5.0 3.4
28 5.2 3.5
29 5.2 3.4
30 4.7 3.2
31 4.8 3.1
32 5.4 3.4
33 5.2 4.1
34 5.5 4.2
35 4.9 3.1
36 5.0 3.2
37 5.5 3.5
38 4.9 3.6
39 4.4 3.0
40 5.1 3.4
41 5.0 3.5
42 4.5 2.3
43 4.4 3.2
44 5.0 3.5
45 5.1 3.8
46 4.8 3.0
47 5.1 3.8
48 4.6 3.2
49 5.3 3.7
50 5.0 3.3
51 7.0 3.2
52 6.4 3.2
53 6.9 3.1
54 5.5 2.3
55 6.5 2.8
56 5.7 2.8
57 6.3 3.3
58 4.9 2.4
59 6.6 2.9
60 5.2 2.7
61 5.0 2.0
62 5.9 3.0
63 6.0 2.2
64 6.1 2.9
65 5.6 2.9
66 6.7 3.1
67 5.6 3.0
68 5.8 2.7
69 6.2 2.2
70 5.6 2.5
71 5.9 3.2
72 6.1 2.8
73 6.3 2.5
74 6.1 2.8
75 6.4 2.9
76 6.6 3.0
77 6.8 2.8
78 6.7 3.0
79 6.0 2.9
80 5.7 2.6
81 5.5 2.4
82 5.5 2.4
83 5.8 2.7
84 6.0 2.7
85 5.4 3.0
86 6.0 3.4
87 6.7 3.1
88 6.3 2.3
89 5.6 3.0
90 5.5 2.5
91 5.5 2.6
92 6.1 3.0
93 5.8 2.6
94 5.0 2.3
95 5.6 2.7
96 5.7 3.0
97 5.7 2.9
98 6.2 2.9
99 5.1 2.5
100 5.7 2.8
101 6.3 3.3
102 5.8 2.7
103 7.1 3.0
104 6.3 2.9
105 6.5 3.0
106 7.6 3.0
107 4.9 2.5
108 7.3 2.9
109 6.7 2.5
110 7.2 3.6
111 6.5 3.2
112 6.4 2.7
113 6.8 3.0
114 5.7 2.5
115 5.8 2.8
116 6.4 3.2
117 6.5 3.0
118 7.7 3.8
119 7.7 2.6
120 6.0 2.2
121 6.9 3.2
122 5.6 2.8
123 7.7 2.8
124 6.3 2.7
125 6.7 3.3
126 7.2 3.2
127 6.2 2.8
128 6.1 3.0
129 6.4 2.8
130 7.2 3.0
131 7.4 2.8
132 7.9 3.8
133 6.4 2.8
134 6.3 2.8
135 6.1 2.6
136 7.7 3.0
137 6.3 3.4
138 6.4 3.1
139 6.0 3.0
140 6.9 3.1
141 6.7 3.1
142 6.9 3.1
143 5.8 2.7
144 6.8 3.2
145 6.7 3.3
146 6.7 3.0
147 6.3 2.5
148 6.5 3.0
149 6.2 3.4
150 5.9 3.0
iris[,c("Sepal.Length", "Sepal.Width")]
Sepal.Length Sepal.Width
1 5.1 3.5
2 4.9 3.0
3 4.7 3.2
4 4.6 3.1
5 5.0 3.6
6 5.4 3.9
7 4.6 3.4
8 5.0 3.4
9 4.4 2.9
10 4.9 3.1
11 5.4 3.7
12 4.8 3.4
13 4.8 3.0
14 4.3 3.0
15 5.8 4.0
16 5.7 4.4
17 5.4 3.9
18 5.1 3.5
19 5.7 3.8
20 5.1 3.8
21 5.4 3.4
22 5.1 3.7
23 4.6 3.6
24 5.1 3.3
25 4.8 3.4
26 5.0 3.0
27 5.0 3.4
28 5.2 3.5
29 5.2 3.4
30 4.7 3.2
31 4.8 3.1
32 5.4 3.4
33 5.2 4.1
34 5.5 4.2
35 4.9 3.1
36 5.0 3.2
37 5.5 3.5
38 4.9 3.6
39 4.4 3.0
40 5.1 3.4
41 5.0 3.5
42 4.5 2.3
43 4.4 3.2
44 5.0 3.5
45 5.1 3.8
46 4.8 3.0
47 5.1 3.8
48 4.6 3.2
49 5.3 3.7
50 5.0 3.3
51 7.0 3.2
52 6.4 3.2
53 6.9 3.1
54 5.5 2.3
55 6.5 2.8
56 5.7 2.8
57 6.3 3.3
58 4.9 2.4
59 6.6 2.9
60 5.2 2.7
61 5.0 2.0
62 5.9 3.0
63 6.0 2.2
64 6.1 2.9
65 5.6 2.9
66 6.7 3.1
67 5.6 3.0
68 5.8 2.7
69 6.2 2.2
70 5.6 2.5
71 5.9 3.2
72 6.1 2.8
73 6.3 2.5
74 6.1 2.8
75 6.4 2.9
76 6.6 3.0
77 6.8 2.8
78 6.7 3.0
79 6.0 2.9
80 5.7 2.6
81 5.5 2.4
82 5.5 2.4
83 5.8 2.7
84 6.0 2.7
85 5.4 3.0
86 6.0 3.4
87 6.7 3.1
88 6.3 2.3
89 5.6 3.0
90 5.5 2.5
91 5.5 2.6
92 6.1 3.0
93 5.8 2.6
94 5.0 2.3
95 5.6 2.7
96 5.7 3.0
97 5.7 2.9
98 6.2 2.9
99 5.1 2.5
100 5.7 2.8
101 6.3 3.3
102 5.8 2.7
103 7.1 3.0
104 6.3 2.9
105 6.5 3.0
106 7.6 3.0
107 4.9 2.5
108 7.3 2.9
109 6.7 2.5
110 7.2 3.6
111 6.5 3.2
112 6.4 2.7
113 6.8 3.0
114 5.7 2.5
115 5.8 2.8
116 6.4 3.2
117 6.5 3.0
118 7.7 3.8
119 7.7 2.6
120 6.0 2.2
121 6.9 3.2
122 5.6 2.8
123 7.7 2.8
124 6.3 2.7
125 6.7 3.3
126 7.2 3.2
127 6.2 2.8
128 6.1 3.0
129 6.4 2.8
130 7.2 3.0
131 7.4 2.8
132 7.9 3.8
133 6.4 2.8
134 6.3 2.8
135 6.1 2.6
136 7.7 3.0
137 6.3 3.4
138 6.4 3.1
139 6.0 3.0
140 6.9 3.1
141 6.7 3.1
142 6.9 3.1
143 5.8 2.7
144 6.8 3.2
145 6.7 3.3
146 6.7 3.0
147 6.3 2.5
148 6.5 3.0
149 6.2 3.4
150 5.9 3.0
Adding a new variable
We want to add a variable called “Petal.Color” to the iris data frame to record a new set of observations. Let’s first define a new data frame ‘iris.update’ (so as not to modify our original raw data).
iris.update <- iris
Now, to initialize the variable, add a new character column to your data frame populated with no values to indicate that we have not recorded any observations.
Solution
We have a couple of options:
1) Define the vector and append it to the data frame using cbind():
Petal.Color <- character(length = dim(iris.update)[1]) # use the dim function to figure out how long to make the new vector
iris.update <- cbind(iris.update, Petal.Color)
head(iris.update)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Petal.Color
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
2) Directly populate the new column while creating it:
iris.update$Petal.Color <- as.character("")
head(iris.update)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Petal.Color
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
There are also other ways to accomplish this task.
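For example, one alternative (a judgment call about what suits your analysis, not the lesson's official solution) is to fill the new column with NA_character_, the missing-value constant for character data. Unlike an empty string, NA is recognized by functions such as is.na(). The sketch uses a separate, hypothetical copy named iris.alt so that it does not interfere with iris.update:

```r
iris.alt <- iris

# Character column where every value is explicitly missing
iris.alt$Petal.Color <- NA_character_

class(iris.alt$Petal.Color)        # "character"
sum(is.na(iris.alt$Petal.Color))   # 150
```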
Updating a Subset of Values
Update the iris.update data frame by indicating that the “setosa” species had purple petals (without changing the values for the other species).
Solution
iris.update[iris.update$Species == "setosa", ]$Petal.Color <- "purple"
iris.update
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Petal.Color
1 5.1 3.5 1.4 0.2 setosa purple
2 4.9 3.0 1.4 0.2 setosa purple
3 4.7 3.2 1.3 0.2 setosa purple
4 4.6 3.1 1.5 0.2 setosa purple
5 5.0 3.6 1.4 0.2 setosa purple
6 5.4 3.9 1.7 0.4 setosa purple
7 4.6 3.4 1.4 0.3 setosa purple
8 5.0 3.4 1.5 0.2 setosa purple
9 4.4 2.9 1.4 0.2 setosa purple
10 4.9 3.1 1.5 0.1 setosa purple
11 5.4 3.7 1.5 0.2 setosa purple
12 4.8 3.4 1.6 0.2 setosa purple
13 4.8 3.0 1.4 0.1 setosa purple
14 4.3 3.0 1.1 0.1 setosa purple
15 5.8 4.0 1.2 0.2 setosa purple
16 5.7 4.4 1.5 0.4 setosa purple
17 5.4 3.9 1.3 0.4 setosa purple
18 5.1 3.5 1.4 0.3 setosa purple
19 5.7 3.8 1.7 0.3 setosa purple
20 5.1 3.8 1.5 0.3 setosa purple
21 5.4 3.4 1.7 0.2 setosa purple
22 5.1 3.7 1.5 0.4 setosa purple
23 4.6 3.6 1.0 0.2 setosa purple
24 5.1 3.3 1.7 0.5 setosa purple
25 4.8 3.4 1.9 0.2 setosa purple
26 5.0 3.0 1.6 0.2 setosa purple
27 5.0 3.4 1.6 0.4 setosa purple
28 5.2 3.5 1.5 0.2 setosa purple
29 5.2 3.4 1.4 0.2 setosa purple
30 4.7 3.2 1.6 0.2 setosa purple
31 4.8 3.1 1.6 0.2 setosa purple
32 5.4 3.4 1.5 0.4 setosa purple
33 5.2 4.1 1.5 0.1 setosa purple
34 5.5 4.2 1.4 0.2 setosa purple
35 4.9 3.1 1.5 0.2 setosa purple
36 5.0 3.2 1.2 0.2 setosa purple
37 5.5 3.5 1.3 0.2 setosa purple
38 4.9 3.6 1.4 0.1 setosa purple
39 4.4 3.0 1.3 0.2 setosa purple
40 5.1 3.4 1.5 0.2 setosa purple
41 5.0 3.5 1.3 0.3 setosa purple
42 4.5 2.3 1.3 0.3 setosa purple
43 4.4 3.2 1.3 0.2 setosa purple
44 5.0 3.5 1.6 0.6 setosa purple
45 5.1 3.8 1.9 0.4 setosa purple
46 4.8 3.0 1.4 0.3 setosa purple
47 5.1 3.8 1.6 0.2 setosa purple
48 4.6 3.2 1.4 0.2 setosa purple
49 5.3 3.7 1.5 0.2 setosa purple
50 5.0 3.3 1.4 0.2 setosa purple
51 7.0 3.2 4.7 1.4 versicolor
52 6.4 3.2 4.5 1.5 versicolor
53 6.9 3.1 4.9 1.5 versicolor
54 5.5 2.3 4.0 1.3 versicolor
55 6.5 2.8 4.6 1.5 versicolor
56 5.7 2.8 4.5 1.3 versicolor
57 6.3 3.3 4.7 1.6 versicolor
58 4.9 2.4 3.3 1.0 versicolor
59 6.6 2.9 4.6 1.3 versicolor
60 5.2 2.7 3.9 1.4 versicolor
61 5.0 2.0 3.5 1.0 versicolor
62 5.9 3.0 4.2 1.5 versicolor
63 6.0 2.2 4.0 1.0 versicolor
64 6.1 2.9 4.7 1.4 versicolor
65 5.6 2.9 3.6 1.3 versicolor
66 6.7 3.1 4.4 1.4 versicolor
67 5.6 3.0 4.5 1.5 versicolor
68 5.8 2.7 4.1 1.0 versicolor
69 6.2 2.2 4.5 1.5 versicolor
70 5.6 2.5 3.9 1.1 versicolor
71 5.9 3.2 4.8 1.8 versicolor
72 6.1 2.8 4.0 1.3 versicolor
73 6.3 2.5 4.9 1.5 versicolor
74 6.1 2.8 4.7 1.2 versicolor
75 6.4 2.9 4.3 1.3 versicolor
76 6.6 3.0 4.4 1.4 versicolor
77 6.8 2.8 4.8 1.4 versicolor
78 6.7 3.0 5.0 1.7 versicolor
79 6.0 2.9 4.5 1.5 versicolor
80 5.7 2.6 3.5 1.0 versicolor
81 5.5 2.4 3.8 1.1 versicolor
82 5.5 2.4 3.7 1.0 versicolor
83 5.8 2.7 3.9 1.2 versicolor
84 6.0 2.7 5.1 1.6 versicolor
85 5.4 3.0 4.5 1.5 versicolor
86 6.0 3.4 4.5 1.6 versicolor
87 6.7 3.1 4.7 1.5 versicolor
88 6.3 2.3 4.4 1.3 versicolor
89 5.6 3.0 4.1 1.3 versicolor
90 5.5 2.5 4.0 1.3 versicolor
91 5.5 2.6 4.4 1.2 versicolor
92 6.1 3.0 4.6 1.4 versicolor
93 5.8 2.6 4.0 1.2 versicolor
94 5.0 2.3 3.3 1.0 versicolor
95 5.6 2.7 4.2 1.3 versicolor
96 5.7 3.0 4.2 1.2 versicolor
97 5.7 2.9 4.2 1.3 versicolor
98 6.2 2.9 4.3 1.3 versicolor
99 5.1 2.5 3.0 1.1 versicolor
100 5.7 2.8 4.1 1.3 versicolor
101 6.3 3.3 6.0 2.5 virginica
102 5.8 2.7 5.1 1.9 virginica
103 7.1 3.0 5.9 2.1 virginica
104 6.3 2.9 5.6 1.8 virginica
105 6.5 3.0 5.8 2.2 virginica
106 7.6 3.0 6.6 2.1 virginica
107 4.9 2.5 4.5 1.7 virginica
108 7.3 2.9 6.3 1.8 virginica
109 6.7 2.5 5.8 1.8 virginica
110 7.2 3.6 6.1 2.5 virginica
111 6.5 3.2 5.1 2.0 virginica
112 6.4 2.7 5.3 1.9 virginica
113 6.8 3.0 5.5 2.1 virginica
114 5.7 2.5 5.0 2.0 virginica
115 5.8 2.8 5.1 2.4 virginica
116 6.4 3.2 5.3 2.3 virginica
117 6.5 3.0 5.5 1.8 virginica
118 7.7 3.8 6.7 2.2 virginica
119 7.7 2.6 6.9 2.3 virginica
120 6.0 2.2 5.0 1.5 virginica
121 6.9 3.2 5.7 2.3 virginica
122 5.6 2.8 4.9 2.0 virginica
123 7.7 2.8 6.7 2.0 virginica
124 6.3 2.7 4.9 1.8 virginica
125 6.7 3.3 5.7 2.1 virginica
126 7.2 3.2 6.0 1.8 virginica
127 6.2 2.8 4.8 1.8 virginica
128 6.1 3.0 4.9 1.8 virginica
129 6.4 2.8 5.6 2.1 virginica
130 7.2 3.0 5.8 1.6 virginica
131 7.4 2.8 6.1 1.9 virginica
132 7.9 3.8 6.4 2.0 virginica
133 6.4 2.8 5.6 2.2 virginica
134 6.3 2.8 5.1 1.5 virginica
135 6.1 2.6 5.6 1.4 virginica
136 7.7 3.0 6.1 2.3 virginica
137 6.3 3.4 5.6 2.4 virginica
138 6.4 3.1 5.5 1.8 virginica
139 6.0 3.0 4.8 1.8 virginica
140 6.9 3.1 5.4 2.1 virginica
141 6.7 3.1 5.6 2.4 virginica
142 6.9 3.1 5.1 2.3 virginica
143 5.8 2.7 5.1 1.9 virginica
144 6.8 3.2 5.9 2.3 virginica
145 6.7 3.3 5.7 2.5 virginica
146 6.7 3.0 5.2 2.3 virginica
147 6.3 2.5 5.0 1.9 virginica
148 6.5 3.0 5.2 2.0 virginica
149 6.2 3.4 5.4 2.3 virginica
150 5.9 3.0 5.1 1.8 virginica
Key Points
Lists are a standard data structure in R in which each element can contain any other R object.
Lists can contain elements of different classes, unlike atomic vectors.
Data frames are a specific type of list in which all elements are vectors of the same length. Different columns can contain data of different classes.
Use object[[x]] to select a single element from a list.
Each element of a list can be assigned a name that can be addressed using the $ operator (e.g. mylist$element1).
Different indexing methods can be combined to efficiently extract desired data subsets for further analysis.