Programming in R: Supplemental Lessons

Subsetting Lists and Matrices

Overview

Teaching: 20 min
Exercises: 10 min
Questions
  • How do I select a subsection of my list or matrix?

Objectives
  • Learn how to select subsections of lists and matrices by absolute reference and logical operators

This lesson is a continuation of the core lesson [Subsetting Data](https://carriebrown.github.io/r-novice-gapminder/06-data-subsetting/).

## Matrix subsetting

Matrices are also subsetted using the `[` function. In this case
it takes two arguments: the first applies to the rows, the second
to the columns:

~~~
set.seed(1)
m <- matrix(rnorm(6*4), ncol=4, nrow=6)
m[3:4, c(3,1)]
~~~
{: .r}

~~~
            [,1]       [,2]
[1,]  1.12493092 -0.8356286
[2,] -0.04493361  1.5952808
~~~
{: .output}

You can leave the first or second argument blank to retrieve all the
rows or columns respectively:

~~~
m[, c(3,4)]
~~~
{: .r}

~~~
            [,1]        [,2]
[1,] -0.62124058  0.82122120
[2,] -2.21469989  0.59390132
[3,]  1.12493092  0.91897737
[4,] -0.04493361  0.78213630
[5,] -0.01619026  0.07456498
[6,]  0.94383621 -1.98935170
~~~
{: .output}

If we only access one row or column, R will automatically convert the result
to a vector:

~~~
m[3,]
~~~
{: .r}

~~~
[1] -0.8356286  0.5757814  1.1249309  0.9189774
~~~
{: .output}

If you want to keep the output as a matrix, you need to specify a *third* argument,
`drop = FALSE`:

~~~
m[3, , drop=FALSE]
~~~
{: .r}

~~~
           [,1]      [,2]     [,3]      [,4]
[1,] -0.8356286 0.5757814 1.124931 0.9189774
~~~
{: .output}

Unlike vectors, if we try to access a row or column outside of the matrix,
R will throw an error:

~~~
m[, c(3,6)]
~~~
{: .r}

~~~
Error in m[, c(3, 6)]: subscript out of bounds
~~~
{: .error}

> ## Tip: Higher dimensional arrays
>
> When dealing with multi-dimensional arrays, each argument to `[`
> corresponds to a dimension. For example, in a 3D array, the first three
> arguments correspond to the rows, columns, and depth dimension.
>
{: .callout}

Because matrices are vectors, we can also subset using only one argument:

~~~
m[5]
~~~
{: .r}

~~~
[1] 0.3295078
~~~
{: .output}

This usually isn't useful, and is often confusing to read. However, it is worth
noting that matrices are laid out in *column-major format* by default.
That is, the elements of the vector are arranged column-wise:

~~~
matrix(1:6, nrow=2, ncol=3)
~~~
{: .r}

~~~
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
~~~
{: .output}

If you wish to populate the matrix by row, use `byrow=TRUE`:

~~~
matrix(1:6, nrow=2, ncol=3, byrow=TRUE)
~~~
{: .r}

~~~
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
~~~
{: .output}

Matrices can also be subsetted using their row names and column names
instead of their row and column indices.

> ## Challenge 4
>
> Given the following code:
>
> ~~~
> m <- matrix(1:18, nrow=3, ncol=6)
> print(m)
> ~~~
> {: .r}
>
> ~~~
>      [,1] [,2] [,3] [,4] [,5] [,6]
> [1,]    1    4    7   10   13   16
> [2,]    2    5    8   11   14   17
> [3,]    3    6    9   12   15   18
> ~~~
> {: .output}
>
> 1. Which of the following commands will extract the values 11 and 14?
>
> A. `m[2,4,2,5]`
>
> B. `m[2:5]`
>
> C. `m[4:5,2]`
>
> D. `m[2,c(4,5)]`
>
> > ## Solution to challenge 4
> >
> > D
> {: .solution}
{: .challenge}

## List subsetting

Now we'll introduce some new subsetting operators. There are three functions
used to subset lists: `[`, as we've seen for atomic vectors and matrices,
as well as `[[` and `$`.

Using `[` will always return a list. If you want to *subset* a list, but not
*extract* an element, then you will likely use `[`.

~~~
xlist <- list(a = "Software Carpentry", b = 1:10, data = head(iris))
xlist[1]
~~~
{: .r}

~~~
$a
[1] "Software Carpentry"
~~~
{: .output}

This returns a *list with one element*.

We can subset elements of a list exactly the same way as atomic
vectors using `[`. Comparison operations, however, won't work because they are
not recursive: they try to condition on the data structures in each element
of the list, not on the individual elements within those data structures.

~~~
xlist[1:2]
~~~
{: .r}

~~~
$a
[1] "Software Carpentry"

$b
 [1]  1  2  3  4  5  6  7  8  9 10
~~~
{: .output}

To extract individual elements of a list, you need to use the double-square
bracket function: `[[`.

~~~
xlist[[1]]
~~~
{: .r}

~~~
[1] "Software Carpentry"
~~~
{: .output}

Notice that now the result is a vector, not a list.

You can't extract more than one element at once:

~~~
xlist[[1:2]]
~~~
{: .r}

~~~
Error in xlist[[1:2]]: subscript out of bounds
~~~
{: .error}

Nor use it to skip elements:

~~~
xlist[[-1]]
~~~
{: .r}

~~~
Error in xlist[[-1]]: attempt to select more than one element in get1index
~~~
{: .error}

But you can use names to both subset and extract elements:

~~~
xlist[["a"]]
~~~
{: .r}

~~~
[1] "Software Carpentry"
~~~
{: .output}

The `$` function is a shorthand way of extracting elements by name:

~~~
xlist$data
~~~
{: .r}

~~~
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
~~~
{: .output}

> ## Challenge 5
> Given the following list:
>
> ~~~
> xlist <- list(a = "Software Carpentry", b = 1:10, data = head(iris))
> ~~~
> {: .r}
>
> Using your knowledge of both list and vector subsetting, extract the number 2 from xlist.
> Hint: the number 2 is contained within the "b" item in the list.
>
> > ## Solution to challenge 5
> >
> > ~~~
> > xlist$b[2]
> > ~~~
> > {: .r}
> >
> > ~~~
> > [1] 2
> > ~~~
> > {: .output}
> >
> > ~~~
> > xlist[[2]][2]
> > ~~~
> > {: .r}
> >
> > ~~~
> > [1] 2
> > ~~~
> > {: .output}
> >
> > ~~~
> > xlist[["b"]][2]
> > ~~~
> > {: .r}
> >
> > ~~~
> > [1] 2
> > ~~~
> > {: .output}
> {: .solution}
{: .challenge}

> ## Challenge 6
> Given a linear model:
>
> ~~~
> mod <- aov(pop ~ lifeExp, data=gapminder)
> ~~~
> {: .r}
>
> Extract the residual degrees of freedom (hint: `attributes()` will help you)
>
> > ## Solution to challenge 6
> >
> > ~~~
> > attributes(mod) ## `df.residual` is one of the names of `mod`
> > ~~~
> > {: .r}
> >
> > ~~~
> > mod$df.residual
> > ~~~
> > {: .r}
> {: .solution}
{: .challenge}
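
The matrix section mentions subsetting by row and column names, and the tip mentions
higher dimensional arrays, but neither is shown in code. The following is a minimal
sketch of both, plus a logical subset. The objects `named_m` and `arr` and the
names `"r1"`, `"r2"`, `"a"`, `"b"`, `"c"` are made up for illustration and are not
part of the lesson data.

~~~
# Hypothetical matrix with illustrative row and column names
named_m <- matrix(1:6, nrow = 2, ncol = 3,
                  dimnames = list(c("r1", "r2"), c("a", "b", "c")))

# Subset by name: row "r2", columns "a" and "c" (the values 2 and 6)
by_name <- named_m["r2", c("a", "c")]

# drop = FALSE keeps a single named column as a 2 x 1 matrix
one_col <- named_m[, "b", drop = FALSE]

# Logical subsetting: the comparison yields a logical matrix, and
# indexing with it returns the matching values as a vector (5 and 6)
big_values <- named_m[named_m > 4]

# 3D array: one index per dimension (row, column, depth)
arr <- array(1:24, dim = c(2, 3, 4))
last_value <- arr[2, 3, 4]   # the last element, 24
~~~
{: .r}

Subsetting by name tends to be easier to read than subsetting by position,
and it keeps working if the rows or columns are later reordered.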

Key Points

  • Matrices and lists can be subsetted using the [ function.

  • Subsetting a list with the [ function will return a list.

  • To extract a single object from the list, use the [[ function.