- Data frames
- Lists
- Matrices
- Arrays
Jan 28, 2026
“…tightly coupled collections of variables…used as the fundamental data structure by most of R’s modeling software.”—data.frame() help page.
- The mtcars and trees datasets were data frames.
- Data frames are created with data.frame().
- Columns should all have the same length; use NA to help with this.
Birds <- data.frame(Type = c("BirdofPrey", "BirdofPrey", "SongBird",
                             "Shorebird", "Flightless", "SongBird",
                             "Flightless"),
                    Num. = c(3, 15, 42, 8, 0, 29, 17),
                    Extinct = c(FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE),
                    row.names = c("Eagle", "Hawk", "Bluebird",
                                  "Egret", "Dodo", "Blue Jay", "Kiwi"))
Copy the Birds data frame from Blackboard for the remaining exercises.
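As a small sketch of how NA can help when building a data frame, assuming the point is that columns need matching lengths: the species names and wingspans below are hypothetical and not part of the course data.

```r
# Hypothetical field notes: three species, but only two wingspans were measured.
species  <- c("Osprey", "Wren", "Plover")
wingspan <- c(1.6, 0.15)

# data.frame(Species = species, Wingspan = wingspan) would stop with an error,
# because the two columns imply different numbers of rows (3 and 2).

# Padding the short column with NA makes the lengths match.
notes <- data.frame(Species = species, Wingspan = c(wingspan, NA))
notes

# Most summary functions need to be told to skip the NA.
mean(notes$Wingspan, na.rm = TRUE)
```

Without `na.rm = TRUE`, `mean()` would return NA for this column.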
Birds
##                Type Num. Extinct
## Eagle    BirdofPrey    3   FALSE
## Hawk     BirdofPrey   15   FALSE
## Bluebird   SongBird   42   FALSE
## Egret     Shorebird    8   FALSE
## Dodo     Flightless    0    TRUE
## Blue Jay   SongBird   29   FALSE
## Kiwi     Flightless   17   FALSE
Did data.frame() do what we wanted?
median(Num.)
## Error:
## ! object 'Num.' not found
median(Birds$Num.)
## [1] 15
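The column lives inside the data frame, so R has to be told where to look. A minimal sketch of three equivalent ways to reach it follows; the object is rebuilt here with only the Num. column so the block runs on its own.

```r
# Rebuild just the column needed for this example.
Birds <- data.frame(Num. = c(3, 15, 42, 8, 0, 29, 17),
                    row.names = c("Eagle", "Hawk", "Bluebird",
                                  "Egret", "Dodo", "Blue Jay", "Kiwi"))

m_dollar  <- median(Birds$Num.)          # $ extracts one column by name
m_bracket <- median(Birds[["Num."]])     # [[ ]] does the same with a string
m_with    <- with(Birds, median(Num.))   # with() evaluates inside the data frame

c(m_dollar, m_bracket, m_with)
```

All three return the same value; `[[ ]]` is handy when the column name is stored in a variable.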
class(Birds$Type)
## [1] "character"
class(Birds$Num.)
## [1] "numeric"
It’s extremely useful to be able to access the names of a data frame. For a data frame, names() or colnames() gives the column (variable) names, and rownames() returns the row names.
names(Birds)
## [1] "Type" "Num." "Extinct"
colnames(Birds)
## [1] "Type" "Num." "Extinct"
rownames(Birds)
## [1] "Eagle" "Hawk" "Bluebird" "Egret" "Dodo" "Blue Jay" "Kiwi"
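The same functions can also be used on the left-hand side of an assignment to change names. The sketch below works on a small copy called Birds2 (a name made up for this example) so the original Birds object is left alone.

```r
# A small stand-in with the same column names as Birds.
Birds2 <- data.frame(Type = c("BirdofPrey", "SongBird", "Flightless"),
                     Num. = c(3, 42, 0),
                     row.names = c("Eagle", "Bluebird", "Dodo"))

# Rename one column by matching on its current name.
names(Birds2)[names(Birds2) == "Num."] <- "Count"

# Replace all row names at once.
rownames(Birds2) <- toupper(rownames(Birds2))

names(Birds2)
rownames(Birds2)
```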
Data frames are actually lists of equal-length vectors. They’re just displayed differently.
class(Birds)                 # Can be user defined, but has defaults
typeof(Birds)                # Unchangeable
class(Birds) <- "BirdFrame"  # Arbitrary, I just made that up.
## [1] "data.frame"
## [1] "list"
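A quick way to convince yourself that a data frame is a list underneath is to ask it list-style questions. This is a minimal sketch on a made-up data frame named small.

```r
small <- data.frame(x = 1:3, y = c("a", "b", "c"))

is.list(small)    # TRUE: a data frame passes the list test
length(small)     # 2: length counts columns (list elements), not rows
lengths(small)    # each element is a vector of length 3

# Changing the class changes how the object is treated and printed,
# but the underlying storage type stays the same.
class(small) <- "BirdFrame"
type_after <- typeof(small)   # still "list"

class(small) <- "data.frame"  # put the class back
```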
While data frames are probably the most common structure for temporary data storage and viewing, lists are probably most often used for more complex data. They are extremely flexible and useful, but a little tricky to work with.
Let’s look at our Birds data frame transformed to a list:
as.list(Birds)
## $Type
## [1] "BirdofPrey" "BirdofPrey" "SongBird"   "Shorebird"  "Flightless"
## [6] "SongBird"   "Flightless"
##
## $Num.
## [1]  3 15 42  8  0 29 17
##
## $Extinct
## [1] FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
##
## attr(,"class")
## [1] "BirdFrame"
## attr(,"row.names")
## [1] "Eagle"    "Hawk"     "Bluebird" "Egret"    "Dodo"     "Blue Jay" "Kiwi"
It’s the same exact information, just not as pretty.
as. functions
?as.vector
?as.numeric
?as.list # etc.
as.list(Birds)
An example:
as.numeric(Birds)
## Error:
## ! 'list' object cannot be coerced to type 'double'
Why doesn’t this work?
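One way to explore the question, offered as a sketch rather than the official answer: a data frame is a list of whole columns, and as.numeric() cannot flatten that into a single numeric vector, but it can convert one column at a time. The three-row data frame b below is a cut-down stand-in for Birds.

```r
b <- data.frame(Type = c("BirdofPrey", "SongBird", "Flightless"),
                Num. = c(3, 42, 0),
                Extinct = c(FALSE, FALSE, TRUE))

# Each column has its own class.
sapply(b, class)

# Converting a single column works: FALSE becomes 0 and TRUE becomes 1.
extinct_num <- as.numeric(b$Extinct)

# Converting the whole data frame fails; tryCatch() records that
# instead of stopping the script.
outcome <- tryCatch({
  as.numeric(b)
  "converted"
}, error = function(e) "error")
outcome
```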
Let’s create our own list. First, find some values to use.
letters
month.name
pi
seq(5, 95, by = 10)
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
## [20] "t" "u" "v" "w" "x" "y" "z"
##  [1] "January"   "February"  "March"     "April"     "May"       "June"
##  [7] "July"      "August"    "September" "October"   "November"  "December"
## [1] 3.141593
##  [1]  5 15 25 35 45 55 65 75 85 95
Use list() to create a list and remember to use the “tag = value” structure.
list(alpha = letters, Months = month.name, PI = pi, fives = seq(5, 95, by=10))
## $alpha
##  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
## [20] "t" "u" "v" "w" "x" "y" "z"
##
## $Months
##  [1] "January"   "February"  "March"     "April"     "May"       "June"
##  [7] "July"      "August"    "September" "October"   "November"  "December"
##
## $PI
## [1] 3.141593
##
## $fives
##  [1]  5 15 25 35 45 55 65 75 85 95
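Once the list exists, the tags are what make it easy to get things back out. A minimal sketch follows, storing the same list under the made-up name mylist.

```r
mylist <- list(alpha = letters, Months = month.name,
               PI = pi, fives = seq(5, 95, by = 10))

names(mylist)            # the tags

mylist$PI                # $ pulls out one element by tag
mylist[["fives"]][2]     # [[ ]] extracts the element, then [2] indexes into it

# Single brackets keep the list wrapper; double brackets remove it.
class(mylist["PI"])      # "list"
class(mylist[["PI"]])    # "numeric"

# Elements do not need to be the same length, unlike data frame columns.
lengths(mylist)
```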
Create matrices with matrix().
?matrix
nine <- matrix(1:9, nrow = 3) # by column by default
nine
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
nine <- matrix(1:9, nrow = 3, byrow = TRUE)
nine
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
nine <- matrix(as.character(1:9), nrow = 3, byrow = TRUE) # Characters are fine too
nine
##      [,1] [,2] [,3]
## [1,] "1"  "2"  "3"
## [2,] "4"  "5"  "6"
## [3,] "7"  "8"  "9"
nine <- matrix(1:9, nrow = 3, byrow = TRUE) # Change back to numeric
nine
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
Trying to mix data types in a matrix results in (typically unwanted) coercion.
nine[1,1]
## [1] 1
nine[1,1] <- "Ten"
nine
##      [,1]  [,2] [,3]
## [1,] "Ten" "2"  "3"
## [2,] "4"   "5"  "6"
## [3,] "7"   "8"  "9"
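The coercion can be watched step by step with typeof(). This is a small sketch on a made-up 2 x 2 matrix m: a matrix holds a single type, so assigning one value of a more general type converts every element.

```r
m <- matrix(1:4, nrow = 2)
t1 <- typeof(m)     # "integer"

m[1, 1] <- 2.5      # one decimal value
t2 <- typeof(m)     # "double": every element is now stored as a double

m[1, 2] <- "Ten"    # one string
t3 <- typeof(m)     # "character": every element is now a string

c(t1, t2, t3)
m
```

The conversion only goes toward the more general type, so the numbers do not come back automatically if the string is later overwritten with a number.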
As another example, if we coerce our Birds data frame to a matrix, R converts every column to character so that the result is a valid matrix.
class(Birds) <- "data.frame" # Undo the "BirdFrame" class set earlier
as.matrix(Birds)
##          Type         Num. Extinct
## Eagle    "BirdofPrey" " 3" "FALSE"
## Hawk     "BirdofPrey" "15" "FALSE"
## Bluebird "SongBird"   "42" "FALSE"
## Egret    "Shorebird"  " 8" "FALSE"
## Dodo     "Flightless" " 0" "TRUE"
## Blue Jay "SongBird"   "29" "FALSE"
## Kiwi     "Flightless" "17" "FALSE"
Similar to a data frame, you can name the rows and columns. In addition, you can name the dimensions, but the names have to be passed as a list.
nine <- matrix(1:9, nrow = 3, byrow = TRUE,
               dimnames = list(Foo = LETTERS[1:3], Bar = month.abb[1:3]))
nine
##    Bar
## Foo Jan Feb Mar
##   A   1   2   3
##   B   4   5   6
##   C   7   8   9
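With names in place, elements can be selected by name instead of by position. A minimal sketch, rebuilding the same named matrix so the block runs on its own:

```r
nine <- matrix(1:9, nrow = 3, byrow = TRUE,
               dimnames = list(Foo = LETTERS[1:3], Bar = month.abb[1:3]))

nine["B", "Feb"]        # one cell, by row and column name
nine[, "Mar"]           # a whole column, returned as a named vector
nine["A", ]             # a whole row

names(dimnames(nine))   # the names of the dimensions themselves
rownames(nine)          # same as dimnames(nine)$Foo
```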
Create arrays by passing a vector of values and a vector of dimensions to the array() function.
arr1 <- array(1:(3^3), dim = rep(3, 3),
              dimnames = list(Foo = LETTERS[1:3],
                              Bar = month.abb[1:3],
                              Stooge = c("Larry", "Moe", "Curly")))
arr1
## , , Stooge = Larry
##
##    Bar
## Foo Jan Feb Mar
##   A   1   4   7
##   B   2   5   8
##   C   3   6   9
##
## , , Stooge = Moe
##
##    Bar
## Foo Jan Feb Mar
##   A  10  13  16
##   B  11  14  17
##   C  12  15  18
##
## , , Stooge = Curly
##
##    Bar
## Foo Jan Feb Mar
##   A  19  22  25
##   B  20  23  26
##   C  21  24  27
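Indexing an array works like indexing a matrix, with one subscript per dimension. A short sketch using the same arr1, rebuilt here so the block is self-contained:

```r
arr1 <- array(1:(3^3), dim = rep(3, 3),
              dimnames = list(Foo = LETTERS[1:3],
                              Bar = month.abb[1:3],
                              Stooge = c("Larry", "Moe", "Curly")))

arr1["B", "Feb", "Moe"]          # a single cell: row B, column Feb, slice Moe

curly <- arr1[, , "Curly"]       # leaving a subscript empty keeps that whole dimension
dim(curly)                       # one slice is an ordinary 3 x 3 matrix

slice_sums <- apply(arr1, 3, sum) # apply sum() over the third dimension
slice_sums
```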
Make an array from the environmental heterogeneity values in the BCI.env dataset available from the vegan package.
library(vegan) #might have to install.packages("vegan") first
data(BCI.env)
head(BCI.env$EnvHet)
## [1] 0.6272 0.3936 0.0000 0.0000 0.4608 0.0768
length(BCI.env$EnvHet)
## [1] 50
array(BCI.env$EnvHet, dim=c(2,5,5))
## , , 1
##
##        [,1] [,2]   [,3]   [,4] [,5]
## [1,] 0.6272    0 0.4608 0.3808    0
## [2,] 0.3936    0 0.0768 0.2112    0
##
## , , 2
##
##        [,1]   [,2]   [,3]   [,4]   [,5]
## [1,] 0.4032 0.6624 0.0000 0.0000 0.0768
## [2,] 0.0000 0.1472 0.4608 0.6592 0.2112
##
## , , 3
##
##        [,1]   [,2]   [,3]   [,4]   [,5]
## [1,] 0.2688 0.6240 0.6080 0.0000 0.6528
## [2,] 0.2112 0.4352 0.3648 0.3328 0.6144
##
## , , 4
##
##        [,1]   [,2]   [,3]   [,4]   [,5]
## [1,] 0.4928 0.0768 0.3328 0.3648 0.0000
## [2,] 0.7264 0.0000 0.4032 0.0000 0.6208
##
## , , 5
##
##        [,1]   [,2]   [,3]   [,4]   [,5]
## [1,] 0.4032 0.0768 0.3424 0.3648 0.4992
## [2,] 0.1472 0.5568 0.1472 0.4608 0.6368
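The dimensions c(2, 5, 5) work because 2 x 5 x 5 = 50, the length of the vector. The sketch below uses the integers 1 to 50 as a stand-in for BCI.env$EnvHet (so it runs without vegan) to show the fill order and what happens when the dimensions do not match the length.

```r
x <- 1:50                      # stand-in for the 50 EnvHet values
dims <- c(2, 5, 5)
prod(dims) == length(x)        # TRUE: every value gets exactly one cell

a <- array(x, dim = dims)

# Values fill down the rows first, then across columns, then into the next slice.
a[2, 1, 1]                     # 2nd value
a[1, 2, 1]                     # 3rd value
a[1, 1, 2]                     # 11th value: first cell of the second slice

# If the dimensions ask for fewer cells, the extra values are dropped
# without a warning, so it is worth checking prod(dim) first.
short <- array(x, dim = c(2, 5, 4))
length(short)                  # 40
```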