R Data Classes

Jan 21, 2026

Data Classes and Structures

Logical, integer, double, character values
Vectors
Factors

Logical Values

We already discussed logical operators, which are used in logical expressions, which in turn yield logical values.

Logical Values

is.logical(FALSE)
is.logical(1)
is.logical(!1)

## [1] TRUE
## [1] FALSE
## [1] TRUE

# Check out
?is.logical
# for more detail

More generally, you can check the class of an object with the function class().

a<-TRUE
class(a)

## [1] "logical"

e<-"stumpy"
class(e)

## [1] "character"
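As a minimal sketch of how a logical expression yields a logical value, consider the following. The names `temp` and `is_warm` are illustrative assumptions, not objects from these slides.

```r
# Hypothetical value chosen only for illustration
temp <- 5

# A comparison is a logical expression; its result is a logical value
is_warm <- temp > 3

is_warm              # TRUE
class(is_warm)       # "logical"
is.logical(is_warm)  # TRUE

# `!` negates. The number 1 is treated as TRUE, so !1 is FALSE,
# which is why is.logical(!1) returns TRUE.
not_one <- !1
not_one              # FALSE
class(not_one)       # "logical"
```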

Characters

is.character("stumpy")

## [1] TRUE

is.character(2) # numeric

## [1] FALSE

is.character("2")

## [1] TRUE

Characters

is.character(TRUE) # logical

## [1] FALSE

is.character("TRUE")

## [1] TRUE

is.character(pi) # built-in constant

## [1] FALSE

Characters

is.character('stumpy') # single quotes

## [1] TRUE

is.character('is stumpy lumpy?') # OK with spaces and special characters

## [1] TRUE

Characters

You can do all sorts of stuff with characters you can’t do with other classes (e.g., regular expressions).

# Check out
?is.character
# for more detail
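A short sketch of character-only operations, including a regular expression, might look like this. The vector `pets` and its contents are made up for illustration; the functions are all from base R.

```r
# Hypothetical example strings
pets <- c("stumpy", "lumpy", "grumpy cat")

# Count the characters in each string
n_chars <- nchar(pets)
n_chars            # 6 5 10

# Change case and glue strings together
toupper(pets[1])                # "STUMPY"
paste(pets[1], "is", pets[2])   # "stumpy is lumpy"

# Regular expression: which strings end in "umpy"?
ends_in_umpy <- grepl("umpy$", pets)
ends_in_umpy       # TRUE TRUE FALSE

# Replace the first match of a pattern in each string
renamed <- sub("umpy", "UMPY", pets)
renamed            # "stUMPY" "lUMPY" "grUMPY cat"
```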

Numeric

Numeric values can be confusing because they come in two main types:

integer
double

Double values are typically numbers with a decimal point.
In R, 1 (no decimal) could be either an integer or a double.
By default R understands 1 as a double.
R understands 1.0 as a double value, always.
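One way to see the difference is to compare `typeof()` and `class()` on the same numbers. This is only a sketch; `one_dbl` and `one_int` are illustrative names.

```r
# typeof() reports the storage type
typeof(1)      # "double"
typeof(1.0)    # "double"
typeof(1L)     # "integer"

# class() groups doubles under "numeric"
class(1)       # "numeric"
class(1L)      # "integer"

# Both types count as numeric
is.numeric(1)  # TRUE
is.numeric(1L) # TRUE

# Same value, different storage type
one_dbl <- 1
one_int <- 1L
same_value <- one_dbl == one_int            # TRUE
same_object <- identical(one_dbl, one_int)  # FALSE
```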

Double

is.double(1.0)

## [1] TRUE

is.double(1) # Confusing.

## [1] TRUE

The weird “L”: marks a number as an integer

is.double(1L) # Even more confusing: what's the 'L' all about?

## [1] FALSE

is.numeric(1L) # Makes some sense

## [1] TRUE

Explicitly defining or checking for double values is usually not that useful. Unless a number carries the L suffix or comes from an integer-producing operation such as `:`, it is a double.

Integer

is.integer(1.0)
is.integer(1) # Need "L"
is.integer(1L)
is.integer(1:10) # implied integer vector because we used `:`

## [1] FALSE
## [1] FALSE
## [1] TRUE
## [1] TRUE
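The integer type does not always survive arithmetic. The sketch below (variable names are illustrative) shows which common operations keep integers and which return doubles.

```r
sum_int <- 2L + 3L     # integer + integer stays integer
sum_mix <- 2L + 3      # integer + double becomes double
ratio   <- 5L / 2L     # `/` returns a double, even for integers
quot    <- 5L %/% 2L   # integer division keeps integers

typeof(sum_int)  # "integer"
typeof(sum_mix)  # "double"
typeof(ratio)    # "double"
ratio            # 2.5
typeof(quot)     # "integer"
quot             # 2

# as.integer() truncates toward zero rather than rounding
trunc_pos <- as.integer(3.9)   # 3
trunc_neg <- as.integer(-3.9)  # -3
```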

Vectors

a data structure
simply a 1-dimensional series of values
composed of logical, character, or numeric values
values within the same vector all have to be the same class

x<-1:10
is.vector(x)
is.integer(x)
class(x)

## [1] TRUE
## [1] TRUE
## [1] "integer"
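Because all values in a vector must share one class, `c()` silently coerces mixed inputs to the most flexible type present. A minimal sketch, with made-up vector names:

```r
# Any character value turns everything into character
mixed_chr <- c(1, "a", TRUE)
mixed_chr          # "1" "a" "TRUE"
class(mixed_chr)   # "character"

# Logical values become 1 and 0 when mixed with numbers
mixed_num <- c(1.5, TRUE, FALSE)
mixed_num          # 1.5 1.0 0.0
class(mixed_num)   # "numeric"

# Integer mixed with double becomes double
mixed_dbl <- c(1L, 2.5)
typeof(mixed_dbl)  # "double"
```

The order of flexibility is logical, then integer, then double, then character.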

Creating vectors with `c()`

a<-c(T, T, F, F, F)
b<-c("Environ.", "Science", "and", "Forestry")
e<-c(1,2,3,4,5) # Gives double
e<-1:5 # Gives integer
d<-e + 0.25 # double
d

## [1] 1.25 2.25 3.25 4.25 5.25

Creating numeric vectors

You can create numeric vectors in a variety of ways. By far the most common are to use seq() and seq_along().

seq(from=0, to=100, by=10)

##  [1]   0  10  20  30  40  50  60  70  80  90 100

seq(0, 100, 10)

##  [1]   0  10  20  30  40  50  60  70  80  90 100

y<-10:5
seq_along(y)

## [1] 1 2 3 4 5 6

Creating numeric vectors

You can instead specify the number of values you want to produce.

seq(from=0, by=10, length.out=10)

##  [1]  0 10 20 30 40 50 60 70 80 90

length.out = the desired length of the output vector
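A few more sequence helpers, sketched here with illustrative names, show why `seq_along()` and `seq_len()` are often preferred over `1:length(x)` when a vector might be empty.

```r
# Fix both endpoints and let R work out the step size
quarters <- seq(0, 1, length.out = 5)
quarters                  # 0.00 0.25 0.50 0.75 1.00

# seq_len(n) gives 1, 2, ..., n
seq_len(3)                # 1 2 3

# With an empty vector, seq_along() returns an empty sequence ...
empty <- character(0)
safe_index <- seq_along(empty)
length(safe_index)        # 0, so a loop over it runs zero times

# ... whereas 1:length() counts down from 1 to 0
risky_index <- 1:length(empty)
risky_index               # 1 0, so a loop over it would run twice
```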

Creating empty vectors

Use class-specific functions to create ‘empty’ vectors of a certain type:

logical(4)
character(5)
numeric(6)
integer(7)

## [1] FALSE FALSE FALSE FALSE
## [1] "" "" "" "" ""
## [1] 0 0 0 0 0 0
## [1] 0 0 0 0 0 0 0

Sometimes, you’ll create empty vectors just before they are filled because R often has a difficult time editing data structures that don’t exist yet.
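A typical pattern is to create the empty vector first and then fill it inside a loop. This is a minimal sketch; `n`, `squares`, and `not_yet` are hypothetical names.

```r
n <- 5

# Pre-allocate a numeric vector of zeros with the final length
squares <- numeric(n)
squares            # 0 0 0 0 0

# Fill it one element at a time
for (i in seq_len(n)) {
  squares[i] <- i^2
}
squares            # 1 4 9 16 25

# Assigning into an element of an object that does not exist fails:
# not_yet[1] <- 10   # Error: object 'not_yet' not found
```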

Naming elements in vectors

y<-1:3
y

## [1] 1 2 3

names(y)<-c("Do", "Re", "Mi")
y

## Do Re Mi 
##  1  2  3

each value gets a name

Naming vectors

y<-1:3
names(y)<-c("Do", "Re", "Mi", "Fa") # Too many names
# Error in names(y) <- c("Do", "Re", "Mi", "Fa") : 
#   'names' attribute [4] must be the same length as the vector [3]

Two ways to use `names()`

Access

names(y)
## [1] "Do" "Re" "Mi"

Access & Assignment (Initial or Replacement)

names(y)<-c("Do", "Re", "Mi")

Most name-related functions, such as colnames() and rownames(), can be used in these same two ways.
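Once a vector has names, they can also be used to pull out or rename individual elements. The sketch below uses a hypothetical vector `notes` so that `y` from the slides is left untouched.

```r
notes <- 1:3
names(notes) <- c("Do", "Re", "Mi")

# Extract by name: single brackets keep the name, double brackets drop it
notes["Re"]
re_value <- notes[["Re"]]
re_value                  # 2

# Extract several elements, in any order
picked <- notes[c("Mi", "Do")]
names(picked)             # "Mi" "Do"

# Replace a single name by indexing into names()
names(notes)[2] <- "Ray"
names(notes)              # "Do" "Ray" "Mi"

# Remove all names
names(notes) <- NULL
is.null(names(notes))     # TRUE
```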

Factors

Examples of Factors

Treatment, Control
Placebo, Non-placebo
NO₂ added vs. no NO₂ added
Bachelor’s, Master’s, PhD
High, Medium, Low
5, 50, 500 mg penicillin (could also be numeric)

Factor Properties

hml<-c("high", "low", "medium", "low", "medium", "high")
hml

## [1] "high"   "low"    "medium" "low"    "medium" "high"

class(hml)

## [1] "character"

Factor Properties

hml<-factor(hml)
class(hml)

## [1] "factor"

Levels of a factor are the unique labels given to factor values.

hml

## [1] high   low    medium low    medium high  
## Levels: high low medium

Each factor value is actually an integer that is linked to a level.

char<-as.character(rep(c("High", "Med", "Low"), each=100))
head(char)

## [1] "High" "High" "High" "High" "High" "High"

object.size(char)

## 2616 bytes

object.size(factor(char))

## 1832 bytes

To illustrate the links between the vector of integers and factor labels let’s coerce the factor to integers.

hml

## [1] high   low    medium low    medium high  
## Levels: high low medium

as.integer(hml)

## [1] 1 2 3 2 3 1

When we first used factor(), R sorted the unique values alphabetically to create the levels, so it assigned high to 1, low to 2, and medium to 3.

Factor levels can be accessed using the levels() function.

levels(hml) #Access the character values representing the levels

## [1] "high"   "low"    "medium"
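The link between integer codes and levels can be made explicit by indexing the levels with the codes. The sketch below uses hypothetical factors `sizes` and `dose`; the second part illustrates a common pitfall when factor labels look like numbers.

```r
sizes <- factor(c("high", "low", "medium", "low"))

codes <- as.integer(sizes)   # the stored integers
labs  <- levels(sizes)       # the labels they point to

codes                        # 1 2 3 2
labs                         # "high" "low" "medium"
labs[codes]                  # "high" "low" "medium" "low"
nlevels(sizes)               # 3

# Pitfall: as.integer() returns the codes, not the numbers in the labels
dose <- factor(c("5", "50", "500", "50"))
dose_codes  <- as.integer(dose)                # 1 2 3 2
dose_values <- as.numeric(as.character(dose))  # 5 50 500 50
```

Converting through `as.character()` first recovers the numeric values written in the labels.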

Unordered Factors

hml

## [1] high   low    medium low    medium high  
## Levels: high low medium

table(hml)

## hml
##   high    low medium 
##      2      2      2

Unordered Factors

hml

## [1] high   low    medium low    medium high  
## Levels: high low medium

hml[1] > hml[2] # high should be greater than low

## Warning in Ops.factor(hml[1], hml[2]): '>' not meaningful for factors

## [1] NA

`str()`

str(hml) #Check the STRucture of hml

##  Factor w/ 3 levels "high","low","medium": 1 2 3 2 3 1

You can even have factor levels that aren’t represented in the data.

hml<-factor(hml, levels=c("low", "medium", "high", "super high"), ordered=T)
hml

## [1] high   low    medium low    medium high  
## Levels: low < medium < high < super high

table(hml)

## hml
##        low     medium       high super high 
##          2          2          2          0

#Store backup
hml.bu<-hml
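Once a factor is ordered, the comparisons that previously returned NA become meaningful. A minimal sketch with a separate, hypothetical factor `sizes`:

```r
sizes <- factor(
  c("high", "low", "medium"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)

is.ordered(sizes)              # TRUE

# Comparisons now follow the order of the levels
high_vs_low <- sizes[1] > sizes[2]
high_vs_low                    # TRUE

# Compare every element against a level given as a string
at_least_medium <- sizes >= "medium"
at_least_medium                # TRUE FALSE TRUE

# max() and sort() also respect the level order
top <- as.character(max(sizes))
top                            # "high"
sorted <- as.character(sort(sizes))
sorted                         # "low" "medium" "high"
```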

You can’t combine new levels with c() like you can with a character, numeric, or logical vector.

hml<-c("super low", hml) # Unwanted coercion to character
hml

## [1] "super low" "3"         "1"         "2"         "1"         "2"        
## [7] "3"

class(hml)

## [1] "character"

hml<-hml.bu # Restore

Remember why this happens?

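All values in a vector must share one class, so c() discards the factor's levels and coerces its underlying integer codes to character. R does not treat “super low” as a new factor level of hml.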

What’s a way to do this correctly?

hml<-factor(hml, levels=c("super low", levels(hml)), ordered=T)
hml

## [1] high   low    medium low    medium high  
## Levels: super low < low < medium < high < super high

hml<-hml.bu # Restore
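Another possible approach, sketched here on a hypothetical factor `sizes`, is to append a level with `levels<-` and only then assign values that use it. `droplevels()` removes levels that end up unused.

```r
sizes <- factor(
  c("high", "low", "medium"),
  levels = c("low", "medium", "high")
)

# Append a new level; existing values are unchanged
levels(sizes) <- c(levels(sizes), "super high")
levels(sizes)          # "low" "medium" "high" "super high"

# The new level can now be assigned without producing NA
sizes[2] <- "super high"
current <- as.character(sizes)
current                # "high" "super high" "medium"

# "low" is no longer used, so droplevels() removes it
trimmed <- droplevels(sizes)
levels(trimmed)        # "medium" "high" "super high"
```

Assigning a value that is not among the levels would instead produce NA with a warning.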

Exercise: Milestones in Quantum Computing

Goal: practice working with factors by creating, modifying, and reordering factor levels.

1. Download the R code to create a data frame called quant_prog with three columns:
- Year: numeric vector (e.g., c(1980, 1981, 1994, 1996, 2007, 2012, 2019))
- Event: character vector briefly describing a milestone
- Category: character vector indicating whether it was a theoretical, algorithmic, or commercial milestone

head(quant_prog)

##   Year                                        Event   Category
## 1 1980   Feasibility of quantum computing described     Theory
## 2 1981       Simulating quantum processes in nature     Theory
## 3 1994        Shor's factoring algorithm introduced  Algorithm
## 4 1996         Grover's search algorithm introduced  Algorithm
## 5 2007 First 'commercial' quantum computer (D-Wave) Commercial
## 6 2012          Quantum supremacy concept described     Theory

Next,

2. Convert Category into a factor and check its current levels.

3. Change the levels so that they appear in the following order:
- “Theory”
- “Algorithm”
- “Commercial”

4. Create a summary table (e.g., using table()) to see how many entries belong to each category.

Solution

2. Convert Category into a factor and check its current levels.

# 2. Convert Category to a factor
quant_prog$Category <- factor(quant_prog$Category)
quant_prog$Category

## [1] Theory     Theory     Algorithm  Algorithm  Commercial Theory     Commercial
## Levels: Algorithm Commercial Theory

levels(quant_prog$Category)

## [1] "Algorithm"  "Commercial" "Theory"

Solution continued

3. Change the levels so that they appear in the following order:
- “Theory”
- “Algorithm”
- “Commercial”

# 3. Reorder the levels
quant_prog$Category <- factor(
  quant_prog$Category,
  levels = c("Theory", "Algorithm", "Commercial")
)
quant_prog$Category

## [1] Theory     Theory     Algorithm  Algorithm  Commercial Theory     Commercial
## Levels: Theory Algorithm Commercial

levels(quant_prog$Category)

## [1] "Theory"     "Algorithm"  "Commercial"

Solution continued

4. Create a summary table (e.g., using table()) to see how many entries belong to each category.

# 4. Summary table
table(quant_prog$Category)

## 
##     Theory  Algorithm Commercial 
##          3          2          2
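For readers without the downloaded code, the steps can be tried on a small stand-in. The sketch below is an assumption-laden substitute: `quant_demo` is a hypothetical name, and it contains only the six rows printed by head() earlier, so its counts differ from the full solution.

```r
# Stand-in data: only the six rows shown by head(quant_prog)
quant_demo <- data.frame(
  Year = c(1980, 1981, 1994, 1996, 2007, 2012),
  Event = c(
    "Feasibility of quantum computing described",
    "Simulating quantum processes in nature",
    "Shor's factoring algorithm introduced",
    "Grover's search algorithm introduced",
    "First 'commercial' quantum computer (D-Wave)",
    "Quantum supremacy concept described"
  ),
  Category = c("Theory", "Theory", "Algorithm",
               "Algorithm", "Commercial", "Theory"),
  stringsAsFactors = FALSE
)

# Steps 2 and 3 in one call: convert and set the level order
quant_demo$Category <- factor(
  quant_demo$Category,
  levels = c("Theory", "Algorithm", "Commercial")
)

# Step 4: counts are reported in level order
counts <- table(quant_demo$Category)
counts   # Theory 3, Algorithm 2, Commercial 1 (six rows only)
```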
