Jan 21, 2026

Data Classes and Structures

  • Logical, integer, double, character values
  • Vectors
  • Factors

Logical Values

  • We already discussed logical operators, which are used in logical expressions, which yield logical values.

Logical Values

is.logical(FALSE)
is.logical(1)
is.logical(!1)
## [1] TRUE
## [1] FALSE
## [1] TRUE

# Check out
?is.logical
# for more detail

More generally, you can check the class of an object with the function class().

a<-TRUE
class(a)
## [1] "logical"
e<-"stumpy"
class(e)
## [1] "character"

Characters

is.character("stumpy")
## [1] TRUE
is.character(2) # numeric
## [1] FALSE
is.character("2")
## [1] TRUE

Characters

is.character(TRUE) # logical
## [1] FALSE
is.character("TRUE")
## [1] TRUE
is.character(pi) # built-in constant
## [1] FALSE

Characters

is.character('stumpy') # single quotes
## [1] TRUE
is.character('is stumpy lumpy?') # OK with spaces and special characters
## [1] TRUE

Characters

Character values support operations that other classes do not, such as pattern matching with regular expressions.

# Check out
?is.character
# for more detail
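
As a minimal sketch of the kind of character-only operations mentioned above, the following uses base R string functions. The vector pets and its values are hypothetical and only serve as an illustration.

```r
# Hypothetical example strings (not from the slides)
pets <- c("stumpy", "lumpy", "grumpy cat", "rex")

# Regular expression: does each string end in "umpy"?
ends_umpy <- grepl("umpy$", pets)
ends_umpy                      # TRUE TRUE FALSE FALSE

# Replace the first match of a pattern in each string
shouted <- sub("umpy", "UMPY", pets)
shouted                        # "stUMPY" "lUMPY" "grUMPY cat" "rex"

# Count the characters in each string (spaces included)
n_chars <- nchar(pets)
n_chars                        # 6 5 10 3
```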

Numeric

Numeric values can be confusing because they come in two main types:

  • integer
  • double

  • Double values are typically numbers with a decimal point.
  • In R, a number written without a decimal, such as 1, could in principle be either an integer or a double.
  • By default, R stores 1 as a double.
  • R always stores 1.0 as a double.

Double

is.double(1.0) 
## [1] TRUE
is.double(1) # Confusing.
## [1] TRUE

The weird “L”: marks a number as an integer

is.double(1L) # Even more confusing: what's the 'L' all about?
## [1] FALSE
is.numeric(1L) # Makes some sense
## [1] TRUE

Explicitly defining or checking for double values is usually not that useful: unless you use the L suffix or an integer-producing shortcut such as `:`, R stores numbers as doubles.

Integer

is.integer(1.0)
is.integer(1) # Need "L"
is.integer(1L)
is.integer(1:10) # implied integer vector because we used `:`
## [1] FALSE
## [1] FALSE
## [1] TRUE
## [1] TRUE
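
The integer and double checks above can be summarized with typeof(), which reports the storage type directly. This is a small sketch; the variable names are only illustrative.

```r
# typeof() reports how a value is stored
type_plain  <- typeof(1)     # "double"
type_suffix <- typeof(1L)    # "integer"
type_colon  <- typeof(1:10)  # "integer"

# class() groups doubles under the broader label "numeric"
class_plain <- class(1)      # "numeric"

# as.integer() converts a double by dropping the fractional part
truncated <- as.integer(3.9) # 3

# Same value, different type: == compares values, identical() also compares types
same_value  <- 1L == 1            # TRUE
same_object <- identical(1L, 1)   # FALSE
```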

Vectors

  • a data structure
  • simply a 1-dimensional series of values
  • composed of logical, character, or numeric values
  • values within the same vector all have to be the same class
x<-1:10
is.vector(x)
is.integer(x)
class(x)
## [1] TRUE
## [1] TRUE
## [1] "integer"
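
Because all values in a vector must share one class, c() silently converts mixed inputs to the most flexible class present. A short sketch of that coercion, using made-up values:

```r
# Mixing classes: everything is coerced to one common class
mixed_chr <- c(1, "a", TRUE)
mixed_chr          # "1" "a" "TRUE"
class(mixed_chr)   # "character"

mixed_num <- c(1.5, TRUE)   # TRUE becomes 1
mixed_num          # 1.5 1.0
class(mixed_num)   # "numeric"

mixed_int <- c(2L, FALSE)   # FALSE becomes 0L
mixed_int          # 2 0
class(mixed_int)   # "integer"
```

The order of flexibility is logical, then integer, then double, then character.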

Creating vectors with c()

a<-c(TRUE, TRUE, FALSE, FALSE, FALSE)
b<-c("Environ.", "Science", "and", "Forestry")
e<-c(1,2,3,4,5) # Gives double
e<-1:5 # Gives integer
d<-e + 0.25 # double
d
## [1] 1.25 2.25 3.25 4.25 5.25

Creating numeric vectors

You can create numeric vectors in a variety of ways. Two of the most common are seq() and seq_along().

seq(from=0, to=100, by=10)
##  [1]   0  10  20  30  40  50  60  70  80  90 100
seq(0, 100, 10)
##  [1]   0  10  20  30  40  50  60  70  80  90 100
y<-10:5
seq_along(y)
## [1] 1 2 3 4 5 6
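
One reason seq_along() is often preferred over 1:length(x) shows up with empty vectors. The sketch below uses a hypothetical empty vector to illustrate the difference.

```r
# Hypothetical empty vector
empty <- character(0)

# 1:length(empty) is 1:0, which counts DOWN and yields two values
countdown <- 1:length(empty)
countdown          # 1 0

# seq_along() correctly returns a zero-length index
safe_index <- seq_along(empty)
length(safe_index) # 0
```

In a for loop, the first form would run twice on an empty vector, while the second would not run at all.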

Creating numeric vectors

You can instead specify the number of values you want to produce.

seq(from=0, by=10, length.out=10)
##  [1]  0 10 20 30 40 50 60 70 80 90
  • length.out = the desired length of the output vector
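
The length.out argument can also be combined with from and to, in which case seq() works out the step size itself. A brief sketch with illustrative variable names:

```r
# Fix the endpoints and the number of values; seq() computes the step
fifths <- seq(from = 0, to = 1, length.out = 5)
fifths             # 0.00 0.25 0.50 0.75 1.00

# Fix the start, the step, and the number of values
tens <- seq(from = 0, by = 10, length.out = 10)
length(tens)       # 10
tens[10]           # 90

# seq_len(n) is a shortcut for the integers 1 to n
first_five <- seq_len(5)
first_five         # 1 2 3 4 5
```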

Creating empty vectors

  • Use class-specific functions to create ‘empty’ vectors of a certain type:
logical(4)
character(5)
numeric(6)
integer(7)
## [1] FALSE FALSE FALSE FALSE
## [1] "" "" "" "" ""
## [1] 0 0 0 0 0 0
## [1] 0 0 0 0 0 0 0
  • Sometimes you’ll create an empty vector just before filling it, because R cannot assign into elements of an object that doesn’t exist yet.
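
A minimal sketch of that create-then-fill pattern, assuming a simple loop that stores squared values (the names squares and i are illustrative):

```r
# Create an 'empty' numeric vector of the final length
squares <- numeric(5)

# Fill it one element at a time
for (i in seq_along(squares)) {
  squares[i] <- i^2
}
squares            # 1 4 9 16 25

# Without the first step, element-wise assignment fails:
# not_created[1] <- 1
# Error: object 'not_created' not found
```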

Naming elements in vectors

y<-1:3
y
## [1] 1 2 3
names(y)<-c("Do", "Re", "Mi")
y
## Do Re Mi 
##  1  2  3
  • each value gets a name

Naming vectors

y<-1:3
names(y)<-c("Do", "Re", "Mi", "Fa") # Too many names
# Error in names(y) <- c("Do", "Re", "Mi", "Fa") : 
#   'names' attribute [4] must be the same length as the vector [3]

Two ways to use names()

Access

names(y)
## [1] "Do" "Re" "Mi"

Access & Assignment (Initial or Replacement)

names(y)<-c("Do", "Re", "Mi")
  • Most name-related functions, such as colnames() and rownames(), can be used in the same two ways.
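
Names are useful mainly because they allow elements to be pulled out by label rather than by position. The sketch below rebuilds y and adds a small hypothetical matrix m to show the same access and assignment pattern with colnames().

```r
y <- 1:3
names(y) <- c("Do", "Re", "Mi")

# Access a value by its name
re_value <- y[["Re"]]
re_value           # 2

# Replace a single name by position
names(y)[2] <- "Ray"
names(y)           # "Do" "Ray" "Mi"

# colnames() follows the same pattern (m is a hypothetical matrix)
m <- matrix(1:4, nrow = 2)
colnames(m) <- c("a", "b")   # assignment
colnames(m)                  # access: "a" "b"
```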

Factors

Examples of Factors

  • Treatment, Control
  • Placebo, Non-placebo
  • NO2 added vs. no NO2 added
  • Bachelor’s, Master’s, PhD
  • High, Medium, Low
  • 5, 50, 500 mg penicillin (could also be numeric)

Factor Properties

hml<-c("high", "low", "medium", "low", "medium", "high")
hml
## [1] "high"   "low"    "medium" "low"    "medium" "high"
class(hml)
## [1] "character"

Factor Properties

hml<-factor(hml)
class(hml)
## [1] "factor"

Levels of a factor are the unique labels given to factor values.

hml
## [1] high   low    medium low    medium high  
## Levels: high low medium

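Each factor value is actually an integer that is linked to a level. Storing integers instead of repeated strings can make a factor smaller in memory than the equivalent character vector.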

char<-as.character(rep(c("High", "Med", "Low"), each=100))
head(char)
## [1] "High" "High" "High" "High" "High" "High"
object.size(char)
## 2616 bytes
object.size(factor(char))
## 1832 bytes

To illustrate the links between the vector of integers and factor labels let’s coerce the factor to integers.

hml
## [1] high   low    medium low    medium high  
## Levels: high low medium
as.integer(hml)
## [1] 1 2 3 2 3 1

When we first used factor(), R sorted the levels alphabetically and assigned high to 1, low to 2, and medium to 3.

Factor levels can be accessed using the levels() function.

levels(hml) #Access the character values representing the levels
## [1] "high"   "low"    "medium"
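
The link between integer codes and levels can be made explicit by indexing the levels with the codes. The sketch below also shows a common pitfall with number-like factors such as the penicillin doses mentioned earlier; hml_demo and dose are illustrative names.

```r
hml_demo <- factor(c("high", "low", "medium", "low", "medium", "high"))

# Indexing the levels by the integer codes rebuilds the labels
codes  <- as.integer(hml_demo)       # 1 2 3 2 3 1
labels <- levels(hml_demo)[codes]
labels   # "high" "low" "medium" "low" "medium" "high"

# Pitfall: a factor of number-like labels (hypothetical doses in mg)
dose <- factor(c("5", "50", "500", "50"))
dose_codes  <- as.integer(dose)                 # codes, not doses: 1 2 3 2
dose_values <- as.numeric(as.character(dose))   # the doses: 5 50 500 50
```

Converting through as.character() first recovers the original numbers rather than the internal codes.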

Unordered Factors

hml
## [1] high   low    medium low    medium high  
## Levels: high low medium
table(hml)
## hml
##   high    low medium 
##      2      2      2

Unordered Factors

hml
## [1] high   low    medium low    medium high  
## Levels: high low medium
hml[1] > hml[2] # high should be greater than low
## Warning in Ops.factor(hml[1], hml[2]): '>' not meaningful for factors
## [1] NA

str()

str(hml) #Check the STRucture of hml
##  Factor w/ 3 levels "high","low","medium": 1 2 3 2 3 1

You can even have factor levels that aren’t represented in the data.

hml<-factor(hml, levels=c("low", "medium", "high", "super high"), ordered=TRUE)
hml
## [1] high   low    medium low    medium high  
## Levels: low < medium < high < super high
table(hml)
## hml
##        low     medium       high super high 
##          2          2          2          0
#Store backup
hml.bu<-hml
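
Once the levels are ordered, comparisons that returned NA for the unordered factor become meaningful. A small sketch, using a separate illustrative factor named sizes so that hml is left untouched:

```r
sizes <- factor(c("high", "low", "medium"),
                levels = c("low", "medium", "high"),
                ordered = TRUE)

# Comparisons now follow the level order: low < medium < high
high_gt_low <- sizes[1] > sizes[2]
high_gt_low        # TRUE

# Compare every element against a level given as a string
below_high <- sizes < "high"
below_high         # FALSE TRUE TRUE

# Summaries such as max() respect the ordering too
top_level <- as.character(max(sizes))
top_level          # "high"
```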

You can’t combine new levels with c() like you can with a character, numeric, or logical vector.

hml<-c("super low", hml) # Unwanted coercion to character
hml
## [1] "super low" "3"         "1"         "2"         "1"         "2"        
## [7] "3"
class(hml)
## [1] "character"
hml<-hml.bu # Restore

Remember why this happens?

R does not consider “super low” to be a valid factor level in hml, so c() falls back to the factor’s underlying integer codes and coerces everything to character.

What’s a way to do this correctly?

hml<-factor(hml, levels=c("super low", levels(hml)), ordered=TRUE)
hml
## [1] high   low    medium low    medium high  
## Levels: super low < low < medium < high < super high
hml<-hml.bu # Restore
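
Adding a level is not the same as adding a value that uses it. One possible way to do both, sketched here on a small illustrative factor f rather than on hml, is to convert to character, combine, and rebuild the factor with the full set of levels.

```r
f <- factor(c("low", "high"),
            levels = c("low", "medium", "high"),
            ordered = TRUE)

# Go through character so the labels, not the integer codes, are combined
f2 <- factor(c("super low", as.character(f)),
             levels = c("super low", levels(f)),
             ordered = TRUE)
f2
## [1] super low low       high     
## Levels: super low < low < medium < high

# The new value takes part in ordered comparisons
new_is_lowest <- f2[1] < f2[2]
new_is_lowest      # TRUE
```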

Exercise: Milestones in Quantum Computing

Goal: practice working with factors by creating, modifying, and reordering factor levels.

1. Download the R code to create a data frame called quant_prog with three columns:
- Year: numeric vector (e.g., c(1980, 1981, 1994, 1996, 2007, 2012, 2019))
- Event: character vector briefly describing a milestone
- Category: character vector indicating whether it was a theoretical, algorithmic, or commercial milestone

head(quant_prog)
##   Year                                        Event   Category
## 1 1980   Feasibility of quantum computing described     Theory
## 2 1981       Simulating quantum processes in nature     Theory
## 3 1994        Shor's factoring algorithm introduced  Algorithm
## 4 1996         Grover's search algorithm introduced  Algorithm
## 5 2007 First 'commercial' quantum computer (D-Wave) Commercial
## 6 2012          Quantum supremacy concept described     Theory

Next,

2. Convert Category into a factor and check its current levels.

3. Change the levels so that they appear in the following order:
- “Theory”
- “Algorithm”
- “Commercial”

4. Create a summary table (e.g., using table()) to see how many entries belong to each category.

Solution

2. Convert Category into a factor and check its current levels.

# 2. Convert Category to a factor
quant_prog$Category <- factor(quant_prog$Category)
quant_prog$Category
## [1] Theory     Theory     Algorithm  Algorithm  Commercial Theory     Commercial
## Levels: Algorithm Commercial Theory
levels(quant_prog$Category)
## [1] "Algorithm"  "Commercial" "Theory"

Solution continued

3. Change the levels so that they appear in the following order:
- “Theory”
- “Algorithm”
- “Commercial”

# 3. Reorder the levels
quant_prog$Category <- factor(
  quant_prog$Category,
  levels = c("Theory", "Algorithm", "Commercial")
)
quant_prog$Category
## [1] Theory     Theory     Algorithm  Algorithm  Commercial Theory     Commercial
## Levels: Theory Algorithm Commercial
levels(quant_prog$Category)
## [1] "Theory"     "Algorithm"  "Commercial"

Solution continued

4. Create a summary table (e.g., using table()) to see how many entries belong to each category.

# 4. Summary table
table(quant_prog$Category)
## 
##     Theory  Algorithm Commercial 
##          3          2          2
