Complete all exercises below by filling in your answers in the designated spaces. When finished:
Due: Wednesday, January 28 by 11:59 PM
R has several fundamental data classes: logical, character, and
numeric (which includes integer and double). You can check an object’s
class with class() or test for a specific class with
functions like is.logical(), is.character(),
and is.numeric().
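As a brief illustration of these functions, here is a minimal sketch. The values and the variable name greeting are made up for this example and are not part of the exercises.

```r
# Hypothetical example value, not taken from the exercises
greeting <- "hello"

class(greeting)         # "character"
is.character(greeting)  # TRUE
is.logical(greeting)    # FALSE

# is.numeric() is TRUE for both doubles and integers
is.numeric(2.5)         # TRUE
is.numeric(7L)          # TRUE
```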
For each of the following, first predict the class, then use
class() to verify:
TRUE
"TRUE"
3.14159
100L
1:5
In your own words, explain the difference between TRUE and "TRUE":
Delete this text and write your explanation here.
There are several ways to create vectors in R:
c() combines values into a vector
: creates a sequence of integers (for example, 1:5)
seq() creates a sequence with more control over the output
Create the following vectors using the most appropriate method:
length.out)
By default, R treats whole numbers as doubles (floating-point
numbers), not integers. To explicitly create an integer, you append
L to the number.
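The effect of the L suffix can be sketched with a small example. The variables x and y are hypothetical, and typeof() is a base R function, not mentioned above, that reports how a value is stored.

```r
x <- 42   # no suffix: stored as a double
y <- 42L  # L suffix: stored as an integer

typeof(x)        # "double"
typeof(y)        # "integer"

x == y           # TRUE: the values are equal
identical(x, y)  # FALSE: the storage types differ
```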
Run the following code and explain the output:
is.integer(5)
is.integer(5L)
is.double(5)
is.double(5L)
Why does is.integer(5) return FALSE even
though 5 is clearly a whole number?
Delete this text and write your explanation here.
You can assign names to vector elements using the
names() function. This makes it easier to understand what
each value represents.
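For instance, a named vector might be built roughly like this. The vector scores and its names are invented for illustration and are unrelated to the exercise that follows.

```r
# Hypothetical data: three test scores
scores <- c(90, 85, 77)

# Attach one name per element, in the same order as the values
names(scores) <- c("alice", "bob", "carol")

scores          # prints the values with their names as labels
scores["bob"]   # elements can now be selected by name
names(scores)   # "alice" "bob" "carol"
```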
Create a vector called temps containing the values 32,
72, 98, and 212. Then assign the names “freezing”, “room”, “body”, and
“boiling” to the elements. Finally, print the named vector.
Factors are used to represent categorical data in R. They look like character vectors but are stored as integers with associated levels. This distinction matters for statistical modeling and plotting.
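The integer-plus-levels storage described above can be seen in a short sketch. The vector colors is a made-up example, not data from this assignment.

```r
# Hypothetical character vector
colors <- c("red", "blue", "red", "green")
colors_factor <- factor(colors)

class(colors_factor)       # "factor"
typeof(colors_factor)      # "integer"
levels(colors_factor)      # "blue" "green" "red"
as.integer(colors_factor)  # 3 1 3 2: each code is a position in levels()
```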
The built-in mtcars dataset contains a column
cyl representing the number of cylinders in each car’s
engine. Although stored as numeric, this is really categorical data
(cars come with 4, 6, or 8 cylinders - not 5.5).
Load mtcars and examine the cyl column.
What class is it currently?
Create a new variable cyl_factor by converting
mtcars$cyl to a factor.
Use levels() to see the factor levels. Use
table() to count how many cars have each number of
cylinders.
Use str() to examine the structure of your factor.
Notice how it’s stored as integers (1, 2, 3) that map to the
levels.
In your own words, why might it be useful to treat number of cylinders as a factor rather than a numeric variable?
Delete this text and write your explanation here.
Factor levels have an order, which affects how they appear in tables, plots, and model output. By default, R orders levels alphabetically (for characters) or numerically.
Consider tree size categories: “Large”, “Medium”, “Small”. Alphabetical order would display them as Large, Medium, Small - which happens to be backwards from a logical small-to-large progression.
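The same issue can be sketched with a different, hypothetical set of categories. The vector ratings and the object names below are invented for this illustration.

```r
# Hypothetical ratings, deliberately not in any particular order
ratings <- c("low", "high", "medium", "low", "high")

# Default: levels are sorted alphabetically
default_factor <- factor(ratings)
levels(default_factor)  # "high" "low" "medium"

# Supplying levels explicitly sets the order used by tables and plots
custom_factor <- factor(ratings, levels = c("low", "medium", "high"))
levels(custom_factor)   # "low" "medium" "high"
table(custom_factor)    # counts appear in the order low, medium, high
```

Only the order of the levels changes here; the underlying values and their counts stay the same.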
Create a character vector called sizes with these
values: “Medium”, “Small”, “Large”, “Small”, “Medium”, “Large”, “Large”,
“Small”, “Medium”, “Large”
Convert sizes to a factor and use
levels() to check the order. Notice they’re
alphabetical.
Use table() to count how many trees are in each size
category. Notice the table follows alphabetical order.
Recreate the factor with levels in a logical order: “Small”,
“Medium”, “Large”. Hint: use the levels argument inside
factor().
Run table() again and confirm the output now
displays in your specified order.
Why might the order of factor levels matter when creating plots or running statistical models?
Delete this text and write your explanation here.
Before submitting, make sure you have:
This homework is worth 6 points total.