MATH 3610

Biological Statistics

Datasets

Right-click on any file below to download it or copy the URL for use in R, Excel, or other software.

RMarkdown Starter Kit

Tip: Knit to HTML first — it causes far fewer problems than knitting directly to PDF. If you need a PDF, open the HTML file in your browser and print to PDF from there (File → Print → Save as PDF). Set your YAML output to html_document as shown in the skeleton below.

Every RMarkdown document begins with a YAML header (the block between the --- lines) and contains a mix of R code chunks and written commentary. Here is the minimum skeleton:

---
title: "Your Title"
author: "Your Name"
date: "`r Sys.Date()`"
output: html_document
---

```{r setup, include=FALSE}
library(dplyr)
mydata <- read.csv("URL or filepath")
```

## Question 1

```{r}
# Your R code here
```

**Your interpretation goes here.**

A few things to remember:

Project Templates

These starter files have the YAML header, data import, and question sections already set up. Open one in RStudio and fill in your code and interpretations.

Worked Exemplar

This completed RMarkdown file analyzes the Temperature and Heart Rate dataset and demonstrates what a finished project should look like — code, output, and written interpretations together.

R Command Reference

Commands are grouped by the modules where they are introduced. If you need a command from a later module, you may need to define a custom function first — see the project templates for those.

Modules 01–05: Data Manipulation and Descriptive Statistics

| Command | What It Does |
| --- | --- |
| read.csv("url") | Read a CSV file into a data frame |
| library(dplyr) | Load the dplyr package for data manipulation |
| filter(df, condition) | Subset rows that meet a condition (requires dplyr) |
| df$variable | Access a single column from a data frame |
| mean(x) | Arithmetic mean |
| median(x) | Median |
| sd(x) | Sample standard deviation |
| var(x) | Sample variance |
| cor(x, y) | Pearson correlation coefficient between two variables |
| length(x) | Number of elements in a vector |
| sum(x) | Sum of elements (also counts TRUE values in a logical vector) |
| c(a, b, c) | Combine values into a vector |
| hist(x) | Histogram |
| boxplot(x, horizontal=TRUE) | Boxplot (horizontal orientation) |
| plot(x, y) | Scatterplot |
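As an illustration of the Modules 01–05 commands, here is a minimal sketch using small made-up vectors. The values and the names gestation and bodywt are hypothetical and do not come from a course dataset. It uses only base R, so it runs without loading any package.

```r
# Hypothetical data: gestation length (days) and body weight (kg) for six mammals
gestation <- c(21, 63, 110, 150, 280, 340)
bodywt    <- c(0.02, 4, 90, 60, 70, 500)

n      <- length(gestation)    # number of observations
avg    <- mean(gestation)      # arithmetic mean
mid    <- median(gestation)    # median
spread <- sd(gestation)        # sample standard deviation (n - 1 denominator)

# sum() on a logical vector counts the TRUE values
n_long <- sum(gestation > 100)

# var() is the square of sd()
all.equal(var(gestation), spread^2)

r <- cor(gestation, bodywt)    # Pearson correlation by default

hist(gestation)                        # histogram
boxplot(gestation, horizontal = TRUE)  # horizontal boxplot
plot(bodywt, gestation)                # scatterplot: x first, then y
```

With a real dataset, replace the two vectors with columns such as mydata$variable.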

Modules 06–08: Probability Distributions and Sampling

| Command | What It Does |
| --- | --- |
| dbinom(x, n, p) | Binomial probability: P(X = x) |
| pbinom(x, n, p) | Cumulative binomial probability: P(X ≤ x) |
| qbinom(prob, n, p) | Binomial quantile: smallest x where P(X ≤ x) ≥ prob |
| dpois(x, lambda) | Poisson probability: P(X = x) |
| ppois(x, lambda) | Cumulative Poisson probability: P(X ≤ x) |
| dgeom(x, p) | Geometric probability: P(X = x), where x counts failures before the first success |
| pgeom(x, p) | Cumulative geometric probability: P(X ≤ x), where x counts failures before the first success |
| punif(x, min, max) | Cumulative uniform probability: P(X ≤ x) |
| qunif(prob, min, max) | Uniform quantile: value where P(X ≤ x) = prob |
| pnorm(x, mean, sd) | Cumulative normal probability: P(X ≤ x) |
| qnorm(prob, mean, sd) | Normal quantile: value where P(X ≤ x) = prob |
| sqrt(x) | Square root |
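The d, p, and q prefixes follow the same pattern across distributions. The sketch below shows that pattern with arbitrary illustrative numbers (10 trials, success probability 0.5, and so on), which are assumptions chosen for the example and not values from the course.

```r
# Binomial: 10 trials, success probability 0.5 (hypothetical values)
p_exactly_3  <- dbinom(3, 10, 0.5)       # P(X = 3)
p_at_most_3  <- pbinom(3, 10, 0.5)       # P(X <= 3)
p_at_least_4 <- 1 - pbinom(3, 10, 0.5)   # P(X >= 4) = 1 - P(X <= 3)

# Normal: area to the left of 1.96 on the standard normal curve
p_below <- pnorm(1.96, mean = 0, sd = 1)

# qnorm() undoes pnorm(): it returns the value with the given area to its left
cutoff <- qnorm(p_below, mean = 0, sd = 1)   # recovers 1.96

# Geometric: R counts failures before the first success,
# so dgeom(0, p) is the probability of success on the very first trial
p_first_try <- dgeom(0, 0.2)
```

For "at least" or "greater than" probabilities, subtract the cumulative probability from 1, taking care to use x - 1 for discrete distributions as in the P(X ≥ 4) line.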

Module 09: Assessing Normality

| Command | What It Does |
| --- | --- |
| stripchart(x, method="stack") | Stacked dot plot for spotting outliers and skewness |
| qqnorm(x) | Normal quantile plot; points should fall close to a straight line if the data are approximately normal |
| qqline(x) | Add the reference line to a normal quantile plot |
| rnorm(n, mean, sd) | Generate n random values from a normal distribution |
| sample(x, size) | Draw a random sample of a given size from a vector (without replacement by default) |
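One way to get a feel for what a normal quantile plot of truly normal data looks like is to simulate some. This is a sketch under assumed values: the seed, sample size, mean, and standard deviation below are arbitrary choices for illustration.

```r
set.seed(3610)                       # arbitrary seed so the example is repeatable
x <- rnorm(50, mean = 70, sd = 5)    # 50 simulated values from a normal distribution

stripchart(x, method = "stack")      # stacked dot plot
qqnorm(x)                            # normal quantile plot
qqline(x)                            # reference line added to the same plot

sub <- sample(x, 10)                 # random sample of 10 values, without replacement
```

Run qqline(x) right after qqnorm(x) in the same chunk so the line is drawn on the existing plot.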

Modules 10–11: Hypothesis Testing and Multiple Testing

| Command | What It Does |
| --- | --- |
| one.samp.t.test.sum(...) | One-sample t-test from summary statistics (custom function; see project templates) |
| one.samp.t.test.data(...) | One-sample t-test from raw data (custom function; see project templates) |
| one.samp.prop.test(...) | One-sample proportion z-test (custom function; see project templates) |
| pt(t, df) | Cumulative t-distribution probability: P(T ≤ t) |
| Ps * k | Bonferroni adjustment: multiply each p-value in Ps by the number of tests k (report any result above 1 as 1) |
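The sketch below shows how pt() and the Bonferroni adjustment might be used directly. It does not use the course's custom functions, and the t statistic, degrees of freedom, and p-values are hypothetical numbers chosen for illustration. The comparison with p.adjust() is an addition here; p.adjust() is a base R function that performs the same adjustment.

```r
# Two-sided p-value from a t statistic (hypothetical values)
t_stat   <- 2.5
deg_free <- 14
p_two_sided <- 2 * pt(-abs(t_stat), deg_free)   # area in both tails

# Bonferroni adjustment for k tests (hypothetical p-values)
Ps <- c(0.01, 0.04, 0.50)
k  <- length(Ps)
adjusted <- pmin(1, Ps * k)                     # multiply by k, then cap at 1

# Base R's p.adjust() gives the same result
all.equal(adjusted, p.adjust(Ps, method = "bonferroni"))
```

Capping at 1 matters only for large p-values, since a probability cannot exceed 1.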

Module 12: Two-Sample Tests

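| Command | What It Does |
| --- | --- |
| t.test(x, y, alternative=...) | Two-sample t-test for independent samples (Welch test by default) |
| t.test(x, y, paired=TRUE) | Paired t-test |
| prop.test(c(x1,x2), c(n1,n2)) | Two-sample proportion test (x = number of successes, n = sample size) |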
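Here is a minimal sketch of the three two-sample commands. All measurements and counts are made up for illustration, and the names group_a and group_b are hypothetical.

```r
# Hypothetical measurements for two groups
group_a <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5)
group_b <- c(4.2, 4.8, 5.0, 4.4, 5.1, 4.6)

# Independent samples; R uses the Welch (unequal-variance) test by default
res_ind <- t.test(group_a, group_b, alternative = "two.sided")

# Paired version: treats the vectors as matched pairs in the same order
res_paired <- t.test(group_a, group_b, paired = TRUE)

# Two proportions: 18 successes out of 50 versus 30 out of 60 (hypothetical counts)
res_prop <- prop.test(c(18, 30), c(50, 60))

# Each result stores its p-value
res_ind$p.value
res_paired$p.value
res_prop$p.value
```

Use the paired version only when each value in the first vector is matched to the value in the same position of the second, such as before and after measurements on the same subject.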

Module 13: Regression

| Command | What It Does |
| --- | --- |
| lm(y ~ x1 + x2) | Fit a simple or multiple linear regression model |
| summary(model) | Coefficients, p-values, and R-squared for a fitted model |
| coefficients(model) | Extract the regression coefficients |
| fitted(model) | Predicted values from the model |
| resid(model) | Residuals (observed − predicted) |
| plot(fitted(m), resid(m)) | Residual plot to check model assumptions |
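The regression commands fit together as in the sketch below. It assumes R's built-in cars dataset (stopping distance versus speed) rather than a course dataset, and the dashed zero line from abline() is an optional extra that is not in the table above.

```r
# cars is a dataset built into R: stopping distance (dist) versus speed
model <- lm(dist ~ speed, data = cars)

summary(model)                 # coefficients, p-values, R-squared
b <- coefficients(model)       # intercept and slope

# Residual = observed - predicted, so fitted + residual gives back the data
all.equal(unname(fitted(model) + resid(model)), cars$dist)

# Residual plot: look for random scatter around zero
plot(fitted(model), resid(model))
abline(h = 0, lty = 2)         # dashed reference line at zero
```

A curved pattern or a funnel shape in the residual plot suggests the linear model's assumptions may not hold.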

Common Errors

These are the errors students run into most often when knitting RMarkdown documents. If your document won't knit, start here.

Object not found

Error in mean(mam$gestation) : object 'mam' not found

Your data frame hasn't been created yet. This usually means you didn't run the setup chunk. In RStudio, click the green arrow on the setup chunk first, or knit the whole document (the setup chunk runs automatically when knitting).

Could not find function

Error in filter(mam, bodywt < 100) : could not find function "filter"

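The package that contains this function hasn't been loaded. Make sure library(dplyr) is in your setup chunk and that the setup chunk has been run. Note that base R has its own, unrelated function named filter, so for filter() specifically you may see a different error (for example, that an object such as bodywt was not found) when dplyr is not loaded. The fix is the same.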

Unexpected end of input

Error: unexpected end of input

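R reached the end of your code while still expecting more. Either a code chunk is missing its closing ```, or a line of code is missing a closing parenthesis, bracket, or brace. Scroll through your .Rmd file and make sure every chunk that opens with ```{r} has a matching ``` on its own line, then check that every ( has a matching ).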

Unexpected symbol / Unexpected string constant

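Error: unexpected symbol in "mean(mam$gest ation"

There's a typo or extra space in your R code. In this example, the stray space inside the column name gestation splits it into two names, which R cannot parse. (A space directly after the $ is allowed in R, so look for spaces inside names.) Check for stray spaces, missing commas, or unmatched parentheses.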

Non-numeric argument to binary operator

Error in x * y : non-numeric argument to binary operator

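You're trying to do math on something that isn't a number. This often happens when a column was read in as text (character) instead of numbers, for example because one entry contains letters or a stray symbol, or when a number is wrapped in quotes. Check the column's type with class() or str(), and double-check your variable names against the data frame.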
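One common way this error arises is a numeric column that R has stored as text. The sketch below shows how to check for that and convert; the vector bodywt and its values are hypothetical.

```r
# Hypothetical column that was read in as text because of a stray entry
bodywt <- c("12.5", "8.1", "n/a", "20.3")

class(bodywt)                    # "character", so bodywt * 2 would fail

# Convert to numbers; entries that are not numbers become NA
bodywt_num <- suppressWarnings(as.numeric(bodywt))

doubled <- bodywt_num * 2                    # arithmetic now works; NA stays NA
avg     <- mean(bodywt_num, na.rm = TRUE)    # skip the missing value
```

Without suppressWarnings(), as.numeric() warns that NAs were introduced, which is a useful signal that some entries in the column need attention.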

Knitting produces a blank or incomplete document

(No error message — document just looks wrong)

Make sure your written text is outside code chunks. Text inside a chunk is treated as R code. Also check that you have a blank line before and after each ## heading.

Cannot open connection / Cannot open URL

Error in file(file, "rt") : cannot open the connection

R can't reach the data file. If you're loading from a URL, make sure you're connected to the internet and the URL is correct. If loading a local file, check that the file path is right and the file is in your working directory.
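For a local file, a quick check along these lines can show whether R is looking in the folder you expect. The file name mydata.csv is a placeholder, not a course file.

```r
path <- "mydata.csv"        # hypothetical file name; replace with your own

getwd()                     # the folder R is currently looking in

if (file.exists(path)) {
  mydata <- read.csv(path)
} else {
  message("File not found in ", getwd(), ": ", path)
}
```

When knitting, the working directory is normally the folder that contains the .Rmd file, so keeping the data file in that same folder is usually the simplest fix.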

Plots not appearing in output

(No error — the plot just doesn't show up in the knitted document)

Make sure the plotting command is inside a code chunk. Also check that the chunk doesn't have include=FALSE or eval=FALSE in its header, which would suppress the output.