NMstats is an R package that contains functions aligned to the Navidi/Monk Elementary & Essential Statistics textbooks. This vignette describes the functions available along with special considerations and examples.
Call the NMstats package:
library(NMstats)
The combs function calculates the number of ways to choose r items from a total of n without replacement and where order does not matter.
Usage
The format is combs(n, r) where n is the total number of items and r is the number of items to choose.
Return
The function returns an integer value that represents the number of combinations of r items chosen from n. If invalid input values are given, an error message is returned.
Examples
Count the number of combinations of 4 items chosen from 10
combs(10, 4)
## [1] 210
Computation of binomial probability where n = 15, p = 0.25, and x = 6.
n <- 15
p <- 0.25
x <- 6
prob_value <- combs(n, x) * p^x * (1 - p)^(n - x)
print(prob_value)
## [1] 0.09174777
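The counts above can be cross-checked in base R. This is a minimal sketch that assumes combs(n, r) returns the usual binomial coefficient n!/(r!(n - r)!), which is consistent with the outputs shown. It uses only base R functions, not NMstats itself.

```r
# Binomial coefficient two ways: built-in choose() and the factorial formula
n_ways <- choose(10, 4)
n_ways_formula <- factorial(10) / (factorial(4) * factorial(6))

# Binomial probability from the example, compared with base R's dbinom()
n <- 15
p <- 0.25
x <- 6
prob_manual <- choose(n, x) * p^x * (1 - p)^(n - x)
prob_dbinom <- dbinom(x, size = n, prob = p)

print(c(n_ways, n_ways_formula))
print(c(prob_manual, prob_dbinom))
```

Both pairs of values should agree, which gives a quick way to confirm a hand computation.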
The perms function calculates the number of ways to choose r items from a total of n without replacement and where order matters.
Usage
The format is perms(n, r) where n is the total number of items and r is the number of items to choose.
Return
The function returns an integer value that represents the number of permutations of r items chosen from n. If invalid input values are given, an error message is returned.
Examples
The number of permutations of 7 items chosen from 12.
perms(12, 7)
## [1] 3991680
Computation of the probability of guessing a 3-digit number (allowing zeros) if none of the digits are repeated.
num_outcomes <- perms(10, 3)
prob_value <- 1/num_outcomes
print(prob_value)
## [1] 0.001388889
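For reference, the permutation count can be reproduced with base R. The helper perms_sketch below is a hypothetical name used only for illustration; it assumes perms(n, r) computes n!/(n - r)!, which matches the outputs above.

```r
# Hypothetical helper: number of ordered selections of r items from n
perms_sketch <- function(n, r) {
  stopifnot(n >= r, r >= 0)
  factorial(n) / factorial(n - r)
}

# 7 items chosen from 12, order matters
p_12_7 <- perms_sketch(12, 7)

# Same count via combinations: choose the items, then order them
p_12_7_alt <- choose(12, 7) * factorial(7)

# Probability of guessing a 3-digit code with no repeated digits
prob_guess <- 1 / perms_sketch(10, 3)

print(c(p_12_7, p_12_7_alt))
print(prob_guess)
```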
The outlier_bounds function computes the lower and upper outlier boundaries using the IQR method.
Usage
The format is outlier_bounds(data) where data is a data set.
Return
The function will print the lower and upper outlier bounds. Variables Lower.bound and Upper.bound are available as return values after execution of this function.
Examples
Print the lower and upper outlier bounds of a data set
data <- c(14, 9, 3, 22, 8, 13, 6)
outlier_bounds(data)
##
## Lower Outlier Bound: -2.75
## Upper Outlier Bound: 23.25
Access the lower and upper outlier bounds from the returned value
result <- outlier_bounds(data)
result$Lower.bound
## [1] -2.75
result$Upper.bound
## [1] 23.25
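The IQR method places the bounds 1.5 IQRs below the first quartile and above the third quartile. The sketch below is an assumption about how such bounds can be computed, not the package's source: it uses base R's quantile() with its default quartile definition, which reproduces the output above. Other quartile conventions (for example, medians of the lower and upper halves) can give different bounds. The name outlier_bounds_sketch is hypothetical.

```r
# Hypothetical helper: IQR-method outlier bounds using R's default quartiles
outlier_bounds_sketch <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)  # Q1 and Q3
  iqr <- q[2] - q[1]
  c(lower = q[1] - 1.5 * iqr, upper = q[2] + 1.5 * iqr)
}

data <- c(14, 9, 3, 22, 8, 13, 6)
bounds <- outlier_bounds_sketch(data)
print(bounds)

# Values outside the bounds would be flagged as outliers (none here)
data[data < bounds["lower"] | data > bounds["upper"]]
```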
The data_range function computes the range of a data set.
Usage
The format is data_range(data) where data is a data set.
Return
The function returns the range of a data set, defined as the minimum subtracted from the maximum.
Examples
Return the range of a data set
data <- c(24, -67, 15, 89, -34, 51, -42, 76)
data_range(data)
## [1] 156
Comparing the ranges of two data sets
data_1 <- c(12.4, -5.3, 8.7, 19.1, 2.9)
data_2 <- c(-23.1, 7.5, 14.2, 3.8, -6.7)
data_range(data_1)
## [1] 24.4
data_range(data_2)
## [1] 37.3
The rel_hist function constructs a relative frequency histogram for a data set.
Usage
The format is rel_hist(data, bins, col, xlab, ylab, main, ybreaks) where data is a data set.
Optional arguments are:
bins is the number of bins to use
col is the fill color
xlab and ylab are the labels for the axes
main is the title of the graph
ybreaks is a numeric vector specifying the tick marks on the y-axis
Return
The function constructs a relative frequency histogram.
Examples
Generate random data and construct a relative frequency histogram
data <- sample(1:100, 220, replace = TRUE)
rel_hist(data, bins = 20, xlab = "My Data")
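In a relative frequency histogram, each bar's height is the class count divided by the total number of observations, so the heights sum to 1. The following base R sketch illustrates that idea; it is not how rel_hist is implemented, and the seed is added here only to make the illustration reproducible.

```r
set.seed(1)  # assumption: fixed seed for a reproducible illustration
data <- sample(1:100, 220, replace = TRUE)

# Bin the data without plotting; breaks = 20 is only a suggestion to hist()
h <- hist(data, breaks = 20, plot = FALSE)

# Relative frequency = class count / total count
rel_freq <- h$counts / sum(h$counts)
print(sum(rel_freq))

# Draw the bars with no gaps so the plot resembles a histogram
barplot(rel_freq, names.arg = h$mids, space = 0,
        xlab = "My Data", ylab = "Relative Frequency")
```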
The var.p function calculates the population variance of a data set.
Usage
The format is var.p(data) where data is a data set.
Examples
Calculate the population variance of a data set.
data <- c(37, 292, 175, 86, 331, 249, 104, 58, 368, 213)
var.p(data)
## [1] 12523.21
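The population variance divides the sum of squared deviations by n, whereas base R's var() divides by n - 1. The sketch below shows the relationship using the data from the example; var_p_sketch is a hypothetical name, and the code assumes var.p uses this standard definition, which agrees with the output above.

```r
# Hypothetical helper: population variance (divide by n, not n - 1)
var_p_sketch <- function(x) mean((x - mean(x))^2)

data <- c(37, 292, 175, 86, 331, 249, 104, 58, 368, 213)
n <- length(data)

pop_var <- var_p_sketch(data)
pop_var_from_sample <- var(data) * (n - 1) / n  # rescale the sample variance
pop_sd <- sqrt(pop_var)                         # population standard deviation

print(c(pop_var, pop_var_from_sample, pop_sd))
```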
The sd.p function calculates the population standard deviation of a data set.
Usage
The format is sd.p(data) where data is a data set.
Examples
Calculate the population standard deviation of a data set.
data <- c(-3.2, 7.1, -12.5, 4.8, -0.6, 11.9, -9.4, 2.3, 6.7)
sd.p(data)
## [1] 7.563427
Computation of a z-score of a data value from a population
data <- c(3.1, 5.2, 6.3, 4.4, 2.8, 7.9, 1, 8.6, 0.7, 9.5, 4.1, 4)
x_val <- 2.8
# z-score
(x_val - mean(data))/sd.p(data)
## [1] -0.7386329
The Z_Interval function calculates the confidence interval for a population mean when the population standard deviation is known.
Usage
The format is Z_Interval(xbar, n, sigma, alpha) where xbar is the sample mean, n is the sample size, sigma is the population standard deviation, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.
For example, alpha = 0.02 represents a 98% confidence level. If alpha is omitted, a default value of 0.05 is used, representing a 95% confidence level for the interval.
Return
The function will print the confidence level, margin of error, critical value, and the confidence interval (lower and upper bounds).
Variables critval, Lbound, and Ubound are available as return values after execution of this function.
If invalid input values are given, an error message is returned.
Examples
Construct a 90% confidence interval of a sample with size \(n = 36\) and mean \(\bar{x} = 39.8\) from a population with known standard deviation \(\sigma =6.4\).
Z_Interval(xbar = 39.8, n = 36, sigma = 6.4, alpha = 0.1)
## Confidence Level: 90 %
## Margin of Error: 1.75451
## Critical Value: 1.64485
## Lower Bound: 38.04549
## Upper Bound: 41.55451
Construct a 98% confidence interval of a sample that comes from a population with standard deviation \(\sigma = 12.8\).
# Define sample data
sample_data <- c(17.4, 28.9, 52.9, 29.3, 21.1, 16.5, 16.9, 16.2, 15.7, 14.2,
26.7, 38.6, 23.6, 24.3, 22.9, 25.9, 24.4, 28.3, 4.9, 27.5)
# Construct confidence interval
Z_Interval(xbar = mean(sample_data), n = length(sample_data), sigma = 12.8,
alpha = 0.02)
## Confidence Level: 98 %
## Margin of Error: 6.6584
## Critical Value: 2.32635
## Lower Bound: 17.1516
## Upper Bound: 30.4684
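The interval is the sample mean plus or minus a margin of error equal to the critical value times sigma/sqrt(n). A minimal base R sketch of that calculation follows; z_interval_sketch is a hypothetical helper, not the package function, and it is checked here against the first example.

```r
# Hypothetical helper: z-interval for a mean with known sigma
z_interval_sketch <- function(xbar, n, sigma, alpha = 0.05) {
  critval <- qnorm(1 - alpha / 2)      # two-sided critical value
  moe <- critval * sigma / sqrt(n)     # margin of error
  c(critval = critval, moe = moe, Lbound = xbar - moe, Ubound = xbar + moe)
}

res <- z_interval_sketch(xbar = 39.8, n = 36, sigma = 6.4, alpha = 0.1)
print(round(res, 5))
```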
The T_Interval function calculates the confidence interval for a population mean when the population standard deviation is not known, and the sample standard deviation is used instead. The confidence interval is based on the Student’s t-distribution.
Usage
The format is T_Interval(xbar, n, s, alpha) where xbar is the sample mean, n is the sample size, s is the sample standard deviation, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.
For example, alpha = 0.02 represents a 98% confidence level. If alpha is omitted, a default value of 0.05 is used, representing a 95% confidence level for the interval.
Return
The function will print the confidence level, margin of error, number of degrees of freedom, critical value, and the confidence interval (lower and upper bounds).
Variables critval, Lbound, and Ubound are available as return values after execution of this function.
If invalid input values are given, an error message is returned.
Examples
Construct a 95% confidence interval of a sample with size \(n = 129\), mean \(\bar{x} = 67.6\), and sample standard deviation \(s = 18.2\).
T_Interval(xbar = 67.6, n = 129, s = 18.2, alpha = 0.05)
## Confidence Level: 95 %
## Margin of Error: 3.17066
## Number of Degrees of Freedom: 128
## Critical Value: 1.97867
## Lower Bound: 64.42934
## Upper Bound: 70.77066
Construct a 90% confidence interval for the population mean from sample data.
# Define sample data
exam_grades <- c(64, 70, 67, 60, 74, 63, 73, 72, 66, 71, 60, 65, 65, 76,
56, 70, 68, 67, 67, 62, 63, 60, 79, 69, 75)
# Construct confidence interval
T_Interval(xbar = mean(exam_grades), n = length(exam_grades), s = sd(exam_grades),
alpha = 0.1)
## Confidence Level: 90 %
## Margin of Error: 1.9495
## Number of Degrees of Freedom: 24
## Critical Value: 1.71088
## Lower Bound: 65.3305
## Upper Bound: 69.2295
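The t-interval replaces the normal critical value with one from the t-distribution with n - 1 degrees of freedom. The sketch below shows the calculation in base R under that assumption; t_interval_sketch is a hypothetical name, and the result is compared with the first example above.

```r
# Hypothetical helper: t-interval for a mean with unknown sigma
t_interval_sketch <- function(xbar, n, s, alpha = 0.05) {
  df <- n - 1                               # degrees of freedom
  critval <- qt(1 - alpha / 2, df = df)     # two-sided t critical value
  moe <- critval * s / sqrt(n)              # margin of error
  c(df = df, critval = critval, moe = moe,
    Lbound = xbar - moe, Ubound = xbar + moe)
}

res <- t_interval_sketch(xbar = 67.6, n = 129, s = 18.2, alpha = 0.05)
print(round(res, 5))
```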
The One_Prop_Int function calculates the confidence interval for a population proportion. The confidence interval is based on the normal distribution.
Usage
The format is One_Prop_Int(x, n, alpha) where x is the number of individuals of interest in the sample, n is the sample size, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.
For example, alpha = 0.02 represents a 98% confidence level. If alpha is omitted, a default value of 0.05 is used, representing a 95% confidence level for the interval.
Return
The function will print the confidence level, margin of error, critical value, sample proportion, and the confidence interval (lower and upper bounds).
Variables sprop, critval, Lbound, and Ubound are available as return values after execution of this function.
If invalid input values are given, an error message is returned.
Examples
Construct a 92% confidence interval when \(x = 44\) and \(n = 192\).
One_Prop_Int(x = 44, n = 192, alpha = 0.08)
## Confidence Level: 92 %
## Margin of Error: 0.0531
## Critical Value: 1.75069
## Sample Proportion: 0.22917
## Lower Bound: 0.17606
## Upper Bound: 0.28227
Construct a 90% confidence interval for the population proportion of “Yes” responses, given a sample.
# Define sample data
sample_data <- c("No", "No", "No", "Yes", "Yes", "Yes", "No", "Yes", "No",
"No", "No", "No", "No", "Yes", "No", "No", "No", "Yes", "No", "Yes",
"No", "Yes", "Yes", "No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes",
"No", "No", "No", "Yes", "No", "No", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "No", "No", "Yes", "No", "Yes", "Yes", "Yes", "No", "No",
"Yes", "No", "No", "No", "Yes", "No", "Yes")
# Summarize data into a table
table(sample_data)
## sample_data
## No Yes
## 32 28
# Extract yes's, unname, and assign to x
x <- unname(table(sample_data)["Yes"])
# Determine sample size
n <- length(sample_data)
# Explicitly state x (number of individuals of interest) and n (sample
# size)
x
## [1] 28
n
## [1] 60
# Construct 90% confidence interval
One_Prop_Int(x, n, alpha = 0.1)
## Confidence Level: 90 %
## Margin of Error: 0.10594
## Critical Value: 1.64485
## Sample Proportion: 0.46667
## Lower Bound: 0.36073
## Upper Bound: 0.57261
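The printed values are consistent with the standard normal-based interval, where the margin of error is the critical value times sqrt(p̂(1 - p̂)/n). The following sketch assumes that formula; one_prop_int_sketch is a hypothetical helper and is checked against the first example.

```r
# Hypothetical helper: normal-based interval for one proportion
one_prop_int_sketch <- function(x, n, alpha = 0.05) {
  sprop <- x / n                                    # sample proportion
  critval <- qnorm(1 - alpha / 2)                   # critical value
  moe <- critval * sqrt(sprop * (1 - sprop) / n)    # margin of error
  c(sprop = sprop, critval = critval, moe = moe,
    Lbound = sprop - moe, Ubound = sprop + moe)
}

res <- one_prop_int_sketch(x = 44, n = 192, alpha = 0.08)
print(round(res, 5))
```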
The Z_Test function performs a hypothesis test about a population mean when the population standard deviation is known. P-values are calculated based on the normal distribution.
Usage
The format is Z_Test(xbar, n, sigma, mu, alt) where xbar is the sample mean, n is the sample size, sigma is the known population standard deviation, mu is the hypothesized mean in the null hypothesis, and alt is the form of the alternate hypothesis.
alt is a character string with choices “left”, “right”, or “two”. The choices “less”, “greater”, or “two.sided” are also accepted.
Return
The function will print the test statistic z and the p-value.
Variables z and pvalue are available as return values after execution of this function.
If invalid input values are given, an error message is returned.
Examples
Perform a hypothesis test of \(H_0: \mu = 50\) versus \(H_1: \mu < 50\) where \(\bar{x} = 47.3\), \(n = 44\), and \(\sigma = 10.2\).
Z_Test(xbar = 47.3, n = 44, sigma = 10.2, mu = 50, alt = "left")
## Test Statistic: z = -1.75586
## P-Value: 0.03956
Perform a hypothesis test of \(H_0: \mu = 67.9\) versus \(H_1: \mu \ne 67.9\) given a sample data set from a population with known standard deviation \(\sigma = 14\).
# Define sample data
my_data <- c(73, 92, 55, 85, 59, 87, 77, 52, 67, 88, 89, 54, 75, 68, 80,
90, 66, 78, 55, 86, 82, 63, 70, 84, 58, 53, 60, 91, 57, 71, 81, 69)
# Perform hypothesis test
Z_Test(xbar = mean(my_data), n = length(my_data), sigma = 14, mu = 67.9,
alt = "two")
## Test Statistic: z = 1.79555
## P-Value: 0.07257
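The test statistic is z = (x̄ - μ)/(σ/√n), and the p-value is the normal tail area in the direction of the alternate hypothesis. The sketch below is a base R illustration of that logic, not the package code; z_test_sketch is a hypothetical name and only the short alt labels are handled.

```r
# Hypothetical helper: one-sample z-test with known sigma
z_test_sketch <- function(xbar, n, sigma, mu, alt = c("two", "left", "right")) {
  alt <- match.arg(alt)
  z <- (xbar - mu) / (sigma / sqrt(n))
  pvalue <- switch(alt,
    left  = pnorm(z),                       # area to the left of z
    right = pnorm(z, lower.tail = FALSE),   # area to the right of z
    two   = 2 * pnorm(-abs(z))              # both tails
  )
  c(z = z, pvalue = pvalue)
}

res <- z_test_sketch(xbar = 47.3, n = 44, sigma = 10.2, mu = 50, alt = "left")
print(round(res, 5))
```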
The T_Test function performs a hypothesis test about a population mean when the population standard deviation is not known, and the sample standard deviation is used instead. P-values are calculated based on the Student’s t-distribution.
Usage
The format is T_Test(xbar, n, s, mu, alt) where xbar is the sample mean, n is the sample size, s is the sample standard deviation, mu is the hypothesized mean in the null hypothesis, and alt is the form of the alternate hypothesis.
alt is a character string with choices “left”, “right”, or “two”. The choices “less”, “greater”, or “two.sided” are also accepted.
Return
The function will print the number of degrees of freedom, the test statistic t, and the p-value.
Variables t and pvalue are available as return values after execution of this function.
If invalid input values are given, an error message is returned.
Examples
Perform a hypothesis test of \(H_0: \mu = 192\) versus \(H_1: \mu > 192\) where \(\bar{x} = 199.7\), \(n = 60\), and \(s = 41.8\).
T_Test(xbar = 199.7, n = 60, s = 41.8, mu = 192, alt = "right")
## Number of Degrees of Freedom: 59
## Test Statistic: t = 1.42689
## P-Value: 0.07944
Perform a hypothesis test of \(H_0: \mu = 71\) versus \(H_1: \mu \ne 71\) given a sample data set (assume population is normally distributed).
# Define sample data
my_data <- c(66, 71, 100, 76, 77, 102, 82, 55, 64, 68, 95, 81, 81, 77, 66,
104, 83, 44, 86, 67, 58, 72, 59, 63, 65, 48, 88, 77, 57, 95, 82, 70,
89, 89, 88, 86, 84, 74, 70, 69)
# Perform hypothesis test
T_Test(xbar = mean(my_data), n = length(my_data), s = sd(my_data), mu = 71,
alt = "two")
## Number of Degrees of Freedom: 39
## Test Statistic: t = 2.07723
## P-Value: 0.04441
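The t-test follows the same pattern as the z-test, with s in place of sigma and tail areas taken from the t-distribution with n - 1 degrees of freedom. A minimal sketch under those assumptions is shown below; t_test_sketch is a hypothetical helper, checked against the first example.

```r
# Hypothetical helper: one-sample t-test from summary statistics
t_test_sketch <- function(xbar, n, s, mu, alt = c("two", "left", "right")) {
  alt <- match.arg(alt)
  df <- n - 1
  t <- (xbar - mu) / (s / sqrt(n))
  pvalue <- switch(alt,
    left  = pt(t, df),
    right = pt(t, df, lower.tail = FALSE),
    two   = 2 * pt(-abs(t), df)
  )
  c(df = df, t = t, pvalue = pvalue)
}

res <- t_test_sketch(xbar = 199.7, n = 60, s = 41.8, mu = 192, alt = "right")
print(round(res, 5))
```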
The One_Prop_Test function performs a one-sample hypothesis test about a population proportion.
Usage
The format is One_Prop_Test(x, n, p0, alt) where x is the number of individuals of interest in the sample, n is the sample size, p0 is the hypothesized proportion in the null hypothesis, and alt is the form of the alternate hypothesis.
alt is a character string with choices “left”, “right”, or “two”. The choices “less”, “greater”, or “two.sided” are also accepted.
Return
The function will print the sample proportion, the test statistic z, and the p-value.
Variables sprop, z, and pvalue are available as return values after execution of this function.
If invalid input values are given, an error message is returned.
Examples
Perform a hypothesis test of \(H_0: p = 0.5\) versus \(H_1: p < 0.5\) where \(x = 480\) and \(n = 1000\).
One_Prop_Test(x = 480, n = 1000, p0 = 0.5, alt = "left")
## Sample Proportion: 0.48
## Test Statistic: z = -1.26491
## P-Value: 0.10295
Perform a hypothesis test of \(H_0: p = 0.45\) versus \(H_1: p > 0.45\) for the population proportion of “B” values, given a sample.
# Define sample data
samp <- c("A", "B", "B", "B", "B", "A", "A", "A", "A", "A", "B", "B", "A",
"B", "B", "B", "A", "A", "B", "B", "A", "A", "A", "B", "A", "B", "B",
"A", "A", "B", "B", "B", "A", "B", "A", "B", "B", "A", "B", "A", "B",
"B", "B", "A", "B", "B", "A", "B", "A", "B", "B", "B", "B", "A", "A",
"A", "B", "B", "B", "B")
# Summarize data into a table
table(samp)
## samp
## A B
## 25 35
# Extract B's, unname, and assign to x
x <- unname(table(samp)["B"])
# Determine sample size
n <- length(samp)
# Explicitly state x (number of individuals of interest) and n (sample
# size)
x
## [1] 35
n
## [1] 60
# Perform the hypothesis test
One_Prop_Test(x, n, p0 = 0.45, alt = "right")
## Sample Proportion: 0.58333
## Test Statistic: z = 2.076
## P-Value: 0.01895
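The outputs above agree with the usual test statistic z = (p̂ - p0)/sqrt(p0(1 - p0)/n), where the standard error uses the hypothesized proportion. The sketch below assumes that formula; one_prop_test_sketch is a hypothetical name and only the short alt labels are handled.

```r
# Hypothetical helper: one-sample z-test for a proportion
one_prop_test_sketch <- function(x, n, p0, alt = c("two", "left", "right")) {
  alt <- match.arg(alt)
  sprop <- x / n
  z <- (sprop - p0) / sqrt(p0 * (1 - p0) / n)  # standard error uses p0
  pvalue <- switch(alt,
    left  = pnorm(z),
    right = pnorm(z, lower.tail = FALSE),
    two   = 2 * pnorm(-abs(z))
  )
  c(sprop = sprop, z = z, pvalue = pvalue)
}

res_left  <- one_prop_test_sketch(x = 480, n = 1000, p0 = 0.5, alt = "left")
res_right <- one_prop_test_sketch(x = 35, n = 60, p0 = 0.45, alt = "right")
print(round(res_left, 5))
print(round(res_right, 5))
```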
The Two_Samp_T_Interval function constructs a confidence interval for the difference between two population means (mu_1 - mu_2) given two independent samples. This confidence interval is based on the Student’s t-distribution.
Usage
The format is Two_Samp_T_Interval(xbar1, s1, n1, xbar2, s2, n2, alpha) where xbar1 is the mean of Sample 1, n1 is the size of Sample 1, s1 is the standard deviation of Sample 1, xbar2 is the mean of Sample 2, n2 is the size of Sample 2, s2 is the standard deviation of Sample 2, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.
Return
The function will print the confidence level, margin of error, number of degrees of freedom, critical value, point estimate, and the confidence interval (lower and upper bounds).
Variables critval, point_est, Lbound, and Ubound are available as return values after execution of this function.
If invalid input values are given, an error message is returned.
Examples
Construct a 99% confidence interval for \(\mu_1 - \mu_2\) where \(\bar{x}_1 = 10\), \(s_1 = 2.5\), \(n_1 = 125\), \(\bar{x}_2 = 8.6\), \(s_2 = 3.4\), \(n_2 = 131\) and the samples are independent.
Two_Samp_T_Interval(xbar1 = 10, s1 = 2.5, n1 = 125, xbar2 = 8.6, s2 = 3.4,
n2 = 131, alpha = 0.01)
## Confidence Level: 99 %
## Margin of Error: 0.96544
## Number of Degrees of Freedom: 238.7094
## Critical Value: 2.59658
## Point Estimate: 1.4
## Lower Bound: 0.43456
## Upper Bound: 2.36544
Construct a 90% confidence interval for \(\mu_1 - \mu_2\) given two independent samples.
# Define Sample 1
exam_1 <- c(58, 67, 61, 62, 78, 47, 59, 72, 89, 68, 37, 81, 64, 64, 62, 70,
80, 67, 48, 68, 81, 84, 62, 76, 52, 92, 72, 62, 79, 61, 56, 61, 67, 76,
61, 65, 77, 62, 72, 71, 57, 55, 58, 44, 62, 83, 73, 55, 60, 74)
# Define Sample 2
exam_2 <- c(76, 47, 31, 35, 69, 42, 79, 60, 56, 74, 25, 69, 79, 37, 48, 58,
30, 54, 43, 57, 48, 62, 49, 44, 57, 61, 61, 56, 41, 31, 78, 59, 35, 31,
47, 47, 53, 69, 28, 76, 59, 51, 57, 38, 70, 64)
# Calculate sample statistics
xbar1 <- mean(exam_1)
n1 <- length(exam_1)
s1 <- sd(exam_1)
xbar2 <- mean(exam_2)
n2 <- length(exam_2)
s2 <- sd(exam_2)
# Construct confidence interval
Two_Samp_T_Interval(xbar1, s1, n1, xbar2, s2, n2, alpha = 0.1)
## Confidence Level: 90 %
## Margin of Error: 4.57168
## Number of Degrees of Freedom: 83.75668
## Critical Value: 1.66325
## Point Estimate: 13.17478
## Lower Bound: 8.6031
## Upper Bound: 17.74646
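The non-integer degrees of freedom in the output suggest the Welch-Satterthwaite approximation, which does not assume equal population variances. The sketch below is written under that assumption and reproduces the first example; two_samp_t_interval_sketch is a hypothetical helper, not the package function.

```r
# Hypothetical helper: Welch two-sample t-interval from summary statistics
two_samp_t_interval_sketch <- function(xbar1, s1, n1, xbar2, s2, n2, alpha = 0.05) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  # Welch-Satterthwaite degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  critval <- qt(1 - alpha / 2, df = df)
  moe <- critval * sqrt(v1 + v2)
  point_est <- xbar1 - xbar2
  c(df = df, critval = critval, moe = moe, point_est = point_est,
    Lbound = point_est - moe, Ubound = point_est + moe)
}

res <- two_samp_t_interval_sketch(xbar1 = 10, s1 = 2.5, n1 = 125,
                                  xbar2 = 8.6, s2 = 3.4, n2 = 131, alpha = 0.01)
print(round(res, 5))
```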
The Two_Samp_T_Test function performs a hypothesis test about the difference between two population means (mu_1 - mu_2) given two independent samples. P-values are calculated based on the Student’s t-distribution.
Usage
The format is Two_Samp_T_Test(xbar1, s1, n1, xbar2, s2, n2, alt, mu) where xbar1 is the mean of Sample 1, n1 is the size of Sample 1, s1 is the standard deviation of Sample 1, xbar2 is the mean of Sample 2, n2 is the size of Sample 2, s2 is the standard deviation of Sample 2, mu is the hypothesized difference in the null hypothesis, and alt is the form of the alternate hypothesis.
alt is a character string with choices “left”, “right”, or “two”. The choices “less”, “greater”, or “two.sided” are also accepted.
Return
The function will print the number of degrees of freedom, the test statistic t, and the p-value.
Variables t and pvalue are available as return values after execution of this function.
If invalid input values are given, an error message is returned.
Examples
Perform a hypothesis test for \(H_0: \mu_1 = \mu_2\) versus \(H_1: \mu_1 \ne \mu_2\) where \(\bar{x}_1 = 34.8\), \(s_1 = 5.2\), \(n_1 = 55\), \(\bar{x}_2 = 32.8\), \(s_2 = 3.8\), \(n_2 = 61\) and the samples are independent.
Two_Samp_T_Test(xbar1 = 34.8, s1 = 5.2, n1 = 55, xbar2 = 32.8, s2 = 3.8,
n2 = 61, alt = "two", mu = 0)
## Number of Degrees of Freedom: 98.06019
## Test Statistic: t = 2.34346
## P-Value: 0.02112
Perform a hypothesis test for \(H_0: \mu_1 = \mu_2\) versus \(H_1: \mu_1 < \mu_2\) given two independent samples.
# Define Sample 1
old_process <- c(1075, 1012, 980, 1035, 1001, 978, 965, 1030, 990, 1023,
1015, 998, 1075, 1008, 1085, 976, 1105, 1002, 987, 1021, 992, 1010, 1003,
1006, 1090, 988, 1001, 965, 1007, 973, 998, 982, 965, 1015, 987, 1000,
1089, 1109, 995, 1104, 1075, 1018, 1025, 981, 1077, 995)
# Define Sample 2
new_process <- c(1048, 966, 1029, 1098, 1064, 1035, 1042, 1056, 932, 1085,
1092, 943, 1060, 1045, 958, 1072, 1068, 1043, 1050, 1066, 1037, 1082,
1075, 1054, 929, 1089, 919, 923, 1069, 1080, 1046, 1078, 947, 1057, 1067,
925)
# Calculate sample statistics
xbar1 <- mean(old_process)
n1 <- length(old_process)
s1 <- sd(old_process)
xbar2 <- mean(new_process)
n2 <- length(new_process)
s2 <- sd(new_process)
# Perform hypothesis test
Two_Samp_T_Test(xbar1, s1, n1, xbar2, s2, n2, alt = "left", mu = 0)
## Number of Degrees of Freedom: 61.4347
## Test Statistic: t = -1.20701
## P-Value: 0.11603
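The test statistic is the difference in sample means, minus the hypothesized difference, divided by sqrt(s1²/n1 + s2²/n2). The sketch below assumes Welch-Satterthwaite degrees of freedom, which matches the non-integer values printed above; two_samp_t_test_sketch is a hypothetical name.

```r
# Hypothetical helper: Welch two-sample t-test from summary statistics
two_samp_t_test_sketch <- function(xbar1, s1, n1, xbar2, s2, n2,
                                   alt = c("two", "left", "right"), mu = 0) {
  alt <- match.arg(alt)
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  t <- ((xbar1 - xbar2) - mu) / sqrt(v1 + v2)
  pvalue <- switch(alt,
    left  = pt(t, df),
    right = pt(t, df, lower.tail = FALSE),
    two   = 2 * pt(-abs(t), df)
  )
  c(df = df, t = t, pvalue = pvalue)
}

res <- two_samp_t_test_sketch(xbar1 = 34.8, s1 = 5.2, n1 = 55,
                              xbar2 = 32.8, s2 = 3.8, n2 = 61,
                              alt = "two", mu = 0)
print(round(res, 5))
```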
The Two_Prop_Int function calculates the confidence interval for the difference between two population proportions (p_1 - p_2).
Usage
The format is Two_Prop_Int(x1, n1, x2, n2, alpha) where x1 is the number of individuals of interest in Sample 1, n1 is the size of Sample 1, x2 is the number of individuals of interest in Sample 2, n2 is the size of Sample 2, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.
Return
The function will print the confidence level, margin of error, critical value, point estimate, and the confidence interval (lower and upper bounds).
Variables sprop1, sprop2, point_est, critval, Lbound, and Ubound are available as return values after execution of this function.
If invalid input values are given, an error message is returned.
Examples
Construct a 95% confidence interval for \(p_1 - p_2\) where \(x_1 = 74\), \(n_1 = 285\), \(x_2 = 168\), and \(n_2 = 400\).
Two_Prop_Int(x1 = 74, n1 = 285, x2 = 168, n2 = 400, alpha = 0.05)
## Confidence Level: 95 %
## Margin of Error: 0.07022
## Critical Value: 1.95996
## Point Estimate: -0.16035
## Lower Bound: -0.23057
## Upper Bound: -0.09013
Construct a 90% confidence interval for \(p_1 - p_2\), the difference in the proportions of “Yes” in Sample 1 and Sample 2.
# Define Sample 1
samp1 <- c("No", "Yes", "Yes", "No", "Yes", "Yes", "Yes", "No", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "No", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "No", "Yes", "Yes", "Yes",
"Yes", "Yes", "No", "Yes", "Yes", "Yes")
# Define Sample 2
samp2 <- c("No", "No", "Yes", "No", "No", "Yes", "No", "Yes", "Yes", "Yes",
"No", "No", "No", "No", "Yes", "Yes", "Yes", "No", "Yes", "No", "No",
"No", "Yes", "Yes", "Yes", "No", "No", "No", "Yes", "No", "No", "No",
"No", "No", "No", "Yes", "No", "No", "Yes", "Yes")
# Summarize Sample 1 data into a table
table(samp1)
## samp1
## No Yes
## 6 30
# Summarize Sample 2 data into a table
table(samp2)
## samp2
## No Yes
## 24 16
# Extract Yes's, unname, and assign to x1 in Sample 1
x1 <- unname(table(samp1)["Yes"])
# Extract Yes's, unname, and assign to x2 in Sample 2
x2 <- unname(table(samp2)["Yes"])
# Determine sample size for Sample 1
n1 <- length(samp1)
# Determine sample size for Sample 2
n2 <- length(samp2)
# Explicitly state x1 (individuals of interest in Sample 1), n1 (Sample
# 1 size), x2 (individuals of interest in Sample 2), n2 (Sample 2
# size)
x1
## [1] 30
n1
## [1] 36
x2
## [1] 16
n2
## [1] 40
# Construct confidence interval
Two_Prop_Int(x1, n1, x2, n2, alpha = 0.1)
## Confidence Level: 90 %
## Margin of Error: 0.16331
## Critical Value: 1.64485
## Point Estimate: 0.43333
## Lower Bound: 0.27002
## Upper Bound: 0.59665
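The interval for p_1 - p_2 uses the margin of error z·sqrt(p̂1(1 - p̂1)/n1 + p̂2(1 - p̂2)/n2). The sketch below assumes this standard unpooled formula, which reproduces the summary-statistics example (x1 = 74, n1 = 285, x2 = 168, n2 = 400); two_prop_int_sketch is a hypothetical helper.

```r
# Hypothetical helper: normal-based interval for a difference in proportions
two_prop_int_sketch <- function(x1, n1, x2, n2, alpha = 0.05) {
  sprop1 <- x1 / n1
  sprop2 <- x2 / n2
  point_est <- sprop1 - sprop2
  critval <- qnorm(1 - alpha / 2)
  # Unpooled standard error of the difference
  se <- sqrt(sprop1 * (1 - sprop1) / n1 + sprop2 * (1 - sprop2) / n2)
  moe <- critval * se
  c(point_est = point_est, critval = critval, moe = moe,
    Lbound = point_est - moe, Ubound = point_est + moe)
}

res <- two_prop_int_sketch(x1 = 74, n1 = 285, x2 = 168, n2 = 400, alpha = 0.05)
print(round(res, 5))
```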
The Sign_Test function performs a one-sample Sign Test for a population median when the population distribution is not necessarily normal. Critical values are included based on specified alpha levels and alternate hypotheses.
Usage
The format is Sign_Test(sample, m0, alpha, alt) where sample is a numeric vector of the sample data, m0 is the hypothesized median, alpha is the significance level, and alt is the form of the alternate hypothesis.
Choices for alpha in one-tailed tests include 0.005, 0.01, 0.025, and 0.05. Choices for two-tailed tests include 0.01, 0.02, 0.05, and 0.10.
alt is a character string with choices “left”, “right”, or “two”. The choices “less”, “greater”, or “two.sided” are also accepted.
Return
The function will print detailed output including the null and alternate hypotheses, significance level, test statistic, critical value, and test result.
Examples
Perform a hypothesis test for \(H_0: m = 3\) versus \(H_1: m < 3\) at the \(\alpha = 0.05\) level of significance given a sample of data.
# Define sample data
data <- c(2.93, 2.95, 2.76, 2.89, 2.57, 3.06, 2.61, 2.66, 2.98, 2.79, 2.96,
2.74)
# Perform the test
Sign_Test(data, m0 = 3, alpha = 0.05, alt = "less")
## Null Hypothesis: H0: m = 3
## Alternate Hypothesis: H1: m < 3
## Significance Level: 0.05
## Test Statistic: 1
## Critical Value: 2
## Result: Reject the Null Hypothesis
Perform a hypothesis test for \(H_0: m = 170\) versus \(H_1: m < 170\) at the \(\alpha = 0.01\) level of significance given a sample of data.
# Define sample data
data <- c(149, 144, 218, 153, 134, 152, 148, 144, 178, 107, 199, 135, 171,
110, 160, 119, 86, 127, 106, 153, 169, 153, 153, 173, 156, 145, 205,
132, 169, 174, 130, 175)
# Perform the test
Sign_Test(data, m0 = 170, alpha = 0.01, alt = "less")
## Null Hypothesis: H0: m = 170
## Alternate Hypothesis: H1: m < 170
## Significance Level: 0.01
## Test Statistic: -2.65165
## Critical Value: -2.32635
## Result: Reject the Null Hypothesis
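The two examples report different kinds of test statistic: a count for the small sample (n = 12) and a z-score for the larger one (n = 32). The sketch below shows one way to reproduce both for a left-tailed test. It assumes the count of values above m0 is used when n <= 25 and the continuity-corrected normal approximation z = (x + 0.5 - n/2)/(sqrt(n)/2) otherwise; that cutoff and the handling of values equal to m0 are assumptions, and sign_stat_left_sketch is a hypothetical name.

```r
# Hypothetical helper: sign test statistic for H1: m < m0
sign_stat_left_sketch <- function(sample, m0) {
  d <- sample[sample != m0] - m0   # assumed: values equal to m0 are dropped
  n <- length(d)
  x <- sum(d > 0)                  # number of plus signs
  if (n <= 25) {
    x                              # small sample: the count itself
  } else {
    (x + 0.5 - n / 2) / (sqrt(n) / 2)  # large sample: normal approximation
  }
}

small <- c(2.93, 2.95, 2.76, 2.89, 2.57, 3.06, 2.61, 2.66, 2.98, 2.79, 2.96,
           2.74)
large <- c(149, 144, 218, 153, 134, 152, 148, 144, 178, 107, 199, 135, 171,
           110, 160, 119, 86, 127, 106, 153, 169, 153, 153, 173, 156, 145, 205,
           132, 169, 174, 130, 175)

stat_small <- sign_stat_left_sketch(small, m0 = 3)
stat_large <- sign_stat_left_sketch(large, m0 = 170)
print(c(stat_small, round(stat_large, 5)))
```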
The Rank_Sum_Test function performs a nonparametric test for comparing the medians of two populations. The test requires that the populations have approximately the same shape, so the test is sometimes described as a test to determine whether two populations differ.
Usage
The format is Rank_Sum_Test(sample1, sample2, alpha, alt) where sample1 and sample2 are numeric vectors of the sample data, alpha is the significance level, and alt is the form of the alternate hypothesis.
alt is a character string with choices “left”, “right”, or “two”. The choices “less”, “greater”, or “two.sided” are also accepted.
Return
The function will print detailed output including the null and alternate hypotheses, significance level, test statistic, p-value, and test result.
Examples
Decide whether there is a difference between the median scores for two samples at the \(\alpha = 0.05\) level of significance.
# Define sample data
sample_1 <- c(78, 82, 83, 87, 75, 63, 78, 60, 94, 62, 98, 90, 97, 81)
sample_2 <- c(73, 72, 92, 100, 74, 90, 64, 84, 77, 89, 70, 64)
# Perform the test
Rank_Sum_Test(sample_1, sample_2, alpha = 0.05, alt = "two")
## Null Hypothesis: m1 = m2
## Alternate Hypothesis: m1 is not equal to m2
## Significance Level: 0.05
## Test Statistic: -0.38576
## P-Value: 0.69968
## Result: Do Not Reject the Null Hypothesis
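The test statistic above can be reproduced with the normal approximation to the rank sum: rank the combined data, sum the ranks for one sample, and standardize using the mean n(n1 + n2 + 1)/2 and standard deviation sqrt(n1·n2·(n1 + n2 + 1)/12). This is a sketch under assumptions, not the package code. In particular, the sign of z depends on which sample's ranks are summed; summing the ranks of the second sample gives the negative value printed above, and the two-tailed p-value is the same either way.

```r
sample_1 <- c(78, 82, 83, 87, 75, 63, 78, 60, 94, 62, 98, 90, 97, 81)
sample_2 <- c(73, 72, 92, 100, 74, 90, 64, 84, 77, 89, 70, 64)
n1 <- length(sample_1)
n2 <- length(sample_2)

# Rank the combined data; tied values receive the average rank
ranks <- rank(c(sample_1, sample_2))

# Rank sum of the second sample (assumed choice, see note above)
S <- sum(ranks[(n1 + 1):(n1 + n2)])

# Mean and standard deviation of S under the null hypothesis
mu_S <- n2 * (n1 + n2 + 1) / 2
sigma_S <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

z <- (S - mu_S) / sigma_S
pvalue <- 2 * pnorm(-abs(z))   # two-tailed
print(round(c(S = S, z = z, pvalue = pvalue), 5))
```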
The Signed_Rank_Test function performs the nonparametric Signed Rank Test for testing whether there is a difference between the medians of two populations, when the data are in the form of paired samples.
Usage
The format is Signed_Rank_Test(sample1, sample2, alpha) where sample1 and sample2 are numeric vectors of the sample data, and alpha is the significance level. Choices for alpha include 0.01, 0.02, 0.05, 0.10.
Return
The function will print detailed output including the null and alternate hypotheses, significance level, test statistic, critical value, and test result.
Examples
Decide whether there is a difference between the median scores for two paired samples at the \(\alpha = 0.05\) level of significance.
# Define sample data
sample_1 <- c(283, 299, 274, 284, 248, 275, 293, 277)
sample_2 <- c(290, 281, 262, 287, 253, 287, 267, 271)
# Perform the test
Signed_Rank_Test(sample_1, sample_2, alpha = 0.05)
## Null Hypothesis: md = 0
## Alternate Hypothesis: md is not equal to 0
## Significance Level: 0.05
## Test Statistic: 12.5
## Critical Value: 4
## Result: Do Not Reject the Null Hypothesis
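The test statistic of 12.5 can be traced by hand: take the paired differences, rank their absolute values, and sum the ranks separately for positive and negative differences. The sketch below assumes the statistic for this two-tailed test is the smaller of the two sums and that zero differences are dropped; both assumptions are consistent with the output above but are not taken from the package source.

```r
sample_1 <- c(283, 299, 274, 284, 248, 275, 293, 277)
sample_2 <- c(290, 281, 262, 287, 253, 287, 267, 271)

# Paired differences; zero differences (none here) are dropped by assumption
d <- sample_1 - sample_2
d <- d[d != 0]

# Rank the absolute differences; ties receive the average rank
ranks <- rank(abs(d))

S_plus <- sum(ranks[d > 0])    # sum of ranks for positive differences
S_minus <- sum(ranks[d < 0])   # sum of ranks for negative differences
test_stat <- min(S_plus, S_minus)

print(c(S_plus = S_plus, S_minus = S_minus, test_stat = test_stat))
```

Since 12.5 is greater than the critical value of 4, the null hypothesis is not rejected, in agreement with the printed result.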