Introduction to T-Tests

A t-test is a parametric statistical test used to compare means when the population standard deviation is unknown. T-tests are based on Student’s t-distribution and are among the most commonly used statistical tests in practice. They’re appropriate when:

  • Population standard deviation (σ) is unknown and must be estimated from the sample (s)
  • You’re comparing one or two sample means
  • Data are approximately normally distributed, or the sample size is large
  • Sample sizes are small to moderate (roughly n < 30), although t-tests remain valid for large samples

T-tests are fundamental to statistical inference and are widely applied in business, medicine, education, and research.


T-Distribution vs Normal Distribution

Key Differences

| Aspect | Normal Distribution | T-Distribution |
| --- | --- | --- |
| Shape | Fixed, symmetric bell curve | Bell curve, more spread out |
| Tails | Lighter (thinner) | Heavier (fatter) |
| Peak | Higher at center | Lower at center |
| Degrees of freedom | N/A | df = n - 1 |
| As n increases | N/A | Approaches normal distribution |
| Use when | σ known | σ unknown |

Why Use T-Distribution?

When σ is unknown and estimated from sample data (s), using the normal distribution underestimates variability. The t-distribution has heavier tails to account for this uncertainty. As sample size increases, the t-distribution approaches the normal distribution.

Key Point: For df > 30, the t-distribution is very similar to the normal distribution.


Section 1: One-Sample T-Test

Purpose and Hypothesis

The one-sample t-test tests whether a sample mean differs significantly from a hypothesized population mean.

Hypotheses:

  • H₀: μ = μ₀ (Null: population mean equals hypothesized value)
  • H₁: μ ≠ μ₀ (Two-tailed alternative: means differ)
  • H₁: μ > μ₀ (Right-tailed alternative)
  • H₁: μ < μ₀ (Left-tailed alternative)

Assumptions

  1. Random sample: Data are randomly selected
  2. Normality: Population is normally distributed OR sample is large (n ≥ 30)
  3. Independence: Observations are independent
  4. Unknown σ: Population standard deviation is unknown

Formula

$$t = \frac{\overline{x} - \mu_0}{s/\sqrt{n}}$$

Where:

  • $\overline{x}$ = Sample mean
  • μ₀ = Hypothesized population mean
  • s = Sample standard deviation
  • n = Sample size
  • df = n - 1 (degrees of freedom)
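The formula above translates directly into a few lines of code. This is a minimal sketch (the helper name `one_sample_t` and the illustration numbers are made up for this example):

```python
import math

def one_sample_t(x_bar, mu0, s, n):
    """Return the one-sample t-statistic and its degrees of freedom."""
    se = s / math.sqrt(n)       # standard error of the mean
    t = (x_bar - mu0) / se      # how many SEs the sample mean lies from mu0
    return t, n - 1             # df = n - 1

# Illustration with made-up numbers: x̄ = 52, μ₀ = 50, s = 5, n = 25
t, df = one_sample_t(52, 50, 5, 25)   # t = 2.0, df = 24
```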

Step-by-Step Procedure

Step 1: State Hypotheses

  • Define H₀ and H₁
  • Specify significance level (α = 0.05)

Step 2: Check Assumptions

  • Random sample?
  • Normal or large sample?
  • Independence verified?

Step 3: Calculate Test Statistic

  • Compute: mean, standard deviation, standard error
  • Calculate t-statistic

Step 4: Find P-Value

  • Use t-distribution table with df = n - 1
  • Two-tailed: P-value = 2 × P(t > |t_calc|)
  • One-tailed: P-value = P(t > t_calc)

Step 5: Decision and Conclusion

  • If P-value ≤ α: Reject H₀
  • If P-value > α: Fail to reject H₀

Example: One-Sample T-Test

Problem: A coffee shop claims their cappuccinos average 250mL. A customer measures 12 cappuccinos with mean 245mL and sample SD = 8mL. Test at α = 0.05 whether the true mean differs from 250mL.

Solution:

Given:

  • n = 12, $\overline{x}$ = 245mL, μ₀ = 250mL, s = 8mL, α = 0.05
  • df = 12 - 1 = 11

Standard error: $$SE = \frac{s}{\sqrt{n}} = \frac{8}{\sqrt{12}} = \frac{8}{3.464} = 2.309$$

T-statistic: $$t = \frac{245 - 250}{2.309} = \frac{-5}{2.309} = -2.166$$

Critical values (two-tailed, df = 11, α = 0.05): t = ±2.201

Since |t| = 2.166 < 2.201, fail to reject H₀

Conclusion: There is insufficient evidence that the mean differs from 250mL (p ≈ 0.053). The difference could be due to random variation.

Confidence Interval for One-Sample T-Test

$$\text{CI} = \overline{x} \pm t_{\alpha/2} \times \frac{s}{\sqrt{n}}$$

For the cappuccino example: $$\text{95% CI} = 245 \pm 2.201 \times 2.309 = 245 \pm 5.08 = (239.92, 250.08)$$

Since 250 is within the confidence interval, we fail to reject H₀.
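The hand calculation for the cappuccino example can be checked in a few lines of Python. The critical value 2.201 is taken from a t-table (df = 11, two-tailed, α = 0.05), as in the text:

```python
import math

# Cappuccino example: n = 12, x̄ = 245, μ₀ = 250, s = 8
n, x_bar, mu0, s = 12, 245.0, 250.0, 8.0

se = s / math.sqrt(n)                   # ≈ 2.309
t = (x_bar - mu0) / se                  # ≈ -2.166
t_crit = 2.201                          # t-table value for df = 11, α = 0.05, two-tailed
ci = (x_bar - t_crit * se,
      x_bar + t_crit * se)              # ≈ (239.92, 250.08)
reject = abs(t) > t_crit                # False → fail to reject H₀
```

Both routes agree: |t| falls just inside the critical value, and 250 falls just inside the interval.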


Section 2: Paired T-Test (Dependent Samples)

Purpose and Hypothesis

The paired t-test (also called dependent samples t-test) compares means from the same subjects measured twice or from matched pairs.

Common Scenarios:

  • Before-after measurements on same subjects
  • Twins or matched pairs
  • Repeated measures
  • Pre-test and post-test on same group

Hypotheses:

  • H₀: μ_d = 0 (Mean difference is zero)
  • H₁: μ_d ≠ 0 (Mean difference is non-zero)

Assumptions

  1. Paired data: Observations are paired/matched
  2. Normal differences: Differences are normally distributed
  3. Independence: Pairs are independent of each other
  4. Unknown σ: Population SD of differences is unknown

Formula

$$t = \frac{\overline{d} - 0}{s_d/\sqrt{n}}$$

Where:

  • $\overline{d}$ = Mean of differences (d = x₁ - x₂)
  • s_d = Standard deviation of differences
  • n = Number of pairs
  • df = n - 1

Step-by-Step Procedure

Step 1: Calculate Differences

  • For each pair compute the difference d, keeping one consistent direction (e.g., After - Before, or Treatment - Control)
  • List all differences

Step 2: Summarize Differences

  • Mean: $\overline{d}$
  • Standard deviation: s_d
  • Sample size: n

Step 3: Calculate Test Statistic

  • Apply paired t-test formula

Step 4: Find P-Value

  • Use t-distribution with df = n - 1

Step 5: Decision

  • Compare p-value to α

Example: Paired T-Test

Problem: A fitness trainer measures weight (kg) for 10 clients before and after a 12-week program:

| Client | Before | After | Difference (After - Before) |
| --- | --- | --- | --- |
| 1 | 85 | 82 | -3 |
| 2 | 92 | 88 | -4 |
| 3 | 78 | 75 | -3 |
| 4 | 88 | 84 | -4 |
| 5 | 95 | 91 | -4 |
| 6 | 80 | 79 | -1 |
| 7 | 90 | 86 | -4 |
| 8 | 83 | 81 | -2 |
| 9 | 87 | 84 | -3 |
| 10 | 91 | 88 | -3 |

Test at α = 0.05 whether the program significantly reduces weight.

Solution:

From data:

  • n = 10, $\overline{d}$ = -3.1 kg
  • s_d = 0.994 kg (computed from the ten differences)
  • df = 10 - 1 = 9

Standard error: $$SE_d = \frac{0.994}{\sqrt{10}} = \frac{0.994}{3.162} = 0.314$$

T-statistic: $$t = \frac{-3.1 - 0}{0.314} = \frac{-3.1}{0.314} = -9.86$$

Critical value (left-tailed, df = 9, α = 0.05): t = -1.833

Since t = -9.86 < -1.833, reject H₀

Conclusion: The program significantly reduces weight (p < 0.001). Mean weight loss is 3.1 kg.

Confidence Interval for Paired T-Test

$$\text{CI} = \overline{d} \pm t_{\alpha/2} \times \frac{s_d}{\sqrt{n}}$$

For the fitness example: $$\text{95% CI} = -3.1 \pm 2.262 \times 0.314 = -3.1 \pm 0.711 = (-3.811, -2.389)$$

We’re 95% confident the true mean weight loss is between 2.389 and 3.811 kg.
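Recomputing the summary statistics directly from the ten listed differences is a useful cross-check (a quick Python sketch; note `statistics.stdev` uses the n - 1 denominator, which is what the paired t-test requires):

```python
import math
import statistics

# Differences (After - Before) for the ten clients in the table above
diffs = [-3, -4, -3, -4, -4, -1, -4, -2, -3, -3]

d_bar = statistics.mean(diffs)          # -3.1
s_d = statistics.stdev(diffs)           # sample SD of differences ≈ 0.994
se = s_d / math.sqrt(len(diffs))        # ≈ 0.314
t = d_bar / se                          # ≈ -9.86
```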


Section 3: Two-Sample T-Test (Independent Samples)

Purpose and Hypothesis

The two-sample t-test compares means from two independent populations.

Hypotheses:

  • H₀: μ₁ = μ₂ (Population means are equal)
  • H₁: μ₁ ≠ μ₂ (Population means differ)
  • H₁: μ₁ > μ₂ (One group mean greater)
  • H₁: μ₁ < μ₂ (One group mean less)

Assumptions

  1. Independence: Samples from two different groups
  2. Normality: Both populations normally distributed OR large samples
  3. Unknown σ’s: Both population standard deviations unknown
  4. Independence of observations: No pairing

Two Scenarios: Equal vs Unequal Variances

Scenario A: Equal Variances (Assuming σ₁ = σ₂)

Use this when:

  • Similar group sizes
  • Sample variances are similar (Levene’s test p > 0.05)

Formula: $$t = \frac{(\overline{x}_1 - \overline{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

Where pooled variance: $$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}$$

Degrees of freedom: $$df = n_1 + n_2 - 2$$

Scenario B: Unequal Variances (Welch’s T-Test)

Use this when:

  • Group sizes differ
  • Sample variances are clearly different (Levene’s test p ≤ 0.05)
  • You prefer not to assume equal variances

Formula (Welch’s): $$t = \frac{\overline{x}_1 - \overline{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Degrees of freedom (Welch-Satterthwaite equation): $$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$$

Example: Two-Sample T-Test (Equal Variances)

Problem: Compare test scores between two teaching methods:

  • Method A: n₁ = 25, $\overline{x}_1$ = 78, s₁ = 8
  • Method B: n₂ = 25, $\overline{x}_2$ = 72, s₂ = 8.5
  • Test at α = 0.05 whether methods differ

Solution:

Pooled variance: $$s_p^2 = \frac{(25-1) \times 64 + (25-1) \times 72.25}{25 + 25 - 2} = \frac{1536 + 1734}{48} = 68.125$$

Pooled SD: $s_p = 8.255$

Standard error: $$SE = 8.255 \times \sqrt{\frac{1}{25} + \frac{1}{25}} = 8.255 \times 0.283 = 2.336$$

T-statistic: $$t = \frac{78 - 72}{2.336} = \frac{6}{2.336} = 2.569$$

Degrees of freedom: df = 25 + 25 - 2 = 48

Critical values (two-tailed, df = 48, α = 0.05): t = ±2.010

Since |t| = 2.569 > 2.010, reject H₀

Conclusion: The teaching methods significantly differ (p = 0.013).
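The pooled calculation can be verified in Python (numbers from the teaching-methods example above; this is a verification sketch, not a general-purpose routine):

```python
import math

# Method A and Method B summary statistics
n1, x1, s1 = 25, 78.0, 8.0
n2, x2, s2 = 25, 72.0, 8.5

# Pooled variance: weighted average of the two sample variances
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)   # ≈ 68.125
se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)              # ≈ 2.335
t = (x1 - x2) / se                                            # ≈ 2.57
df = n1 + n2 - 2                                              # 48
```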

Example: Welch’s T-Test (Unequal Variances)

Problem: Compare recovery times between treatments:

  • Treatment A: n₁ = 15, $\overline{x}_1$ = 10 days, s₁ = 2
  • Treatment B: n₂ = 20, $\overline{x}_2$ = 12 days, s₂ = 5
  • Different variances and sizes; use Welch’s test

Solution:

Standard error: $$SE = \sqrt{\frac{4}{15} + \frac{25}{20}} = \sqrt{0.267 + 1.25} = \sqrt{1.517} = 1.232$$

T-statistic: $$t = \frac{10 - 12}{1.232} = \frac{-2}{1.232} = -1.624$$

df (Welch-Satterthwaite) ≈ 26.3, rounded down to 26

Critical values (two-tailed, df = 26, α = 0.05): t = ±2.056

Since |t| = 1.624 < 2.056, fail to reject H₀

Conclusion: Insufficient evidence that recovery times differ (p ≈ 0.12).
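The Welch-Satterthwaite df can be computed directly rather than read off a table. A quick Python check using the recovery-time summary statistics:

```python
import math

# Treatment A and Treatment B summary statistics
n1, s1 = 15, 2.0
n2, s2 = 20, 5.0
v1, v2 = s1**2 / n1, s2**2 / n2     # each group's variance of the mean

se = math.sqrt(v1 + v2)                                       # ≈ 1.232
t = (10.0 - 12.0) / se                                        # ≈ -1.624
df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))     # ≈ 26.3
```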


Section 4: Checking Assumptions

Normality Testing

Q-Q Plot Method

  • Plot sample quantiles vs theoretical normal quantiles
  • Points close to diagonal line suggest normality
  • Deviations at tails indicate non-normality

Shapiro-Wilk Test

  • Null hypothesis: Data are normally distributed
  • If p > 0.05: Assume normality
  • If p ≤ 0.05: Data may not be normal

Solution if Not Normal

  • Large sample (n ≥ 30)? Central Limit Theorem applies; t-test is robust
  • Small sample? Consider transforming the data or using a non-parametric alternative (Wilcoxon signed-rank for one-sample/paired designs, Mann-Whitney U for two independent samples)

Homogeneity of Variance (Levene’s Test)

For two-sample tests:

  • H₀: Variances are equal
  • H₁: Variances are not equal

Decision:

  • p > 0.05: Assume equal variances → Use standard t-test
  • p ≤ 0.05: Use Welch’s t-test

Independence Verification

Checklist:

  • ✓ Random sampling used?
  • ✓ Samples from different groups/time periods?
  • ✓ Observations within sample are independent?
  • ✓ No repeated measures for same subject?

Section 5: Effect Size for T-Tests

Cohen’s d

Standardized measure of difference between means:

$$d = \frac{\overline{x}_1 - \overline{x}_2}{s_p}$$

For one-sample test: $$d = \frac{\overline{x} - \mu_0}{s}$$

Interpretation (Cohen’s Guidelines)

| Effect Size | Interpretation |
| --- | --- |
| \|d\| ≈ 0.2 | Small effect |
| \|d\| ≈ 0.5 | Medium effect |
| \|d\| ≈ 0.8 | Large effect |
| \|d\| > 1.0 | Very large effect |

Example: Cohen’s d

From teaching methods example: $$d = \frac{78 - 72}{8.255} = \frac{6}{8.255} = 0.727$$

This is a medium to large effect size, indicating the difference is practically meaningful.
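The calculation above is short enough to verify in code (a minimal sketch using the pooled SD from the teaching-methods example):

```python
import math

# Pooled SD from the two-sample example (n₁ = n₂ = 25, s₁ = 8, s₂ = 8.5)
sp = math.sqrt(((25 - 1) * 8.0**2 + (25 - 1) * 8.5**2) / (25 + 25 - 2))  # ≈ 8.254

# Cohen's d: mean difference in pooled-SD units
d = (78.0 - 72.0) / sp                                                   # ≈ 0.727
```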

Reporting

Always report alongside p-value:

  • “t(48) = 2.569, p = 0.013, d = 0.73”

This shows the difference is statistically significant AND practically meaningful.


Section 6: Confidence Intervals

One-Sample T-Test CI

$$\text{CI} = \overline{x} \pm t_{\alpha/2} \times \frac{s}{\sqrt{n}}$$

Two-Sample T-Test CI (Equal Variances)

$$\text{CI} = (\overline{x}_1 - \overline{x}_2) \pm t_{\alpha/2} \times s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Two-Sample T-Test CI (Unequal Variances)

$$\text{CI} = (\overline{x}_1 - \overline{x}_2) \pm t_{\alpha/2} \times \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Paired T-Test CI

$$\text{CI} = \overline{d} \pm t_{\alpha/2} \times \frac{s_d}{\sqrt{n}}$$

Interpretation

A 95% confidence interval means: “If we repeated this study 100 times, approximately 95 of the resulting confidence intervals would contain the true population parameter.”
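This "repeated studies" interpretation can be illustrated by simulation: repeatedly draw samples of n = 10 from a population with a known mean, build the t-based interval each time, and count how often it captures the truth. A sketch (2.262 is the two-tailed critical value for df = 9; the seed and repetition count are arbitrary choices for this illustration):

```python
import math
import random
import statistics

random.seed(1)
TRUE_MU, T_CRIT, N, REPS = 0.0, 2.262, 10, 2000

hits = 0
for _ in range(REPS):
    sample = [random.gauss(TRUE_MU, 1.0) for _ in range(N)]
    m = statistics.mean(sample)
    half = T_CRIT * statistics.stdev(sample) / math.sqrt(N)
    if m - half <= TRUE_MU <= m + half:   # did this interval capture the true mean?
        hits += 1

coverage = hits / REPS   # should land close to 0.95
```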


Section 7: Common Applications

Education: Test Score Comparison

  • Compare average scores between teaching methods
  • Assess whether tutoring improves performance

Medicine: Treatment Efficacy

  • Compare blood pressure before/after treatment
  • Compare recovery time between two treatments

Quality Control: Manufacturing

  • Test if machine produces specified dimensions
  • Compare quality between suppliers

Psychology: Behavioral Interventions

  • Measure anxiety scores before/after therapy
  • Compare depression levels between treatment groups

Section 8: T-Test Decision Tree

Do you have one or two samples?
│
├─→ ONE SAMPLE
│   │
│   └─→ Compare sample mean to population value?
│       └─→ ONE-SAMPLE T-TEST
│           (Use when σ unknown)
│
└─→ TWO SAMPLES
    │
    ├─→ Are samples paired/matched?
    │   ├─→ YES: PAIRED T-TEST
    │   │
    │   └─→ NO: Proceed
    │
    └─→ Are variances equal?
        ├─→ YES or SIMILAR:
        │   Standard two-sample t-test
        │   (Assumes equal variances)
        │
        └─→ NO or UNEQUAL:
            Welch's t-test
            (Does not assume equal variances)
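The decision tree above can be expressed as a tiny helper function (illustrative only; `choose_t_test` is a made-up name for this sketch):

```python
def choose_t_test(n_samples, paired=False, equal_variances=True):
    """Map the decision tree to a test name (illustrative sketch)."""
    if n_samples == 1:
        return "one-sample t-test"       # sample mean vs. population value
    if paired:
        return "paired t-test"           # matched/repeated measurements
    if equal_variances:
        return "two-sample t-test (pooled)"
    return "Welch's t-test"              # no equal-variance assumption
```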

Section 9: Reporting T-Test Results

Standard Format

“Participants in the treatment group (M = 85.2, SD = 12.4) scored significantly higher than the control group (M = 78.6, SD = 13.1), t(58) = 2.34, p = 0.022, d = 0.61.”

Components

  1. Descriptive statistics: M (mean), SD (standard deviation)
  2. Test name: t-test (or Welch’s, paired)
  3. Degrees of freedom: (df)
  4. Test statistic: t = value
  5. P-value: p = value
  6. Effect size: d = value

APA Format

Two-sample t-test (assuming equal variances)
t(48) = 2.569, p = 0.013, d = 0.73
95% CI [1.31, 10.69]

Common Mistakes and How to Avoid Them

Mistake 1: Using T-Test Instead of Paired T-Test

Problem: Treating paired data as independent

Solution: Check whether observations are paired or matched before selecting the test

Mistake 2: Ignoring Unequal Variances

Problem: Using the standard t-test with very different sample variances

Solution: Use Levene’s test; apply Welch’s test if variances are unequal

Mistake 3: Multiple Comparisons Without Correction

Problem: Conducting multiple t-tests without adjusting the significance level

Solution: Use a Bonferroni correction, or ANOVA for 3+ groups

Mistake 4: Assuming Normality Without Testing It

Problem: Skipping normality assessment for small samples

Solution: Always test or visualize normality, especially when n < 30

Mistake 5: Misinterpreting Insignificant Results

Problem: Concluding “no difference exists” from p > 0.05

Solution: State “insufficient evidence” and consider statistical power



Summary Comparison Table

| Aspect | One-Sample | Paired | Two-Sample (Equal Variances) | Two-Sample (Unequal Variances) |
| --- | --- | --- | --- | --- |
| Samples | 1 | 2 (paired) | 2 (independent) | 2 (independent) |
| Test statistic | $(\overline{x} - \mu_0)/(s/\sqrt{n})$ | $\overline{d}/(s_d/\sqrt{n})$ | Pooled-variance formula | Separate-variance formula |
| df | $n - 1$ | $n - 1$ | $n_1 + n_2 - 2$ | Welch-Satterthwaite |
| Key assumption | - | Differences normal | Both σ equal | - |
| Use Levene’s? | No | No | Yes | Yes |

Key Formulas Cheat Sheet

One-Sample T-Test

$$t = \frac{\overline{x} - \mu_0}{s/\sqrt{n}}, \quad df = n-1$$

Paired T-Test

$$t = \frac{\overline{d}}{s_d/\sqrt{n}}, \quad df = n-1$$

Two-Sample T-Test (Equal Variances)

$$t = \frac{\overline{x}_1 - \overline{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \quad s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$$

Two-Sample T-Test (Unequal Variances - Welch’s)

$$t = \frac{\overline{x}_1 - \overline{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Cohen’s d

$$d = \frac{\overline{x}_1 - \overline{x}_2}{s_p} \text{ or } d = \frac{\overline{x} - \mu_0}{s}$$

Confidence Interval

$$\text{CI} = \text{(mean difference)} \pm t_{\alpha/2} \times SE$$



Advanced Topics:

  • ANOVA - Multiple group comparisons


Next Steps

After mastering t-tests:

  1. ANOVA: Compare 3+ groups simultaneously
  2. Chi-Square Tests: Test categorical data
  3. Effect Sizes & Power Analysis: Understand practical significance and plan studies
  4. Non-Parametric Alternatives: When assumptions are violated

References

  1. Anderson, D.R., Sweeney, D.J., & Williams, T.A. (2018). Statistics for Business and Economics (14th ed.). Cengage Learning. - Detailed coverage of t-tests, assumptions, and practical applications in business analysis.

  2. Montgomery, D.C., & Runger, G.C. (2018). Applied Statistics for Engineers and Scientists (6th ed.). John Wiley & Sons. - Applications of t-tests in engineering and scientific experiments, including one-sample, paired, and two-sample designs.

  3. Walpole, R.E., Myers, R.H., Myers, S.L., & Ye, K. (2012). Probability & Statistics for Engineers & Scientists (9th ed.). Pearson. - Theoretical foundation and properties of the t-distribution and Student’s t-tests.