Introduction to T-Tests
A t-test is a parametric statistical test used to compare means when the population standard deviation is unknown. T-tests are based on Student’s t-distribution and are among the most commonly used statistical tests in practice. They’re used when:
- Population standard deviation (σ) is unknown and must be estimated from the sample
- You’re comparing one or two sample means
- Data are approximately normally distributed, or the sample size is large enough for the Central Limit Theorem to apply (small samples, n < 30, rely more heavily on the normality assumption)
T-tests are fundamental to statistical inference and are widely applied in business, medicine, education, and research.
T-Distribution vs Normal Distribution
Key Differences
| Aspect | Normal Distribution | T-Distribution |
|---|---|---|
| Shape | Fixed, symmetric bell curve | Bell curve, more spread out |
| Tails | Lighter (thinner) | Heavier (fatter) |
| Peak | Higher at center | Lower at center |
| Degrees of Freedom | N/A | df = n - 1 |
| As n increases | N/A | Approaches normal distribution |
| Use when | σ known | σ unknown |
Why Use T-Distribution?
When σ is unknown and estimated from sample data (s), using the normal distribution underestimates variability. The t-distribution has heavier tails to account for this uncertainty. As sample size increases, the t-distribution approaches the normal distribution.
Key Point: For df > 30, the t-distribution is very similar to the normal distribution.
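The convergence can be seen numerically. A minimal sketch using scipy.stats (assumed available) comparing two-tailed 95% critical values:

```python
# Compare two-tailed 95% critical values: t-distribution vs normal.
# As df grows, the t critical value shrinks toward z = 1.96.
from scipy import stats

z_crit = stats.norm.ppf(0.975)            # ≈ 1.960
for df in (5, 10, 30, 100):
    t_crit = stats.t.ppf(0.975, df)       # heavier tails → larger critical value
    print(f"df={df}: t_crit={t_crit:.3f} (z={z_crit:.3f})")
```

At df = 5 the critical value is about 2.571; by df = 100 it is within 0.03 of the normal value.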
Section 1: One-Sample T-Test
Purpose and Hypothesis
The one-sample t-test tests whether a sample mean differs significantly from a hypothesized population mean.
Hypotheses:
- H₀: μ = μ₀ (Null: population mean equals hypothesized value)
- H₁: μ ≠ μ₀ (Two-tailed alternative: means differ)
- H₁: μ > μ₀ (Right-tailed alternative)
- H₁: μ < μ₀ (Left-tailed alternative)
Assumptions
- Random sample: Data are randomly selected
- Normality: Population is normally distributed OR sample is large (n ≥ 30)
- Independence: Observations are independent
- Unknown σ: Population standard deviation is unknown
Formula
$$t = \frac{\overline{x} - \mu_0}{s/\sqrt{n}}$$
Where:
- $\overline{x}$ = Sample mean
- μ₀ = Hypothesized population mean
- s = Sample standard deviation
- n = Sample size
- df = n - 1 (degrees of freedom)
Step-by-Step Procedure
Step 1: State Hypotheses
- Define H₀ and H₁
- Specify significance level (α = 0.05)
Step 2: Check Assumptions
- Random sample?
- Normal or large sample?
- Independence verified?
Step 3: Calculate Test Statistic
- Compute: mean, standard deviation, standard error
- Calculate t-statistic
Step 4: Find P-Value
- Use t-distribution table with df = n - 1
- Two-tailed: P-value = 2 × P(t > |t_calc|)
- One-tailed: P-value = P(t > t_calc) for a right-tailed test, or P(t < t_calc) for a left-tailed test
Step 5: Decision and Conclusion
- If P-value ≤ α: Reject H₀
- If P-value > α: Fail to reject H₀
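The five steps can be sketched as a small helper using scipy.stats (assumed available); the function name and the example values are illustrative, not from the text:

```python
import math
from scipy import stats

def one_sample_t(xbar, mu0, s, n, alpha=0.05):
    se = s / math.sqrt(n)                    # Step 3: standard error
    t = (xbar - mu0) / se                    # t-statistic
    df = n - 1
    p = 2 * stats.t.sf(abs(t), df)           # Step 4: two-tailed p-value
    decision = "reject H0" if p <= alpha else "fail to reject H0"  # Step 5
    return t, df, p, decision

print(one_sample_t(102, 100, 5, 25))   # hypothetical sample vs mu0 = 100
```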
Example: One-Sample T-Test
Problem: A coffee shop claims their cappuccinos average 250mL. A customer measures 12 cappuccinos with mean 245mL and sample SD = 8mL. Test at α = 0.05 whether the true mean differs from 250mL.
Solution:
Given:
- n = 12, $\overline{x}$ = 245mL, μ₀ = 250mL, s = 8mL, α = 0.05
- df = 12 - 1 = 11
Standard error: $$SE = \frac{s}{\sqrt{n}} = \frac{8}{\sqrt{12}} = \frac{8}{3.464} = 2.309$$
T-statistic: $$t = \frac{245 - 250}{2.309} = \frac{-5}{2.309} = -2.166$$
Critical values (two-tailed, df = 11, α = 0.05): t = ±2.201
Since |t| = 2.166 < 2.201, fail to reject H₀
Conclusion: There is insufficient evidence that the mean differs from 250mL (p ≈ 0.053). The difference could be due to random variation.
Confidence Interval for One-Sample T-Test
$$\text{CI} = \overline{x} \pm t_{\alpha/2} \times \frac{s}{\sqrt{n}}$$
For the cappuccino example: $$\text{95% CI} = 245 \pm 2.201 \times 2.309 = 245 \pm 5.08 = (239.92, 250.08)$$
Since 250 is within the confidence interval, we fail to reject H₀.
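As a cross-check, the cappuccino numbers can be reproduced from the summary statistics; a sketch assuming scipy is available:

```python
import math
from scipy import stats

n, xbar, s, mu0 = 12, 245.0, 8.0, 250.0
se = s / math.sqrt(n)                          # ≈ 2.309
t = (xbar - mu0) / se                          # ≈ -2.17
p = 2 * stats.t.sf(abs(t), n - 1)              # ≈ 0.053
lo, hi = stats.t.interval(0.95, n - 1, loc=xbar, scale=se)
print(round(t, 3), round(p, 3), (round(lo, 2), round(hi, 2)))
```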
Section 2: Paired T-Test (Dependent Samples)
Purpose and Hypothesis
The paired t-test (also called dependent samples t-test) compares means from the same subjects measured twice or from matched pairs.
Common Scenarios:
- Before-after measurements on same subjects
- Twins or matched pairs
- Repeated measures
- Pre-test and post-test on same group
Hypotheses:
- H₀: μ_d = 0 (Mean difference is zero)
- H₁: μ_d ≠ 0 (Mean difference is non-zero)
Assumptions
- Paired data: Observations are paired/matched
- Normal differences: Differences are normally distributed
- Independence: Pairs are independent of each other
- Unknown σ: Population SD of differences is unknown
Formula
$$t = \frac{\overline{d} - 0}{s_d/\sqrt{n}}$$
Where:
- $\overline{d}$ = Mean of differences (d = the within-pair difference, e.g., After - Before, computed in a consistent order)
- s_d = Standard deviation of differences
- n = Number of pairs
- df = n - 1
Step-by-Step Procedure
Step 1: Calculate Differences
- For each pair: d = After - Before (or Treatment - Control; choose one order and keep it consistent, since it determines the sign of t)
- List all differences
Step 2: Summarize Differences
- Mean: $\overline{d}$
- Standard deviation: s_d
- Sample size: n
Step 3: Calculate Test Statistic
- Apply paired t-test formula
Step 4: Find P-Value
- Use t-distribution with df = n - 1
Step 5: Decision
- Compare p-value to α
Example: Paired T-Test
Problem: A fitness trainer measures weight (kg) for 10 clients before and after a 12-week program:
| Client | Before | After | Difference |
|---|---|---|---|
| 1 | 85 | 82 | -3 |
| 2 | 92 | 88 | -4 |
| 3 | 78 | 75 | -3 |
| 4 | 88 | 84 | -4 |
| 5 | 95 | 91 | -4 |
| 6 | 80 | 79 | -1 |
| 7 | 90 | 86 | -4 |
| 8 | 83 | 81 | -2 |
| 9 | 87 | 84 | -3 |
| 10 | 91 | 88 | -3 |
Test at α = 0.05 whether the program significantly reduces weight.
Solution:
From data:
- n = 10, $\overline{d}$ = -3.1 kg (differences computed as After - Before)
- s_d = 0.994 kg
- df = 10 - 1 = 9
Standard error: $$SE_d = \frac{0.994}{\sqrt{10}} = \frac{0.994}{3.162} = 0.314$$
T-statistic: $$t = \frac{-3.1 - 0}{0.314} = -9.86$$
Critical value (left-tailed, df = 9, α = 0.05): t = -1.833
Since t = -9.86 < -1.833, reject H₀
Conclusion: The program significantly reduces weight (p < 0.001). Mean weight loss is 3.1 kg.
Confidence Interval for Paired T-Test
$$\text{CI} = \overline{d} \pm t_{\alpha/2} \times \frac{s_d}{\sqrt{n}}$$
For the fitness example: $$\text{95% CI} = -3.1 \pm 2.262 \times 0.314 = -3.1 \pm 0.711 = (-3.811, -2.389)$$
We’re 95% confident the true mean weight loss is between 2.389 and 3.811 kg.
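A sketch checking the paired calculation against the raw table with `scipy.stats.ttest_rel` (assumed available); differences are taken as After - Before, so the alternative is "less":

```python
from scipy import stats

before = [85, 92, 78, 88, 95, 80, 90, 83, 87, 91]
after = [82, 88, 75, 84, 91, 79, 86, 81, 84, 88]
# H1: mean(after) < mean(before), i.e. the program reduces weight
res = stats.ttest_rel(after, before, alternative="less")
print(res.statistic, res.pvalue)   # t ≈ -9.86, p well below 0.001
```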
Section 3: Two-Sample T-Test (Independent Samples)
Purpose and Hypothesis
The two-sample t-test compares means from two independent populations.
Hypotheses:
- H₀: μ₁ = μ₂ (Population means are equal)
- H₁: μ₁ ≠ μ₂ (Population means differ)
- H₁: μ₁ > μ₂ (One group mean greater)
- H₁: μ₁ < μ₂ (One group mean less)
Assumptions
- Independence: Samples from two different groups
- Normality: Both populations normally distributed OR large samples
- Unknown σ’s: Both population standard deviations unknown
- Independence of observations: No pairing
Two Scenarios: Equal vs Unequal Variances
Scenario A: Equal Variances (Assuming σ₁ = σ₂)
Use this when:
- Similar group sizes
- Sample variances are similar (Levene’s test p > 0.05)
Formula: $$t = \frac{(\overline{x}_1 - \overline{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
Where pooled variance: $$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}$$
Degrees of freedom: $$df = n_1 + n_2 - 2$$
Scenario B: Unequal Variances (Welch’s T-Test)
Use this when:
- Group sizes differ
- Sample variances are clearly different
- You prefer to avoid the equal-variance assumption (Welch’s test requires fewer assumptions and is a safe default)
Formula (Welch’s): $$t = \frac{\overline{x}_1 - \overline{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Degrees of freedom (Welch-Satterthwaite equation): $$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$$
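A sketch of the Welch-Satterthwaite calculation (the function name is illustrative); note that when the two sample SDs and sizes match, it reduces to n₁ + n₂ - 2:

```python
def welch_df(s1, n1, s2, n2):
    # Welch-Satterthwaite approximation for degrees of freedom
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(welch_df(2, 15, 5, 20))    # ≈ 26.3
print(welch_df(8, 25, 8, 25))    # 48.0 — equals n1 + n2 - 2 when s and n match
```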
Example: Two-Sample T-Test (Equal Variances)
Problem: Compare test scores between two teaching methods:
- Method A: n₁ = 25, $\overline{x}_1$ = 78, s₁ = 8
- Method B: n₂ = 25, $\overline{x}_2$ = 72, s₂ = 8.5
- Test at α = 0.05 whether methods differ
Solution:
Pooled variance: $$s_p^2 = \frac{(25-1) \times 64 + (25-1) \times 72.25}{25 + 25 - 2} = \frac{1536 + 1734}{48} = 68.125$$
Pooled SD: $s_p = 8.255$
Standard error: $$SE = 8.255 \times \sqrt{\frac{1}{25} + \frac{1}{25}} = 8.255 \times 0.283 = 2.336$$
T-statistic: $$t = \frac{78 - 72}{2.336} = \frac{6}{2.336} = 2.569$$
df = 25 + 25 - 2 = 48
Critical values (two-tailed, df = 48, α = 0.05): t = ±2.010
Since |t| = 2.569 > 2.010, reject H₀
Conclusion: The teaching methods significantly differ (p = 0.013).
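The same numbers can be fed to `scipy.stats.ttest_ind_from_stats` (assumed available), which accepts summary statistics directly:

```python
from scipy import stats

# Method A: mean 78, SD 8, n 25; Method B: mean 72, SD 8.5, n 25
t, p = stats.ttest_ind_from_stats(78, 8, 25, 72, 8.5, 25, equal_var=True)
print(round(t, 3), round(p, 3))    # t ≈ 2.570, p ≈ 0.013
```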
Example: Welch’s T-Test (Unequal Variances)
Problem: Compare recovery times between treatments:
- Treatment A: n₁ = 15, $\overline{x}_1$ = 10 days, s₁ = 2
- Treatment B: n₂ = 20, $\overline{x}_2$ = 12 days, s₂ = 5
- Different variances and sizes; use Welch’s test
Solution:
Standard error: $$SE = \sqrt{\frac{4}{15} + \frac{25}{20}} = \sqrt{0.267 + 1.25} = \sqrt{1.517} = 1.232$$
T-statistic: $$t = \frac{10 - 12}{1.232} = \frac{-2}{1.232} = -1.624$$
df (Welch-Satterthwaite) ≈ 26.3
Critical values (two-tailed, df ≈ 26, α = 0.05): t = ±2.056
Since |t| = 1.624 < 2.056, fail to reject H₀
Conclusion: Insufficient evidence that recovery times differ (p ≈ 0.116).
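The Welch version of the same check, again a sketch using `scipy.stats.ttest_ind_from_stats` (assumed available) with `equal_var=False`:

```python
from scipy import stats

# Treatment A: mean 10, SD 2, n 15; Treatment B: mean 12, SD 5, n 20
t, p = stats.ttest_ind_from_stats(10, 2, 15, 12, 5, 20, equal_var=False)
print(round(t, 3), round(p, 3))    # t ≈ -1.624
```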
Section 4: Checking Assumptions
Normality Testing
Q-Q Plot Method
- Plot sample quantiles vs theoretical normal quantiles
- Points close to diagonal line suggest normality
- Deviations at tails indicate non-normality
Shapiro-Wilk Test
- Null hypothesis: Data are normally distributed
- If p > 0.05: Assume normality
- If p ≤ 0.05: Data may not be normal
Solution if Not Normal
- Large sample (n ≥ 30)? Central Limit Theorem applies; t-test is robust
- Small sample? Consider transforming the data or using a non-parametric test (Wilcoxon signed-rank for one-sample/paired designs, Mann-Whitney U for two independent samples)
Homogeneity of Variance (Levene’s Test)
For two-sample tests:
- H₀: Variances are equal
- H₁: Variances are not equal
Decision:
- p > 0.05: Assume equal variances → Use standard t-test
- p ≤ 0.05: Use Welch’s t-test
Independence Verification
Checklist:
- ✓ Random sampling used?
- ✓ Samples from different groups/time periods?
- ✓ Observations within sample are independent?
- ✓ No repeated measures for same subject?
Section 5: Effect Size for T-Tests
Cohen’s d
Standardized measure of difference between means:
$$d = \frac{\overline{x}_1 - \overline{x}_2}{s_p}$$
For one-sample test: $$d = \frac{\overline{x} - \mu_0}{s}$$
Interpretation (Cohen’s Guidelines)
| Effect Size | Interpretation |
|---|---|
| \|d\| ≈ 0.2 | Small effect |
| \|d\| ≈ 0.5 | Medium effect |
| \|d\| ≈ 0.8 | Large effect |
| \|d\| > 1.0 | Very large effect |
Example: Cohen’s d
From teaching methods example: $$d = \frac{78 - 72}{8.255} = \frac{6}{8.255} = 0.727$$
This is a medium to large effect size, indicating the difference is practically meaningful.
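A minimal sketch of the pooled-SD version of Cohen’s d (the function name is illustrative):

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    # pooled standard deviation, then standardized mean difference
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / sp

d = cohens_d(78, 8, 25, 72, 8.5, 25)   # teaching-methods example, ≈ 0.727
print(round(d, 3))
```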
Reporting
Always report alongside p-value:
- “t(48) = 2.569, p = 0.013, d = 0.73”
This shows the difference is statistically significant AND practically meaningful.
Section 6: Confidence Intervals
One-Sample T-Test CI
$$\text{CI} = \overline{x} \pm t_{\alpha/2} \times \frac{s}{\sqrt{n}}$$
Two-Sample T-Test CI (Equal Variances)
$$\text{CI} = (\overline{x}_1 - \overline{x}_2) \pm t_{\alpha/2} \times s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Two-Sample T-Test CI (Unequal Variances)
$$\text{CI} = (\overline{x}_1 - \overline{x}_2) \pm t_{\alpha/2} \times \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Paired T-Test CI
$$\text{CI} = \overline{d} \pm t_{\alpha/2} \times \frac{s_d}{\sqrt{n}}$$
Interpretation
A 95% confidence interval means: “If we repeated this study 100 times, approximately 95 of the resulting confidence intervals would contain the true population parameter.”
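This repeated-sampling interpretation can be demonstrated by simulation; a sketch assuming normally distributed data with a known true mean (seeded for reproducibility):

```python
import math
import random
from scipy import stats

random.seed(1)
true_mu, sigma, n, trials = 50.0, 10.0, 20, 2000
covered = 0
for _ in range(trials):
    sample = [random.gauss(true_mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    half = stats.t.ppf(0.975, n - 1) * s / math.sqrt(n)   # CI half-width
    covered += xbar - half <= true_mu <= xbar + half
coverage = covered / trials
print(coverage)    # close to 0.95
```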
Section 7: Common Applications
Education: Test Score Comparison
- Compare average scores between teaching methods
- Assess whether tutoring improves performance
Medicine: Treatment Efficacy
- Compare blood pressure before/after treatment
- Compare recovery time between two treatments
Quality Control: Manufacturing
- Test if machine produces specified dimensions
- Compare quality between suppliers
Psychology: Behavioral Interventions
- Measure anxiety scores before/after therapy
- Compare depression levels between treatment groups
Section 8: T-Test Decision Tree
Do you have one or two samples?
│
├─→ ONE SAMPLE
│ │
│ └─→ Compare sample mean to population value?
│ └─→ ONE-SAMPLE T-TEST
│ (Use when σ unknown)
│
└─→ TWO SAMPLES
│
├─→ Are samples paired/matched?
│ ├─→ YES: PAIRED T-TEST
│ │
│ └─→ NO: Proceed
│
└─→ Are variances equal?
├─→ YES or SIMILAR:
│ Standard two-sample t-test
│ (Assumes equal variances)
│
└─→ NO or UNEQUAL:
Welch's t-test
(Does not assume equal variances)
Section 9: Reporting T-Test Results
Standard Format
“Participants in the treatment group (M = 85.2, SD = 12.4) scored significantly higher than the control group (M = 78.6, SD = 13.1), t(58) = 2.34, p = 0.022, d = 0.61.”
Components
- Descriptive statistics: M (mean), SD (standard deviation)
- Test name: t-test (or Welch’s, paired)
- Degrees of freedom: (df)
- Test statistic: t = value
- P-value: p = value
- Effect size: d = value
APA Format
Two-sample t-test (assuming equal variances)
t(48) = 2.569, p = 0.013, d = 0.73
95% CI [1.31, 10.69]
Common Mistakes and How to Avoid Them
Mistake 1: Using T-Test Instead of Paired T-Test
Problem: Treating paired data as independent
Solution: Check whether observations are paired or matched before selecting a test
Mistake 2: Ignoring Unequal Variances
Problem: Using the standard t-test with very different sample variances
Solution: Use Levene’s test; apply Welch’s test if variances are unequal
Mistake 3: Multiple Comparisons Without Correction
Problem: Conducting multiple t-tests without adjusting the significance level
Solution: Use a Bonferroni correction, or ANOVA for 3+ groups
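A sketch of the Bonferroni adjustment (the raw p-values here are made up):

```python
# Three pairwise t-tests at a family-wise alpha of 0.05.
p_values = [0.012, 0.030, 0.041]        # hypothetical raw p-values
alpha = 0.05
alpha_adj = alpha / len(p_values)       # Bonferroni: 0.05 / 3 ≈ 0.0167
significant = [p <= alpha_adj for p in p_values]
print(alpha_adj, significant)           # only the first comparison survives
```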
Mistake 4: Not Testing Normality
Problem: Skipping the normality assessment for small samples
Solution: Always test or visualize normality, especially when n < 30
Mistake 5: Misinterpreting Insignificant Results
Problem: Concluding “no difference exists” from p > 0.05
Solution: State “insufficient evidence” instead, and consider statistical power
Summary Comparison Table
| Aspect | One-Sample | Paired | Two-Sample (Equal σ) | Two-Sample (Unequal σ) |
|---|---|---|---|---|
| Samples | 1 | 2 (paired) | 2 (independent) | 2 (independent) |
| Test Statistic | $(x̄ - μ₀)/(s/√n)$ | $(d̄)/(s_d/√n)$ | Pooled variance formula | Separate variance formula |
| df | n - 1 | n - 1 | $n_1 + n_2 - 2$ | Welch-Satterthwaite |
| Assumption | - | Differences normal | Both σ equal | - |
| Use Levene’s? | No | No | Yes | Yes |
Key Formulas Cheat Sheet
One-Sample T-Test
$$t = \frac{\overline{x} - \mu_0}{s/\sqrt{n}}, \quad df = n-1$$
Paired T-Test
$$t = \frac{\overline{d}}{s_d/\sqrt{n}}, \quad df = n-1$$
Two-Sample T-Test (Equal Variances)
$$t = \frac{\overline{x}_1 - \overline{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \quad s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$$
Two-Sample T-Test (Unequal Variances - Welch’s)
$$t = \frac{\overline{x}_1 - \overline{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Cohen’s d
$$d = \frac{\overline{x}_1 - \overline{x}_2}{s_p} \text{ or } d = \frac{\overline{x} - \mu_0}{s}$$
Confidence Interval
$$\text{CI} = \text{(mean difference)} \pm t_{\alpha/2} \times SE$$
Related Tests
Alternative Comparison Tests:
- Z-Tests Complete Guide - For large samples with known population SD
- ANOVA - For comparing 3+ group means
- Non-Parametric Tests - For non-normal data
Foundational Concepts:
- Hypothesis Testing Guide - Core concepts and principles
- P-Values - Understanding significance
- Type I and Type II Errors - Statistical errors explained
Related Resources
- Z-Tests Comprehensive Guide
- Hypothesis Testing Fundamentals
- Normal Distribution & Standard Normal
- Statistical Significance & P-Values
- Effect Sizes & Statistical Power
Next Steps
After mastering t-tests:
- ANOVA: Compare 3+ groups simultaneously
- Chi-Square Tests: Test categorical data
- Effect Sizes & Power Analysis: Understand practical significance and plan studies
- Non-Parametric Alternatives: When assumptions are violated