Introduction to Z-Tests
A z-test is a parametric statistical test used to test hypotheses about population means (when population variance is known) or proportions. Z-tests are based on the standard normal distribution and are commonly used when:
- Sample size is large (n ≥ 30)
- Population standard deviation is known
- You’re testing a population proportion
- The population is approximately normally distributed
Z-tests are one of the most fundamental tools in hypothesis testing and serve as the foundation for understanding more complex statistical tests.
When to Use Z-Tests
Conditions for Using Z-Tests
Z-tests are appropriate when:
- Large Sample Size: n ≥ 30 (Central Limit Theorem applies)
- Known Population Variance: σ is known (rare in practice)
- Normal Distribution: Population is normally distributed or sample is large enough
- Independence: Observations are independent
- Single or Two Samples: Not for multiple comparisons (use ANOVA instead)
Comparison with T-Tests
| Feature | Z-Test | T-Test |
|---|---|---|
| Sample Size | Large (n ≥ 30) | Small to large |
| Population σ | Known | Unknown |
| Distribution | Standard normal | Student’s t |
| Degrees of freedom | None | n - 1 |
| Small-sample behavior | Unreliable if σ is actually estimated from the sample | Accounts for the extra uncertainty of estimating σ |
Section 1: One-Sample Z-Test
Purpose and Hypothesis
The one-sample z-test tests whether a sample mean differs significantly from a hypothesized population mean.
Hypotheses:
- H₀: μ = μ₀ (Null hypothesis: population mean equals hypothesized value)
- H₁: μ ≠ μ₀ (Two-tailed alternative: population mean differs)
- H₁: μ > μ₀ (Right-tailed alternative)
- H₁: μ < μ₀ (Left-tailed alternative)
Formula
$$z = \frac{\overline{x} - \mu_0}{\sigma/\sqrt{n}}$$
Where:
- $\overline{x}$ = Sample mean
- μ₀ = Hypothesized population mean
- σ = Population standard deviation
- n = Sample size
Step-by-Step Procedure
Step 1: State Hypotheses
- Define null and alternative hypotheses
- Specify significance level (α, typically 0.05)
Step 2: Calculate Test Statistic
- Compute sample mean: $\overline{x}$
- Apply z-test formula
Step 3: Find P-value
- For two-tailed test: P-value = 2 × P(Z > |z|)
- For right-tailed test: P-value = P(Z > z)
- For left-tailed test: P-value = P(Z < z)
Step 4: Decision
- If P-value ≤ α: Reject H₀
- If P-value > α: Fail to reject H₀
Example: One-Sample Z-Test for Mean
Problem: A cereal manufacturer claims their boxes contain 368g on average. A random sample of 36 boxes has mean weight of 364.5g. The population standard deviation is known to be 5g. Test at α = 0.05 whether the mean weight differs from the claimed value.
Solution:
Given:
- n = 36, $\overline{x}$ = 364.5g, μ₀ = 368g, σ = 5g, α = 0.05
Calculate z-statistic: $$z = \frac{364.5 - 368}{5/\sqrt{36}} = \frac{-3.5}{0.833} = -4.20$$
For two-tailed test with α = 0.05:
- Critical values: z = ±1.96
- Since |z| = 4.20 > 1.96, reject H₀
- P-value ≈ 0.00003 < 0.05
Conclusion: There is strong evidence that the mean weight differs from 368g.
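The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `one_sample_z_test` and its `tail` argument are hypothetical, and the standard library's `statistics.NormalDist` supplies the standard normal CDF.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, tail="two"):
    """Return (z, p_value) for a one-sample z-test with known sigma.

    Hypothetical helper for illustration; `tail` is "two", "right", or "left".
    """
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "left":
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return z, p_value


# Cereal example: n = 36, sample mean 364.5 g, claimed mean 368 g, sigma = 5 g
z, p_value = one_sample_z_test(sample_mean=364.5, mu0=368, sigma=5, n=36)
alpha = 0.05
print(f"z = {z:.2f}, p-value = {p_value:.5f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```

Returning both z and the p-value lets the caller apply either decision rule (critical value or p-value) without recomputing anything.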
Section 2: Two-Sample Z-Test for Means
Purpose and Hypothesis
The two-sample z-test compares means between two independent populations.
Hypotheses:
- H₀: μ₁ = μ₂ (Population means are equal)
- H₁: μ₁ ≠ μ₂ (Population means differ)
Assumptions
- Both populations normally distributed (or large samples)
- Both population standard deviations known
- Samples are independent
- Equal sample sizes preferred (but not required)
Formula
$$z = \frac{(\overline{x}_1 - \overline{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
Where:
- $\overline{x}_1$, $\overline{x}_2$ = Sample means
- σ₁, σ₂ = Population standard deviations
- n₁, n₂ = Sample sizes
- Under H₀: (μ₁ - μ₂) = 0
Simplified under H₀: $$z = \frac{\overline{x}_1 - \overline{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
Step-by-Step Procedure
Step 1: State Hypotheses
- Null: μ₁ = μ₂
- Alternative: μ₁ ≠ μ₂ (or one-tailed)
- Significance level: α = 0.05
Step 2: Check Assumptions
- Verify independence and normality
- Both σ’s are known
Step 3: Calculate Test Statistic
- Compute both sample means
- Apply two-sample z formula
Step 4: Find P-value and Conclude
- Compare z to the critical value, or the p-value to α
- Make statistical decision
Example: Two-Sample Z-Test
Problem: Brand A energy drink claims to have more caffeine than Brand B. A sample of 50 Brand A drinks has mean caffeine 85mg (σ = 8mg). A sample of 50 Brand B drinks has mean caffeine 80mg (σ = 7mg). Test at α = 0.05 whether Brand A has more caffeine (right-tailed test).
Solution:
Given:
- n₁ = 50, $\overline{x}_1$ = 85mg, σ₁ = 8mg
- n₂ = 50, $\overline{x}_2$ = 80mg, σ₂ = 7mg
- α = 0.05 (right-tailed)
Standard error: $$SE = \sqrt{\frac{64}{50} + \frac{49}{50}} = \sqrt{1.28 + 0.98} = \sqrt{2.26} = 1.503$$
Z-statistic: $$z = \frac{85 - 80}{1.503} = \frac{5}{1.503} = 3.33$$
Critical value (right-tailed, α = 0.05): z = 1.645
Since z = 3.33 > 1.645, reject H₀
Conclusion: Brand A has significantly more caffeine than Brand B (p < 0.001).
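The same calculation can be checked in code. The sketch below assumes both population standard deviations are known, as the problem states; the helper name `two_sample_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2, tail="two"):
    """Return (z, p_value) for H0: mu1 = mu2 with known sigmas (illustrative helper)."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (mean1 - mean2) / standard_error
    std_normal = NormalDist()
    if tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "left":
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return z, p_value


# Caffeine example: Brand A (85 mg, sigma 8, n 50) vs. Brand B (80 mg, sigma 7, n 50)
z, p_value = two_sample_z_test(85, 80, sigma1=8, sigma2=7, n1=50, n2=50, tail="right")
print(f"z = {z:.2f}, right-tailed p-value = {p_value:.5f}")
```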
Section 3: Z-Test for Proportions
One-Sample Proportion Test
Purpose
Tests whether a population proportion differs from a hypothesized value.
Hypotheses:
- H₀: p = p₀
- H₁: p ≠ p₀ (or one-tailed)
Requirements
- np₀ ≥ 5
- n(1 - p₀) ≥ 5
- Random sample
Formula
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$
Where:
- $\hat{p}$ = Sample proportion (x/n)
- p₀ = Hypothesized population proportion
- n = Sample size
Example: One-Sample Proportion Test
Problem: A politician claims 50% of voters support their policy. In a sample of 200 voters, 115 support it. Test at α = 0.05 whether the true proportion differs from 50%.
Solution:
Given:
- n = 200, x = 115, $\hat{p}$ = 0.575, p₀ = 0.50, α = 0.05
Check requirements:
- np₀ = 200 × 0.50 = 100 ≥ 5 ✓
- n(1-p₀) = 200 × 0.50 = 100 ≥ 5 ✓
Standard error: $$SE = \sqrt{\frac{0.50 \times 0.50}{200}} = \sqrt{0.00125} = 0.0354$$
Z-statistic: $$z = \frac{0.575 - 0.50}{0.0354} = \frac{0.075}{0.0354} = 2.12$$
Critical values (two-tailed, α = 0.05): z = ±1.96
Since |z| = 2.12 > 1.96, reject H₀
Conclusion: The proportion supporting the policy is significantly different from 50% (p = 0.034).
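A short sketch of the one-sample proportion test, including the requirement checks, might look like the following. The function name `one_proportion_z_test` is hypothetical, and the threshold of 5 is the one used in the requirements above.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-tailed z-test for H0: p = p0 (illustrative helper)."""
    # The normal approximation requires enough expected successes and failures.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("need n*p0 >= 5 and n*(1 - p0) >= 5")
    p_hat = successes / n
    # The standard error uses p0, not p_hat, because it is computed under H0.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Voter example: 115 of 200 support the policy, hypothesized p0 = 0.50
z, p_value = one_proportion_z_test(successes=115, n=200, p0=0.50)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```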
Two-Sample Proportion Test
Purpose
Compares proportions between two independent populations.
Hypotheses:
- H₀: p₁ = p₂
- H₁: p₁ ≠ p₂
Formula
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Where: $$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$ (Pooled proportion)
Example: Two-Sample Proportion Test
Problem: Test whether treatment and control groups have different success rates.
- Treatment: 80 successes out of 150
- Control: 60 successes out of 150
- α = 0.05
Solution:
Given:
- $\hat{p}_1$ = 80/150 = 0.533, $\hat{p}_2$ = 60/150 = 0.400
Pooled proportion: $$\hat{p} = \frac{80 + 60}{150 + 150} = \frac{140}{300} = 0.467$$
Standard error: $$SE = \sqrt{0.467 \times 0.533 \times \left(\frac{1}{150} + \frac{1}{150}\right)} = \sqrt{0.2489 \times 0.01333} = 0.0576$$
Z-statistic: $$z = \frac{0.533 - 0.400}{0.0576} = \frac{0.133}{0.0576} = 2.31$$
Critical values (two-tailed): z = ±1.96
Since |z| = 2.31 > 1.96, reject H₀
Conclusion: The treatment group has a significantly higher success rate (p = 0.021).
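Because the hand calculation rounds intermediate values, it can help to recompute from the raw counts. The sketch below uses the pooled-proportion formula from this section; the function name `two_proportion_z_test` is hypothetical. Working from unrounded proportions, z comes out near 2.31.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed z-test for H0: p1 = p2 using the pooled proportion (illustrative)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Treatment: 80 of 150 successes; control: 60 of 150 successes
z, p_value = two_proportion_z_test(x1=80, n1=150, x2=60, n2=150)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```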
Section 4: Critical Values and Decision Rules
Understanding Z Critical Values
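A critical value is the threshold that separates the rejection region from the non-rejection region for the null hypothesis. Because H₀ is never "accepted", the region outside the rejection region is better described as the region where we fail to reject H₀.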
Common Critical Values
| Significance Level | One-Tailed (+ for right-tailed, − for left-tailed) | Two-Tailed |
|---|---|---|
| α = 0.10 | ±1.28 | ±1.645 |
| α = 0.05 | ±1.645 | ±1.96 |
| α = 0.01 | ±2.33 | ±2.576 |
Finding Critical Values
For One-Tailed Tests:
- Right-tailed: Find z such that P(Z > z) = α
- Left-tailed: Find z such that P(Z < z) = α
For Two-Tailed Tests:
- Find z such that P(Z > z) = α/2 (both tails)
- Critical values are ±z
Using Z-Score Tables
Standard normal tables provide cumulative probabilities P(Z ≤ z). To find critical values:
Example: Find critical value for α = 0.05 (two-tailed)
- α/2 = 0.025
- Find where P(Z ≤ z) = 1 - 0.025 = 0.975
- This corresponds to z ≈ 1.96
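Instead of a printed table, the inverse CDF of the standard normal distribution gives the same critical values. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library; the wrapper `z_critical` is a hypothetical name.

```python
from statistics import NormalDist


def z_critical(alpha, tail="two"):
    """Return the z critical value for a given alpha (illustrative helper).

    For a two-tailed test the positive value is returned; the pair is +/- that value.
    """
    std_normal = NormalDist()
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right', or 'left'")


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha}: one-tailed {z_critical(alpha, 'right'):.3f}, "
          f"two-tailed +/-{z_critical(alpha, 'two'):.3f}")
```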
Section 5: P-Values and Interpretation
What is a P-Value?
The p-value is the probability of observing test results as extreme or more extreme than what was actually observed, assuming the null hypothesis is true.
Interpretation:
- P-value ≤ α: Reject H₀ (Results are statistically significant)
- P-value > α: Fail to reject H₀ (Results are not statistically significant)
Calculating P-Values for Z-Tests
Two-Tailed Test: $$\text{P-value} = 2 \times P(Z > |z_{calculated}|)$$
Right-Tailed Test: $$\text{P-value} = P(Z > z_{calculated})$$
Left-Tailed Test: $$\text{P-value} = P(Z < z_{calculated})$$
Example P-Value Calculation
If z = 2.5 (two-tailed test):
- P(Z > 2.5) ≈ 0.00621
- P-value = 2 × 0.00621 = 0.01242
- At α = 0.05: Reject H₀
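The three p-value formulas can be compared side by side for the same statistic. This is a minimal sketch; `p_value_from_z` is a hypothetical helper built on the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist


def p_value_from_z(z, tail="two"):
    """Convert a z-statistic into a p-value for the chosen tail (illustrative)."""
    std_normal = NormalDist()
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "left":
        return std_normal.cdf(z)
    raise ValueError("tail must be 'two', 'right', or 'left'")


z = 2.5
p_two = p_value_from_z(z, "two")
p_right = p_value_from_z(z, "right")
p_left = p_value_from_z(z, "left")
print(f"two-tailed: {p_two:.5f}, right-tailed: {p_right:.5f}, left-tailed: {p_left:.5f}")
```

Note how the left-tailed p-value for a positive z is close to 1: the choice of tail must match the alternative hypothesis stated before looking at the data.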
Common Misconceptions
❌ Incorrect: The p-value is the probability that H₀ is true
✓ Correct: The p-value is the probability of data at least as extreme as those observed, given that H₀ is true
❌ Incorrect: A p-value of 0.08 means “almost significant”
✓ Correct: Compare the p-value against the predetermined α level (e.g., 0.05) and apply that threshold consistently
Section 6: Effect Size for Z-Tests
Why Report Effect Size?
Statistical significance (small p-value) doesn’t indicate practical importance. Effect size measures the magnitude of the difference.
Cohen’s d
For comparing means, Cohen’s d quantifies the standardized difference:
$$d = \frac{\overline{x} - \mu_0}{\sigma}$$
Interpretation (Cohen’s Guidelines):
- |d| ≈ 0.2: Small effect
- |d| ≈ 0.5: Medium effect
- |d| ≈ 0.8: Large effect
Example: Effect Size
From earlier example (cereal weights): $$d = \frac{364.5 - 368}{5} = \frac{-3.5}{5} = -0.70$$
This is a medium to large effect size, indicating practical significance beyond statistical significance.
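A small sketch makes the relationship between effect size and the test statistic explicit: since z = (x̄ − μ₀)/(σ/√n), it follows that z = d·√n, so the same effect size produces a larger |z| as n grows. The function name `cohens_d_one_sample` is hypothetical.

```python
from math import sqrt


def cohens_d_one_sample(sample_mean, mu0, sigma):
    """Standardized difference between the sample mean and mu0 (illustrative)."""
    return (sample_mean - mu0) / sigma


# Cereal example: sample mean 364.5 g, claimed mean 368 g, sigma = 5 g, n = 36
d = cohens_d_one_sample(364.5, 368, 5)
z = d * sqrt(36)  # z-statistic recovered from d and the sample size
print(f"d = {d:.2f}, z = d * sqrt(n) = {z:.2f}")
```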
Section 7: Assumptions and Conditions
Checking Assumptions
Before conducting a z-test, verify:
- Independence: Observations are independent
  - Random sampling or random assignment
  - Sample ≤ 10% of population (for finite populations)
- Normality: Population is normally distributed OR sample is large (n ≥ 30)
  - Check with Q-Q plot or Shapiro-Wilk test
  - Central Limit Theorem applies for large samples
- Known Variance: Population standard deviation is known
  - Rarely true in practice; use t-test if unknown
What if Assumptions are Violated?
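- Non-normal data + small sample: Use a non-parametric alternative (e.g., Wilcoxon signed-rank for one sample, Mann-Whitney U for two independent samples); the t-test also assumes approximate normality when n is small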
- Unknown variance: Use t-test
- Non-independent observations: Use paired t-test or other designs
Section 8: Practical Applications
Business: Quality Control
Scenario: A manufacturer wants to verify if a production line produces parts with correct dimensions (120mm target).
- Sample: 64 parts, mean = 119.8mm, σ = 1.2mm
- H₀: μ = 120
- H₁: μ ≠ 120
- Conclusion: Guide production adjustments
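The scenario does not state a significance level or carry out the calculation. As a rough sketch, assuming a two-tailed test at α = 0.05 (an assumption, not part of the scenario), the numbers work out as follows:

```python
from math import sqrt
from statistics import NormalDist

# Quality-control scenario: 64 parts, mean 119.8 mm, sigma 1.2 mm, target 120 mm
n, sample_mean, sigma, mu0 = 64, 119.8, 1.2, 120
alpha = 0.05  # assumed significance level; the scenario does not specify one

standard_error = sigma / sqrt(n)          # 1.2 / 8 = 0.15
z = (sample_mean - mu0) / standard_error  # about -1.33
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, p-value = {p_value:.3f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```

Under these assumptions the p-value exceeds 0.05, so the sample alone would not justify adjusting the line.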
Medicine: Clinical Trials
Scenario: Testing if a new medication reduces blood pressure differently than placebo.
- Treatment group: n₁ = 100, mean BP reduction = 12mmHg
- Placebo group: n₂ = 100, mean BP reduction = 8mmHg
- Conclusion: Determine drug efficacy
Marketing: Customer Satisfaction
Scenario: Testing if customer satisfaction improved after service changes.
- Before: 65% satisfied
- After: 72% satisfied (n = 500)
- Conclusion: Assess impact of changes
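The scenario does not say how the 65% baseline was measured. The sketch below assumes it can be treated as a fixed hypothesized value p₀, so that a one-sample proportion test applies, and assumes a right-tailed alternative (satisfaction improved). Both are assumptions for illustration; if the "before" figure were itself a sample estimate, a two-sample proportion test would be the better fit.

```python
from math import sqrt
from statistics import NormalDist

# Marketing scenario: baseline 65% satisfied (assumed fixed), 72% satisfied after, n = 500
p0, p_hat, n = 0.65, 0.72, 500

# Requirement check for the normal approximation
assert n * p0 >= 5 and n * (1 - p0) >= 5

standard_error = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / standard_error
p_value = 1 - NormalDist().cdf(z)  # right-tailed: H1 is p > p0

print(f"z = {z:.2f}, right-tailed p-value = {p_value:.5f}")
```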
Section 9: Common Mistakes and Pitfalls
Mistake 1: P-Hacking
Problem: Running multiple tests and reporting only significant results
Solution: Pre-specify hypotheses and significance level before testing
Mistake 2: Ignoring Effect Size
Problem: Focusing only on p-value without considering practical significance
Solution: Always report effect size (Cohen’s d) alongside test results
Mistake 3: Using Z-Test with Unknown Variance
Problem: Using z-test when population σ is unknown
Solution: Use t-test for unknown variance (more appropriate)
Mistake 4: Violating Independence Assumption
Problem: Using z-test on paired or dependent data
Solution: Use paired t-test for related samples
Mistake 5: Misinterpreting Non-Significance
Problem: Concluding “no difference exists” from p > 0.05
Solution: State “insufficient evidence” and consider sample size
Summary Table: Z-Test Selection
| Scenario | Test Type | Formula Key | Sample Size |
|---|---|---|---|
| One mean vs. target | One-sample z | z = (x̄ - μ)/SE | n ≥ 30 |
| Two means comparison | Two-sample z | z = (x̄₁ - x̄₂)/SE | n₁, n₂ ≥ 30 |
| One proportion vs. target | One-prop z | z = (p̂ - p₀)/SE | np₀ ≥ 5 and n(1 - p₀) ≥ 5 |
| Two proportions comparison | Two-prop z | z = (p̂₁ - p̂₂)/SE | Expected successes and failures ≥ 5 in each group |
Key Formulas Cheat Sheet
One-Sample Z-Test
$$z = \frac{\overline{x} - \mu_0}{\sigma/\sqrt{n}}$$
Two-Sample Z-Test
$$z = \frac{\overline{x}_1 - \overline{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
One-Sample Proportion Test
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$
Two-Sample Proportion Test
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Standard Error (Two Samples)
$$SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
Cohen’s d
$$d = \frac{\overline{x} - \mu_0}{\sigma}$$
Related Tests
Alternative Comparison Tests:
- T-Tests Complete Guide - For smaller samples or unknown population SD
- ANOVA - For comparing multiple group means
- Chi-Square Tests - For categorical data
Foundational Concepts:
- Hypothesis Testing Guide - Core concepts
- Z-Scores Utilities - Understanding z-score fundamentals
- P-Values - Interpreting significance
Related Distribution Concepts:
- Z-Score Tables - Lookup tables for probabilities
Next Steps
After mastering z-tests:
- T-Tests: Learn when population variance is unknown
- Effect Sizes & Power Analysis: Understand practical significance
- ANOVA: Compare more than two groups
- Non-Parametric Tests: Alternatives when assumptions are violated