PostgreSQL: Select Every Nth Row
When you’re working with large datasets, sometimes you need to sample data—taking every 5th row, every 10th row, or every nth row for analysis, testing, or reporting. Maybe you’re selecting 10% of transactions for audit review, sampling customer records for surveys, or picking key points from time series data for visualization. It’s more efficient than loading all data and filtering in application code.
PostgreSQL makes this straightforward with window functions and modulo arithmetic. You assign row numbers, then use modulo to find every nth row. Once you understand this pattern, you’ll use it for all kinds of sampling scenarios.
The Problem
You have 30 employees and want to select every 5th row (rows 5, 10, 15, 20, 25, 30).
Solution 1: Using MOD with Row Number
SELECT
emp_id,
first_name,
last_name,
salary
FROM (
SELECT
emp_id,
first_name,
last_name,
salary,
ROW_NUMBER() OVER (ORDER BY emp_id) as rn
FROM employees
) subquery
WHERE MOD(rn, 5) = 0
ORDER BY emp_id;
Result:
emp_id | first_name | last_name | salary
-------+------------+-----------+--------
5 | Michael | Davis | 70000
10 | Charles | Anderson | 92000
15 | Thomas | Harris | 82000
20 | Christopher| Lewis | 78000
25 | Nancy | Walker | 75000
30 | Unknown | Unknown | 72000
How It Works
MOD(rn, 5) = 0 means: rows where the row number is evenly divisible by 5 (remainder of 0). This selects rows 5, 10, 15, 20, 25, 30.
Change the divisor to sample differently:
- MOD(rn, 2) = 0 → every 2nd row (all even-numbered rows)
- MOD(rn, 3) = 0 → every 3rd row
- MOD(rn, 10) = 0 → every 10th row
Solution 2: Using FILTER for Different Starting Points
Select every nth row starting from a specific offset:
-- Select every 5th row starting from row 1 (1, 6, 11, 16, 21, 26...)
SELECT
emp_id,
first_name,
salary
FROM (
SELECT
emp_id,
first_name,
salary,
ROW_NUMBER() OVER (ORDER BY emp_id) as rn
FROM employees
) subquery
WHERE MOD(rn - 1, 5) = 0
ORDER BY emp_id;
MOD(rn - 1, 5) = 0 shifts the pattern by 1, selecting rows 1, 6, 11, 16, 21, 26.
Solution 3: Sampling with LIMIT and OFFSET
Sample approximately every nth row without gaps:
-- Get 6 rows evenly distributed from 30 total rows
SELECT
emp_id,
first_name,
salary
FROM employees
ORDER BY RANDOM()
LIMIT 6;
This randomly samples 6 employees (approximately every 5th row). Use RANDOM() for statistical sampling.
Real-World Example: Department Sampling
Select every 3rd employee from the Engineering department:
SELECT
emp_id,
first_name,
last_name,
salary,
dept_name
FROM (
SELECT
e.emp_id,
e.first_name,
e.last_name,
e.salary,
d.dept_name,
ROW_NUMBER() OVER (PARTITION BY d.dept_name ORDER BY e.emp_id) as dept_rn
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
WHERE d.dept_name = 'Engineering'
) subquery
WHERE MOD(dept_rn, 3) = 0
ORDER BY emp_id;
PARTITION BY dept_name resets the row count for each department, so “every 3rd” applies per department, not globally.
Performance Tips
- Use ROW_NUMBER() for consistent ordering
- Add INDEX on ORDER BY column for faster row numbering
- Use PARTITION BY to sample within groups
- MOD is fast - minimal performance overhead
Common Variations
-- Every nth row starting from row 1
WHERE MOD(rn - 1, n) = 0
-- Every nth row starting from row 0
WHERE MOD(rn, n) = 0
-- Every nth row in each group
WHERE MOD(rn, n) = 0 AND PARTITION BY group_column
FAQ
Q: Which method is fastest? A: MOD with ROW_NUMBER is fastest for large datasets.
Q: Can I sample across multiple groups? A: Yes, use PARTITION BY in the OVER clause to reset counting per group.
Q: What if I want random sampling? A: Use ORDER BY RANDOM() LIMIT n instead of MOD.
Related Articles
- PostgreSQL Row Selection - Master query filtering
- Window Functions - ROW_NUMBER and advanced techniques
- GROUP BY and Aggregation - Sampling within groups