PostgreSQL WHERE Clause: Advanced Filtering & Pattern Matching
The WHERE clause is where you ask specific questions of your data. Sure, you can filter by a single condition, but the real power emerges when you combine multiple conditions, use pattern matching, handle NULL values correctly, and build sophisticated logical structures. I’ve seen queries that pull exactly the right data in milliseconds, and others that do the same work but take minutes—the difference is almost always in the WHERE clause.
Mastering WHERE transforms you from someone who can extract data to someone who can ask nuanced, precise questions and get exactly what you need. It’s the difference between “show me all customers” and “show me active customers from California who spent more than $1000 last year and haven’t received marketing emails in the past month.”
What You’ll Learn
This guide dives deeper into WHERE than the SELECT complete guide:
- Complex pattern matching with LIKE and wildcards
- Handling NULL values correctly
- Building sophisticated AND/OR logic
- Using BETWEEN and IN effectively
- Common WHERE mistakes and how to avoid them
Review: Basic WHERE
A quick reminder—basic WHERE uses comparison operators:
SELECT first_name, salary FROM employees WHERE salary > 80000;
This returns employees earning more than $80,000. But WHERE can do much more.
Pattern Matching with LIKE
LIKE finds values matching a pattern. It’s powerful for searching text.
LIKE with % Wildcard
The % means “any characters” (zero or more).
Find first names starting with “J”:
SELECT first_name FROM employees WHERE first_name LIKE 'J%';
Result:
first_name
------------
James
Jennifer
Joseph
Jessica
(4 rows)
Find first names ending with “y”:
SELECT first_name FROM employees WHERE first_name LIKE '%y';
Result:
first_name
------------
Mary
Patricia
Ashley
(3 rows)
Find first names containing “ar”:
SELECT first_name FROM employees WHERE first_name LIKE '%ar%';
Result:
first_name
------------
Barbara
Margaret
(2 rows)
LIKE with _ Wildcard
The _ matches exactly one character.
Find 5-letter first names starting with “J”:
SELECT first_name FROM employees WHERE first_name LIKE 'J____';
Result:
first_name
------------
James
(1 row)
The four underscores represent exactly 4 characters after “J”.
ILIKE for Case-Insensitive Matching
LIKE is case-sensitive by default (on most systems). Use ILIKE for case-insensitive:
SELECT first_name FROM employees WHERE first_name ILIKE 'james';
Result (case-insensitive):
first_name
------------
James
(1 row)
Without ILIKE, this wouldn’t match “James” (capital J).
LIKE Performance Note
LIKE with % at the beginning is slow:
-- Slow: has to check every row
WHERE first_name LIKE '%ar%'
LIKE with % only at the end is faster:
-- Faster: can use an index
WHERE first_name LIKE 'J%'
For production queries, consider full-text search instead of LIKE. We cover this in advanced guides.
Handling NULL Values
NULL in SQL means “unknown” or “missing,” not zero or empty string.
IS NULL vs. = NULL
This is a critical point. NULL doesn’t equal anything, including itself:
-- This doesn't work (always returns no rows)
SELECT * FROM employees WHERE manager_id = NULL;
-- This works (returns rows where manager_id is NULL)
SELECT * FROM employees WHERE manager_id IS NULL;
IS NOT NULL
Find employees with a manager assigned:
SELECT first_name, manager_id FROM departments
WHERE manager_id IS NOT NULL;
Result:
manager_id
------------
1
7
15
21
(4 rows)
NULL in Comparisons
When you compare with NULL, the result is unknown:
SELECT NULL = NULL; -- Returns NULL, not true
SELECT NULL > 5; -- Returns NULL, not false
If a column contains NULL and you use WHERE column = value, those NULL rows are excluded automatically.
SELECT COUNT(*) FROM employees WHERE dept_id = 2;
-- Returns 8 (doesn't count rows where dept_id IS NULL)
If you want to include NULL rows, use OR:
SELECT COUNT(*) FROM employees WHERE dept_id = 2 OR dept_id IS NULL;
Complex AND/OR Logic
Real-world queries often have multiple conditions.
AND (All Conditions Must Be True)
Find senior engineers (engineering dept AND hired before 2020):
SELECT first_name, dept_id, hire_date FROM employees
WHERE dept_id = 2 AND EXTRACT(YEAR FROM hire_date) < 2020;
Result:
first_name | dept_id | hire_date
------------+---------+------------
David | 2 | 2018-06-11
Jennifer | 2 | 2019-08-05
(2 rows)
Both conditions must be true for a row to appear.
OR (At Least One Condition Must Be True)
Find employees earning over $100k OR in engineering:
SELECT first_name, salary, dept_id FROM employees
WHERE salary > 100000 OR dept_id = 2;
Result:
first_name | salary | dept_id
------------+---------+---------
David | 105000 | 2
Jennifer | 102000 | 2
Richard | 98000 | 2
Barbara | 95000 | 2
Charles | 92000 | 2
Susan | 99000 | 2
Joseph | 96000 | 2
(7 rows)
You get anyone who satisfies at least one condition.
Combining AND with OR
Be careful with complex logic. Parentheses clarify intent:
-- Without parentheses: ambiguous
SELECT * FROM employees
WHERE dept_id = 1 AND salary > 75000 OR dept_id = 2;
-- With parentheses: clear
SELECT * FROM employees
WHERE (dept_id = 1 AND salary > 75000) OR dept_id = 2;
The second version means: “(Sales employees earning over $75k) OR (all engineering employees)”.
Without parentheses, PostgreSQL interprets it the same way (AND has higher precedence than OR), but parentheses make your intent obvious.
Using BETWEEN for Ranges
BETWEEN is cleaner than multiple conditions:
-- Using BETWEEN
SELECT first_name, salary FROM employees
WHERE salary BETWEEN 70000 AND 80000;
-- Equivalent with AND
SELECT first_name, salary FROM employees
WHERE salary >= 70000 AND salary <= 80000;
Both return the same results:
first_name | salary
------------+----------
James | 75000.00
Mary | 78000.00
Robert | 72000.00
Patricia | 76000.00
Linda | 73000.00
(5 rows)
BETWEEN is inclusive on both ends (includes 70000 and 80000).
BETWEEN with Dates
Find employees hired in 2021:
SELECT first_name, hire_date FROM employees
WHERE hire_date BETWEEN '2021-01-01' AND '2021-12-31';
Result:
first_name | hire_date
------------+------------
Robert | 2021-05-10
Richard | 2020-11-17
Barbara | 2021-01-25
Anthony | 2021-02-28
Daniel | 2021-10-05
Christopher| 2021-07-19
(6 rows)
Using IN for Multiple Values
IN is cleaner than multiple OR conditions:
-- Using IN
SELECT first_name, dept_id FROM employees
WHERE dept_id IN (1, 2);
-- Equivalent with OR
SELECT first_name, dept_id FROM employees
WHERE dept_id = 1 OR dept_id = 2;
Both return 14 rows (all Sales and Engineering employees).
IN with Subqueries
You can also use IN with a subquery:
SELECT first_name FROM employees
WHERE dept_id IN (SELECT dept_id FROM departments WHERE location = 'San Francisco');
This finds employees in departments located in San Francisco.
NOT IN
Find employees NOT in departments 1 or 2:
SELECT first_name, dept_id FROM employees
WHERE dept_id NOT IN (1, 2);
Result:
first_name | dept_id
------------+---------
Thomas | 3
Karen | 3
Christopher| 3
Nancy | 3
Daniel | 3
Lisa | 3
Matthew | 4
Betty | 4
Anthony | 4
Margaret | 4
Mark | 4
Sandra | 4
Steven | 4
Ashley | 4
Paul | 4
(15 rows)
Common Errors & Solutions
Error: “You must use single quotes for strings”
Problem:
WHERE first_name = "James"
Double quotes are for identifiers (column names), single quotes for values.
Solution:
WHERE first_name = 'James'
Error: “LIKE requires %”
Problem:
WHERE first_name LIKE 'J'
This searches for exactly “J”, not names starting with J.
Solution:
WHERE first_name LIKE 'J%'
Error: “IN requires values in parentheses”
Problem:
WHERE dept_id IN 1, 2, 3
Solution:
WHERE dept_id IN (1, 2, 3)
Error: “NULL appears in NOT IN results”
Problem:
SELECT * FROM employees WHERE dept_id NOT IN (1, 2);
If any value in the IN list is NULL, NOT IN returns no rows (because NULL makes the comparison unknown).
Solution:
Use NOT IN with explicit NULL handling, or use a subquery with NOT IN… IS NOT NULL:
SELECT * FROM employees
WHERE dept_id NOT IN (1, 2) AND dept_id IS NOT NULL;
Performance Tips
1. Put Conditions on Indexed Columns
If dept_id has an index, use it:
-- Faster: uses index
WHERE dept_id = 2 AND salary > 80000
-- Slower: salary may not be indexed
WHERE salary > 80000 AND dept_id = 2
Actually, PostgreSQL’s query planner reorders conditions, so both are equally fast. But it’s good practice to think about indexes.
2. Use BETWEEN Instead of OR for Ranges
-- Better
WHERE salary BETWEEN 70000 AND 80000
-- Slower (potentially uses less efficient plans)
WHERE salary >= 70000 OR salary <= 80000
3. Avoid LIKE with % at the Start
-- Slow: full table scan
WHERE first_name LIKE '%ar%'
-- Faster: can use index
WHERE first_name LIKE 'J%'
4. Use Exact Matches When Possible
-- Fastest
WHERE dept_id = 2
-- Slower
WHERE dept_id IN (1, 2, 3, 4)
FAQ
Q: What’s the difference between LIKE and ILIKE? A: LIKE is case-sensitive; ILIKE is case-insensitive. ‘james’ LIKE ‘JAMES’ is false. ‘james’ ILIKE ‘JAMES’ is true.
Q: Can I use AND and OR together? A: Yes, but use parentheses for clarity:
WHERE (dept_id = 1 OR dept_id = 2) AND salary > 75000
Q: What does NULL AND FALSE return? A: NULL. In SQL, unknown (NULL) AND false is unknown, not false.
Q: Is BETWEEN inclusive? A: Yes, on both ends. BETWEEN 1 AND 10 includes 1 and 10.
Q: Can I use LIKE with numbers? A: Yes, but it treats them as text:
WHERE salary::TEXT LIKE '100%'
This finds salaries starting with 100 (100000, 100500, etc.).
Q: Why doesn’t WHERE column = NULL work?
A: Because NULL represents an unknown value. Unknown can’t equal anything. Use IS NULL.
Next Steps
You now understand WHERE clause filtering deeply. The next logical steps:
- DISTINCT and GROUP BY Explained - Remove duplicates and summarize data
- PostgreSQL JOINs Complete Guide - Combine data from multiple tables
- Subqueries in SELECT - Nest queries for complex logic
The WHERE clause is the foundation of good SQL. Master it, and everything else becomes easier!