PostgreSQL: Remove Specific Characters from String
Remove specific unwanted characters like dashes, brackets, or special symbols from your strings.
Quick Answer: Use REGEXP_REPLACE
To remove all occurrences of specific characters:
SELECT
phone,
REGEXP_REPLACE(phone, '[()-]', '', 'g') as digits_only,
REGEXP_REPLACE(phone, '[^0-9]', '', 'g') as phone_clean
FROM employees
WHERE phone IS NOT NULL
LIMIT 5;
Returns:
phone | digits_only | phone_clean
---------------+-------------+-------------
+1-(555)-1234 | +15551234 | 5551234
(555) 123-4567 | 5551234567 | 5551234567
The [()-] matches any of those characters. The 'g' flag means global (all occurrences).
How It Works
REGEXP_REPLACE uses regex patterns to find and remove characters. The pattern [()-] means “any character in this set.” The empty string as replacement effectively deletes them. The 'g' flag ensures ALL occurrences are removed, not just the first.
Different Approaches & When to Use Them
Approach 1: Specific Characters (Best for Known Set)
SELECT
text,
REGEXP_REPLACE(text, '[{}()\[\]]', '', 'g') as remove_brackets
FROM strings;
Good when you know exactly what to remove.
Best for: Cleaning specific known characters.
Approach 2: Keep Only What You Want
SELECT
text,
REGEXP_REPLACE(text, '[^a-zA-Z0-9]', '', 'g') as alphanumeric_only,
REGEXP_REPLACE(text, '[^0-9]', '', 'g') as digits_only
FROM strings;
Using negated character class [^...] to keep only specific types.
Best for: Extract digits or alphanumeric characters.
Approach 3: Chained REPLACE Calls
SELECT
text,
REPLACE(REPLACE(REPLACE(text, '(', ''), ')', ''), '-', '') as no_symbols
FROM strings;
Simple for a few fixed characters.
Best for: Just 1-3 characters to remove.
Approach 4: Remove Whitespace
SELECT
text,
REGEXP_REPLACE(text, '\s+', '', 'g') as no_spaces,
REGEXP_REPLACE(text, '^\s+|\s+$', '', 'g') as trim_edges_only
FROM strings;
\s+ matches any whitespace (spaces, tabs, newlines).
Approach 5: Remove Duplicate Characters
SELECT
text,
REGEXP_REPLACE(text, '(.)\1+', '\1', 'g') as no_duplicates
FROM strings;
Removes runs of duplicate characters.
Best for: Cleaning up “sstttuufffering” → “stufering”.
Real-World Example: Normalize Phone Numbers
Extract just digits from various phone formats:
SELECT
customer_id,
name,
phone,
REGEXP_REPLACE(phone, '[^0-9]', '', 'g') as digits_only,
CONCAT(
SUBSTRING(REGEXP_REPLACE(phone, '[^0-9]', '', 'g'), 1, 3),
'-',
SUBSTRING(REGEXP_REPLACE(phone, '[^0-9]', '', 'g'), 4, 3),
'-',
SUBSTRING(REGEXP_REPLACE(phone, '[^0-9]', '', 'g'), 7)
) as normalized
FROM customers
WHERE phone IS NOT NULL;
Update Columns
Clean data permanently:
UPDATE customers
SET phone = REGEXP_REPLACE(phone, '[^0-9+]', '', 'g')
WHERE phone IS NOT NULL;
SELECT customer_id, phone FROM customers LIMIT 5;
Performance Notes
| Method | Speed | Best For |
|---|---|---|
| REGEXP_REPLACE | Fast | Pattern-based removal |
| Chained REPLACE | Very Fast | Few specific chars |
| Negated pattern | Fast | Keep specific types |
REGEXP_REPLACE is highly optimized.
Common Patterns
-- Remove all punctuation
REGEXP_REPLACE(text, '[^\w\s]', '', 'g')
-- Remove all numbers
REGEXP_REPLACE(text, '[0-9]', '', 'g')
-- Remove all letters
REGEXP_REPLACE(text, '[a-zA-Z]', '', 'g')
-- Remove non-printable characters
REGEXP_REPLACE(text, '[\x00-\x1F]', '', 'g')
-- Remove leading/trailing slashes
REGEXP_REPLACE(text, '^/|/$', '', 'g')
-- Remove HTML tags
REGEXP_REPLACE(text, '<[^>]+>', '', 'g')
-- Keep only word characters
REGEXP_REPLACE(text, '[^\w]', '', 'g')
FAQ
Q: What’s the difference between REPLACE and REGEXP_REPLACE? A: REPLACE matches exact strings. REGEXP_REPLACE uses regex patterns. Use REPLACE for simple cases, REGEXP_REPLACE for complex patterns.
Q: How do I escape special regex characters?
A: Use backslash: \. for literal period, \( for literal parenthesis.
Q: Can I remove case-insensitive?
A: Yes, add ‘i’ flag: REGEXP_REPLACE(text, 'pattern', '', 'gi').
Q: What does the ‘g’ flag mean? A: Global—replace all occurrences. Without it, only the first match is replaced.
Q: Performance for very large strings? A: Still fast. PostgreSQL optimizes REGEXP_REPLACE well. No concerns unless processing millions of rows.
Q: How do I keep only digits?
A: REGEXP_REPLACE(text, '[^0-9]', '', 'g') keeps only 0-9.
Wrapping Up
Remove specific characters using REGEXP_REPLACE with patterns. For known character sets, use simple patterns like [()-]. For categories (digits, letters, whitespace), use character classes like [^0-9].
The key: patterns are powerful and flexible. Master the common ones and you’ll clean any data format.
Related Articles
- Remove Substring - Remove text patterns
- Remove Last Character - Simple character removal
- PostgreSQL String Functions - Complete string guide