PostgreSQL: Remove Specific Characters from String

Remove specific unwanted characters like dashes, brackets, or special symbols from your strings.

Quick Answer: Use REGEXP_REPLACE

To remove all occurrences of specific characters:

SELECT
    phone,
    REGEXP_REPLACE(phone, '[()-]', '', 'g') as digits_only,
    REGEXP_REPLACE(phone, '[^0-9]', '', 'g') as phone_clean
FROM employees
WHERE phone IS NOT NULL
LIMIT 5;

Returns:

phone          | digits_only | phone_clean
---------------+-------------+-------------
+1-(555)-1234  | +15551234   | 5551234
(555) 123-4567 | 5551234567  | 5551234567

The [()-] matches any of those characters. The 'g' flag means global (all occurrences).

How It Works

REGEXP_REPLACE uses regex patterns to find and remove characters. The pattern [()-] means “any character in this set.” The empty string as replacement effectively deletes them. The 'g' flag ensures ALL occurrences are removed, not just the first.

Different Approaches & When to Use Them

Approach 1: Specific Characters (Best for Known Set)

SELECT
    text,
    REGEXP_REPLACE(text, '[{}()\[\]]', '', 'g') as remove_brackets
FROM strings;

Good when you know exactly what to remove.

Best for: Cleaning specific known characters.

Approach 2: Keep Only What You Want

SELECT
    text,
    REGEXP_REPLACE(text, '[^a-zA-Z0-9]', '', 'g') as alphanumeric_only,
    REGEXP_REPLACE(text, '[^0-9]', '', 'g') as digits_only
FROM strings;

Using negated character class [^...] to keep only specific types.

Best for: Extract digits or alphanumeric characters.

Approach 3: Chained REPLACE Calls

SELECT
    text,
    REPLACE(REPLACE(REPLACE(text, '(', ''), ')', ''), '-', '') as no_symbols
FROM strings;

Simple for a few fixed characters.

Best for: Just 1-3 characters to remove.

Approach 4: Remove Whitespace

SELECT
    text,
    REGEXP_REPLACE(text, '\s+', '', 'g') as no_spaces,
    REGEXP_REPLACE(text, '^\s+|\s+$', '', 'g') as trim_edges_only
FROM strings;

\s+ matches any whitespace (spaces, tabs, newlines).

Approach 5: Remove Duplicate Characters

SELECT
    text,
    REGEXP_REPLACE(text, '(.)\1+', '\1', 'g') as no_duplicates
FROM strings;

Removes runs of duplicate characters.

Best for: Cleaning up “sstttuufffering” → “stufering”.

Real-World Example: Normalize Phone Numbers

Extract just digits from various phone formats:

SELECT
    customer_id,
    name,
    phone,
    REGEXP_REPLACE(phone, '[^0-9]', '', 'g') as digits_only,
    CONCAT(
        SUBSTRING(REGEXP_REPLACE(phone, '[^0-9]', '', 'g'), 1, 3),
        '-',
        SUBSTRING(REGEXP_REPLACE(phone, '[^0-9]', '', 'g'), 4, 3),
        '-',
        SUBSTRING(REGEXP_REPLACE(phone, '[^0-9]', '', 'g'), 7)
    ) as normalized
FROM customers
WHERE phone IS NOT NULL;

Update Columns

Clean data permanently:

UPDATE customers
SET phone = REGEXP_REPLACE(phone, '[^0-9+]', '', 'g')
WHERE phone IS NOT NULL;

SELECT customer_id, phone FROM customers LIMIT 5;

Performance Notes

Method Speed Best For
REGEXP_REPLACE Fast Pattern-based removal
Chained REPLACE Very Fast Few specific chars
Negated pattern Fast Keep specific types

REGEXP_REPLACE is highly optimized.

Common Patterns

-- Remove all punctuation
REGEXP_REPLACE(text, '[^\w\s]', '', 'g')

-- Remove all numbers
REGEXP_REPLACE(text, '[0-9]', '', 'g')

-- Remove all letters
REGEXP_REPLACE(text, '[a-zA-Z]', '', 'g')

-- Remove non-printable characters
REGEXP_REPLACE(text, '[\x00-\x1F]', '', 'g')

-- Remove leading/trailing slashes
REGEXP_REPLACE(text, '^/|/$', '', 'g')

-- Remove HTML tags
REGEXP_REPLACE(text, '<[^>]+>', '', 'g')

-- Keep only word characters
REGEXP_REPLACE(text, '[^\w]', '', 'g')

FAQ

Q: What’s the difference between REPLACE and REGEXP_REPLACE? A: REPLACE matches exact strings. REGEXP_REPLACE uses regex patterns. Use REPLACE for simple cases, REGEXP_REPLACE for complex patterns.

Q: How do I escape special regex characters? A: Use backslash: \. for literal period, \( for literal parenthesis.

Q: Can I remove case-insensitive? A: Yes, add ‘i’ flag: REGEXP_REPLACE(text, 'pattern', '', 'gi').

Q: What does the ‘g’ flag mean? A: Global—replace all occurrences. Without it, only the first match is replaced.

Q: Performance for very large strings? A: Still fast. PostgreSQL optimizes REGEXP_REPLACE well. No concerns unless processing millions of rows.

Q: How do I keep only digits? A: REGEXP_REPLACE(text, '[^0-9]', '', 'g') keeps only 0-9.

Wrapping Up

Remove specific characters using REGEXP_REPLACE with patterns. For known character sets, use simple patterns like [()-]. For categories (digits, letters, whitespace), use character classes like [^0-9].

The key: patterns are powerful and flexible. Master the common ones and you’ll clean any data format.