PostgreSQL Arrays: Complete Guide to Array Types

PostgreSQL arrays store multiple values in a single column. They’re useful for semi-structured data like skill lists, tag collections, or related IDs. But querying arrays requires specific functions - ANY(), unnest(), array_agg().

This guide shows how to work with arrays effectively.

What You’ll Learn

This guide covers:

  • Creating arrays (ARRAY[], ARRAY constructor)
  • Accessing array elements by index
  • Searching with ANY() and contains
  • Unnesting arrays to rows
  • Aggregating to arrays
  • Performance considerations
  • When to use arrays vs. normalization

Creating Arrays

ARRAY Literal Syntax

CREATE TABLE employee_skills (
    emp_id INTEGER,
    first_name VARCHAR(50),
    skills TEXT[]
);

INSERT INTO employee_skills VALUES
(1, 'James', ARRAY['Python', 'SQL', 'JavaScript']),
(2, 'Mary', ARRAY['Java', 'Spring', 'Kubernetes']),
(3, 'David', ARRAY['PostgreSQL', 'Linux', 'Docker', 'AWS']),
(4, 'Jennifer', ARRAY['Python', 'Machine Learning', 'TensorFlow']);

SELECT * FROM employee_skills;

Result:

emp_id | first_name |                    skills
-------+------------+----------------------------------------------
      1 | James      | {Python,SQL,JavaScript}
      2 | Mary       | {Java,Spring,Kubernetes}
      3 | David      | {PostgreSQL,Linux,Docker,AWS}
      4 | Jennifer   | {Python,Machine Learning,TensorFlow}

Use ARRAY[...] to create array literals. The data type is TEXT[].

ARRAY() Constructor

Convert query results to arrays.

SELECT
    emp_id,
    first_name,
    ARRAY(SELECT DISTINCT department
          FROM employee_history
          WHERE emp_id = employees.emp_id) as departments_worked
FROM employees
WHERE dept_id = 2
LIMIT 2;

Constructs an array from subquery results.

String Splitting to Array

Convert comma-separated strings to arrays.

SELECT
    'Python,SQL,PostgreSQL' as skills_string,
    STRING_TO_ARRAY('Python,SQL,PostgreSQL', ',') as skills_array,
    ARRAY_LENGTH(STRING_TO_ARRAY('Python,SQL,PostgreSQL', ','), 1) as skill_count;

Result:

       skills_string      |      skills_array      | skill_count
--------------------------+------------------------+-------------
Python,SQL,PostgreSQL     | {Python,SQL,PostgreSQL}|           3

Accessing Array Elements

Index-Based Access

Arrays are 1-indexed in PostgreSQL (not 0-indexed like most languages).

SELECT
    emp_id,
    first_name,
    skills,
    skills[1] as first_skill,
    skills[2] as second_skill,
    skills[3] as third_skill
FROM employee_skills
LIMIT 3;

Result:

emp_id | first_name | skills                | first_skill | second_skill | third_skill
-------+------------+---------------------+-------------+--------------+-------------
      1 | James      | {Python,SQL,Java}   | Python      | SQL          | Java
      2 | Mary       | {Java,Spring,Kube}  | Java        | Spring       | Kube
      3 | David      | {Postgres,Linux,..} | Postgres    | Linux        | (null)

When accessing out-of-bounds index, you get NULL.

Array Slicing

Get a range of elements.

SELECT
    first_name,
    skills,
    skills[1:2] as first_two_skills,
    skills[2:4] as middle_skills,
    skills[3:] as from_third_onward
FROM employee_skills
LIMIT 2;

Result:

first_name | skills          | first_two_skills | middle_skills | from_third_onward
----------+----------------+-----------------+--------------+-----------------
James      | {Python,SQL,JS}| {Python,SQL}    | {SQL,JS}     | {JS}
Mary       | {Java,Spring,K}| {Java,Spring}   | {Spring,K}   | {K}

Syntax: array[start:end] (inclusive on both ends).

Searching Arrays

ANY() for Membership

Check if a value exists in the array.

-- Find employees who know Python
SELECT
    emp_id,
    first_name,
    skills
FROM employee_skills
WHERE 'Python' = ANY(skills);

Result:

emp_id | first_name | skills
-------+------------+----------------------------------
      1 | James      | {Python,SQL,JavaScript}
      4 | Jennifer   | {Python,Machine Learning,...}

<@ and @> Operators

Check array containment.

-- <@ checks if array is contained in another
SELECT
    first_name,
    skills,
    ARRAY['Python', 'SQL'] <@ skills as knows_python_and_sql
FROM employee_skills;

-- @> checks if array contains another
SELECT
    first_name,
    skills,
    skills @> ARRAY['Python'] as knows_python,
    skills @> ARRAY['Java', 'Spring'] as knows_java_stack
FROM employee_skills;

Result:

first_name | knows_python | knows_java_stack
----------+--------------+------------------
James      | t            | f
Mary       | f            | t
David      | f            | f
Jennifer   | t            | f

&& Operator (Overlaps)

Check if two arrays have any common elements.

SELECT
    first_name,
    skills,
    skills && ARRAY['Python', 'Java'] as knows_python_or_java
FROM employee_skills;

Result:

first_name | skills              | knows_python_or_java
----------+-------------------+---------------------
James      | {Python,SQL,JS}    | t
Mary       | {Java,Spring}      | t
David      | {Postgres,Linux}   | f
Jennifer   | {Python,ML,Tensor} | t

Array Functions

ARRAY_LENGTH

Get array dimensions.

SELECT
    first_name,
    skills,
    ARRAY_LENGTH(skills, 1) as num_skills
FROM employee_skills
ORDER BY num_skills DESC;

Result:

first_name | skills                | num_skills
----------+---------------------+----------
David      | {Postgres,Linux,..} |         4
James      | {Python,SQL,JS}     |         3
Mary       | {Java,Spring,K}     |         3
Jennifer   | {Python,ML,Tensor}  |         3

ARRAY_APPEND and ARRAY_PREPEND

Add elements to arrays.

SELECT
    first_name,
    skills,
    ARRAY_APPEND(skills, 'Docker') as with_docker,
    ARRAY_PREPEND('Bash', skills) as with_bash_first
FROM employee_skills
LIMIT 2;

Result:

first_name | skills        | with_docker           | with_bash_first
----------+---------------+----------------------+----------------
James      | {Python,SQL}  | {Python,SQL,Docker}   | {Bash,Python,SQL}
Mary       | {Java,Spring} | {Java,Spring,Docker}  | {Bash,Java,Spring}

ARRAY_REMOVE

Remove elements from arrays.

SELECT
    first_name,
    skills,
    ARRAY_REMOVE(skills, 'SQL') as without_sql
FROM employee_skills
WHERE 'SQL' = ANY(skills);

ARRAY_AGG: Aggregate to Array

Collect values into an array.

SELECT
    d.dept_name,
    ARRAY_AGG(e.first_name) as employees,
    ARRAY_AGG(DISTINCT e.first_name) as unique_employees,
    ARRAY_AGG(e.salary ORDER BY e.salary DESC) as salaries_high_to_low
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
GROUP BY d.dept_id, d.dept_name;

Result:

   dept_name    |        employees         |        salaries_high_to_low
-----------------+------------------------+-------------------------------
Sales           | {James,Mary,Robert,...} | {78000,75000,72000,...}
Engineering     | {David,Jennifer,...}    | {105000,102000,98000,...}

Unnesting Arrays to Rows

UNNEST()

Expand array into rows.

SELECT
    emp_id,
    first_name,
    UNNEST(skills) as skill
FROM employee_skills;

Result:

emp_id | first_name | skill
-------+------------+------------------
      1 | James      | Python
      1 | James      | SQL
      1 | James      | JavaScript
      2 | Mary       | Java
      2 | Mary       | Spring
      2 | Mary       | Kubernetes

One row per skill. Used to convert array data to normalized form for analysis.

UNNEST with Position

Get both value and position.

SELECT
    emp_id,
    first_name,
    pos,
    skill
FROM employee_skills,
UNNEST(skills) WITH ORDINALITY AS t(skill, pos)
LIMIT 6;

Result:

emp_id | first_name | pos | skill
-------+------------+-----+------------------
      1 | James      |  1  | Python
      1 | James      |  2  | SQL
      1 | James      |  3  | JavaScript
      2 | Mary       |  1  | Java
      2 | Mary       |  2  | Spring
      2 | Mary       |  3  | Kubernetes

The WITH ORDINALITY clause adds position numbers.

Join on Unnested Array

Find which departments need Python training.

SELECT
    DISTINCT d.dept_name,
    UNNEST(es.skills) as required_skill
FROM employee_skills es
JOIN employees e ON es.emp_id = e.emp_id
JOIN departments d ON e.dept_id = d.dept_id
WHERE UNNEST(es.skills) = 'Python'
ORDER BY d.dept_name;

This finds departments where employees know Python.

Real-World Array Patterns

Store tags as arrays, search across them.

CREATE TABLE articles (
    article_id INTEGER,
    title VARCHAR(255),
    tags TEXT[]
);

INSERT INTO articles VALUES
(1, 'PostgreSQL Performance Tips', ARRAY['postgres', 'performance', 'database']),
(2, 'Arrays in PostgreSQL', ARRAY['postgres', 'arrays', 'tips']),
(3, 'JSON Best Practices', ARRAY['json', 'postgres', 'bestpractices']);

-- Articles tagged with 'postgres' AND 'tips'
SELECT
    article_id,
    title,
    tags
FROM articles
WHERE tags @> ARRAY['postgres', 'tips'];

-- Articles tagged with 'postgres' OR 'json'
SELECT
    article_id,
    title,
    tags
FROM articles
WHERE tags && ARRAY['postgres', 'json'];

String Aggregation

Combine multiple rows into comma-separated list.

SELECT
    d.dept_name,
    STRING_AGG(e.first_name, ', ') as employee_list,
    ARRAY_AGG(e.first_name) as employee_array
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
GROUP BY d.dept_id, d.dept_name;

Result:

   dept_name    |       employee_list        |       employee_array
-----------------+----------------------------+------------------------
Sales           | James, Mary, Robert, ...   | {James,Mary,Robert,...}
Engineering     | David, Jennifer, Richard   | {David,Jennifer,...}

Both create multi-value output from multiple rows.

Common Errors & Solutions

Error: “malformed array literal”

Problem:

INSERT INTO employee_skills VALUES (1, 'James', 'Python, SQL, Java');

Using string instead of ARRAY.

Solution:

Use ARRAY[] or STRING_TO_ARRAY:

INSERT INTO employee_skills VALUES (1, 'James', ARRAY['Python', 'SQL', 'Java']);
-- Or
INSERT INTO employee_skills VALUES
(1, 'James', STRING_TO_ARRAY('Python,SQL,Java', ','));

Error: “Array index out of bounds”

Problem:

SELECT skills[10] FROM employee_skills WHERE emp_id = 1;

Accessing beyond array length.

Solution:

Check length first:

SELECT
    CASE
        WHEN ARRAY_LENGTH(skills, 1) >= 10 THEN skills[10]
        ELSE 'Not available'
    END as tenth_skill
FROM employee_skills;

Unexpected NULL from UNNEST

Problem:

SELECT UNNEST(ARRAY[]::TEXT[]);  -- Empty array

Returns no rows (not NULL).

Solution:

Empty arrays unnest to nothing. Use COALESCE:

SELECT COALESCE(UNNEST(ARRAY[]::TEXT[]), 'No values');

Performance Tips

1. Use Arrays for Small Collections Only

If you’re always unnesting for queries, normalization (separate table) is better:

-- Arrays: Good for <20 items per row
CREATE TABLE user_preferences (
    user_id INTEGER,
    favorite_categories TEXT[]  -- Small list
);

-- Normalization: Better for large lists
CREATE TABLE user_categories (
    user_id INTEGER,
    category_id INTEGER
);

2. Index Arrays with GIN

CREATE INDEX idx_skills_gin ON employee_skills USING GIN(skills);

Now queries using ANY() or @> use the index.

3. Avoid Unnesting in WHERE

-- Slow: Unnests for every row check
WHERE UNNEST(skills) = 'Python'

-- Better: Use ANY
WHERE 'Python' = ANY(skills)

4. Aggregate Efficiently

-- Efficient: ARRAY_AGG is fast
SELECT ARRAY_AGG(col) FROM table GROUP BY id;

-- Less efficient: ARRAY constructor with subquery
SELECT ARRAY(SELECT col FROM table WHERE id = outer_id)

FAQ

Q: Should I use arrays or a separate table? A: Arrays for small, fixed collections (< 20 items). Separate table for unbounded lists or when you need to query individual items frequently.

Q: Are arrays slower than separate tables? A: Array queries are usually fast for small collections. Unnesting can be slow for very large arrays.

Q: How do I update an array element? A: Use array[index] = new_value: UPDATE t SET arr[1] = 'new' WHERE id = 1

Q: Can I have multidimensional arrays? A: Yes: INTEGER[][] for 2D arrays. But usually it’s better to normalize.

Q: How do I sort array elements? A: ARRAY_AGG with ORDER BY: ARRAY_AGG(col ORDER BY col ASC)

Q: What’s the difference between <@ and @>? A: <@ checks if left is contained in right. @> checks if left contains right.

Continue mastering data types:

Arrays are powerful for semi-structured data. Choose them wisely and you can store related data flexibly.