PostgreSQL Arrays: Complete Guide to Array Types
PostgreSQL arrays store multiple values in a single column. They’re useful for semi-structured data like skill lists, tag collections, or related IDs. But querying arrays requires specific functions - ANY(), unnest(), array_agg().
This guide shows how to work with arrays effectively.
What You’ll Learn
This guide covers:
- Creating arrays (ARRAY[], ARRAY constructor)
- Accessing array elements by index
- Searching with ANY() and contains
- Unnesting arrays to rows
- Aggregating to arrays
- Performance considerations
- When to use arrays vs. normalization
Creating Arrays
ARRAY Literal Syntax
CREATE TABLE employee_skills (
emp_id INTEGER,
first_name VARCHAR(50),
skills TEXT[]
);
INSERT INTO employee_skills VALUES
(1, 'James', ARRAY['Python', 'SQL', 'JavaScript']),
(2, 'Mary', ARRAY['Java', 'Spring', 'Kubernetes']),
(3, 'David', ARRAY['PostgreSQL', 'Linux', 'Docker', 'AWS']),
(4, 'Jennifer', ARRAY['Python', 'Machine Learning', 'TensorFlow']);
SELECT * FROM employee_skills;
Result:
emp_id | first_name | skills
-------+------------+----------------------------------------------
1 | James | {Python,SQL,JavaScript}
2 | Mary | {Java,Spring,Kubernetes}
3 | David | {PostgreSQL,Linux,Docker,AWS}
4 | Jennifer | {Python,Machine Learning,TensorFlow}
Use ARRAY[...] to create array literals. The data type is TEXT[].
ARRAY() Constructor
Convert query results to arrays.
SELECT
emp_id,
first_name,
ARRAY(SELECT DISTINCT department
FROM employee_history
WHERE emp_id = employees.emp_id) as departments_worked
FROM employees
WHERE dept_id = 2
LIMIT 2;
Constructs an array from subquery results.
String Splitting to Array
Convert comma-separated strings to arrays.
SELECT
'Python,SQL,PostgreSQL' as skills_string,
STRING_TO_ARRAY('Python,SQL,PostgreSQL', ',') as skills_array,
ARRAY_LENGTH(STRING_TO_ARRAY('Python,SQL,PostgreSQL', ','), 1) as skill_count;
Result:
skills_string | skills_array | skill_count
--------------------------+------------------------+-------------
Python,SQL,PostgreSQL | {Python,SQL,PostgreSQL}| 3
Accessing Array Elements
Index-Based Access
Arrays are 1-indexed in PostgreSQL (not 0-indexed like most languages).
SELECT
emp_id,
first_name,
skills,
skills[1] as first_skill,
skills[2] as second_skill,
skills[3] as third_skill
FROM employee_skills
LIMIT 3;
Result:
emp_id | first_name | skills | first_skill | second_skill | third_skill
-------+------------+---------------------+-------------+--------------+-------------
1 | James | {Python,SQL,Java} | Python | SQL | Java
2 | Mary | {Java,Spring,Kube} | Java | Spring | Kube
3 | David | {Postgres,Linux,..} | Postgres | Linux | (null)
When accessing out-of-bounds index, you get NULL.
Array Slicing
Get a range of elements.
SELECT
first_name,
skills,
skills[1:2] as first_two_skills,
skills[2:4] as middle_skills,
skills[3:] as from_third_onward
FROM employee_skills
LIMIT 2;
Result:
first_name | skills | first_two_skills | middle_skills | from_third_onward
----------+----------------+-----------------+--------------+-----------------
James | {Python,SQL,JS}| {Python,SQL} | {SQL,JS} | {JS}
Mary | {Java,Spring,K}| {Java,Spring} | {Spring,K} | {K}
Syntax: array[start:end] (inclusive on both ends).
Searching Arrays
ANY() for Membership
Check if a value exists in the array.
-- Find employees who know Python
SELECT
emp_id,
first_name,
skills
FROM employee_skills
WHERE 'Python' = ANY(skills);
Result:
emp_id | first_name | skills
-------+------------+----------------------------------
1 | James | {Python,SQL,JavaScript}
4 | Jennifer | {Python,Machine Learning,...}
<@ and @> Operators
Check array containment.
-- <@ checks if array is contained in another
SELECT
first_name,
skills,
ARRAY['Python', 'SQL'] <@ skills as knows_python_and_sql
FROM employee_skills;
-- @> checks if array contains another
SELECT
first_name,
skills,
skills @> ARRAY['Python'] as knows_python,
skills @> ARRAY['Java', 'Spring'] as knows_java_stack
FROM employee_skills;
Result:
first_name | knows_python | knows_java_stack
----------+--------------+------------------
James | t | f
Mary | f | t
David | f | f
Jennifer | t | f
&& Operator (Overlaps)
Check if two arrays have any common elements.
SELECT
first_name,
skills,
skills && ARRAY['Python', 'Java'] as knows_python_or_java
FROM employee_skills;
Result:
first_name | skills | knows_python_or_java
----------+-------------------+---------------------
James | {Python,SQL,JS} | t
Mary | {Java,Spring} | t
David | {Postgres,Linux} | f
Jennifer | {Python,ML,Tensor} | t
Array Functions
ARRAY_LENGTH
Get array dimensions.
SELECT
first_name,
skills,
ARRAY_LENGTH(skills, 1) as num_skills
FROM employee_skills
ORDER BY num_skills DESC;
Result:
first_name | skills | num_skills
----------+---------------------+----------
David | {Postgres,Linux,..} | 4
James | {Python,SQL,JS} | 3
Mary | {Java,Spring,K} | 3
Jennifer | {Python,ML,Tensor} | 3
ARRAY_APPEND and ARRAY_PREPEND
Add elements to arrays.
SELECT
first_name,
skills,
ARRAY_APPEND(skills, 'Docker') as with_docker,
ARRAY_PREPEND('Bash', skills) as with_bash_first
FROM employee_skills
LIMIT 2;
Result:
first_name | skills | with_docker | with_bash_first
----------+---------------+----------------------+----------------
James | {Python,SQL} | {Python,SQL,Docker} | {Bash,Python,SQL}
Mary | {Java,Spring} | {Java,Spring,Docker} | {Bash,Java,Spring}
ARRAY_REMOVE
Remove elements from arrays.
SELECT
first_name,
skills,
ARRAY_REMOVE(skills, 'SQL') as without_sql
FROM employee_skills
WHERE 'SQL' = ANY(skills);
ARRAY_AGG: Aggregate to Array
Collect values into an array.
SELECT
d.dept_name,
ARRAY_AGG(e.first_name) as employees,
ARRAY_AGG(DISTINCT e.first_name) as unique_employees,
ARRAY_AGG(e.salary ORDER BY e.salary DESC) as salaries_high_to_low
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
GROUP BY d.dept_id, d.dept_name;
Result:
dept_name | employees | salaries_high_to_low
-----------------+------------------------+-------------------------------
Sales | {James,Mary,Robert,...} | {78000,75000,72000,...}
Engineering | {David,Jennifer,...} | {105000,102000,98000,...}
Unnesting Arrays to Rows
UNNEST()
Expand array into rows.
SELECT
emp_id,
first_name,
UNNEST(skills) as skill
FROM employee_skills;
Result:
emp_id | first_name | skill
-------+------------+------------------
1 | James | Python
1 | James | SQL
1 | James | JavaScript
2 | Mary | Java
2 | Mary | Spring
2 | Mary | Kubernetes
One row per skill. Used to convert array data to normalized form for analysis.
UNNEST with Position
Get both value and position.
SELECT
emp_id,
first_name,
pos,
skill
FROM employee_skills,
UNNEST(skills) WITH ORDINALITY AS t(skill, pos)
LIMIT 6;
Result:
emp_id | first_name | pos | skill
-------+------------+-----+------------------
1 | James | 1 | Python
1 | James | 2 | SQL
1 | James | 3 | JavaScript
2 | Mary | 1 | Java
2 | Mary | 2 | Spring
2 | Mary | 3 | Kubernetes
The WITH ORDINALITY clause adds position numbers.
Join on Unnested Array
Find which departments need Python training.
SELECT
DISTINCT d.dept_name,
UNNEST(es.skills) as required_skill
FROM employee_skills es
JOIN employees e ON es.emp_id = e.emp_id
JOIN departments d ON e.dept_id = d.dept_id
WHERE UNNEST(es.skills) = 'Python'
ORDER BY d.dept_name;
This finds departments where employees know Python.
Real-World Array Patterns
Tag-Based Search
Store tags as arrays, search across them.
CREATE TABLE articles (
article_id INTEGER,
title VARCHAR(255),
tags TEXT[]
);
INSERT INTO articles VALUES
(1, 'PostgreSQL Performance Tips', ARRAY['postgres', 'performance', 'database']),
(2, 'Arrays in PostgreSQL', ARRAY['postgres', 'arrays', 'tips']),
(3, 'JSON Best Practices', ARRAY['json', 'postgres', 'bestpractices']);
-- Articles tagged with 'postgres' AND 'tips'
SELECT
article_id,
title,
tags
FROM articles
WHERE tags @> ARRAY['postgres', 'tips'];
-- Articles tagged with 'postgres' OR 'json'
SELECT
article_id,
title,
tags
FROM articles
WHERE tags && ARRAY['postgres', 'json'];
String Aggregation
Combine multiple rows into comma-separated list.
SELECT
d.dept_name,
STRING_AGG(e.first_name, ', ') as employee_list,
ARRAY_AGG(e.first_name) as employee_array
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
GROUP BY d.dept_id, d.dept_name;
Result:
dept_name | employee_list | employee_array
-----------------+----------------------------+------------------------
Sales | James, Mary, Robert, ... | {James,Mary,Robert,...}
Engineering | David, Jennifer, Richard | {David,Jennifer,...}
Both create multi-value output from multiple rows.
Common Errors & Solutions
Error: “malformed array literal”
Problem:
INSERT INTO employee_skills VALUES (1, 'James', 'Python, SQL, Java');
Using string instead of ARRAY.
Solution:
Use ARRAY[] or STRING_TO_ARRAY:
INSERT INTO employee_skills VALUES (1, 'James', ARRAY['Python', 'SQL', 'Java']);
-- Or
INSERT INTO employee_skills VALUES
(1, 'James', STRING_TO_ARRAY('Python,SQL,Java', ','));
Error: “Array index out of bounds”
Problem:
SELECT skills[10] FROM employee_skills WHERE emp_id = 1;
Accessing beyond array length.
Solution:
Check length first:
SELECT
CASE
WHEN ARRAY_LENGTH(skills, 1) >= 10 THEN skills[10]
ELSE 'Not available'
END as tenth_skill
FROM employee_skills;
Unexpected NULL from UNNEST
Problem:
SELECT UNNEST(ARRAY[]::TEXT[]); -- Empty array
Returns no rows (not NULL).
Solution:
Empty arrays unnest to nothing. Use COALESCE:
SELECT COALESCE(UNNEST(ARRAY[]::TEXT[]), 'No values');
Performance Tips
1. Use Arrays for Small Collections Only
If you’re always unnesting for queries, normalization (separate table) is better:
-- Arrays: Good for <20 items per row
CREATE TABLE user_preferences (
user_id INTEGER,
favorite_categories TEXT[] -- Small list
);
-- Normalization: Better for large lists
CREATE TABLE user_categories (
user_id INTEGER,
category_id INTEGER
);
2. Index Arrays with GIN
CREATE INDEX idx_skills_gin ON employee_skills USING GIN(skills);
Now queries using ANY() or @> use the index.
3. Avoid Unnesting in WHERE
-- Slow: Unnests for every row check
WHERE UNNEST(skills) = 'Python'
-- Better: Use ANY
WHERE 'Python' = ANY(skills)
4. Aggregate Efficiently
-- Efficient: ARRAY_AGG is fast
SELECT ARRAY_AGG(col) FROM table GROUP BY id;
-- Less efficient: ARRAY constructor with subquery
SELECT ARRAY(SELECT col FROM table WHERE id = outer_id)
FAQ
Q: Should I use arrays or a separate table? A: Arrays for small, fixed collections (< 20 items). Separate table for unbounded lists or when you need to query individual items frequently.
Q: Are arrays slower than separate tables? A: Array queries are usually fast for small collections. Unnesting can be slow for very large arrays.
Q: How do I update an array element?
A: Use array[index] = new_value: UPDATE t SET arr[1] = 'new' WHERE id = 1
Q: Can I have multidimensional arrays?
A: Yes: INTEGER[][] for 2D arrays. But usually it’s better to normalize.
Q: How do I sort array elements?
A: ARRAY_AGG with ORDER BY: ARRAY_AGG(col ORDER BY col ASC)
Q: What’s the difference between <@ and @>?
A: <@ checks if left is contained in right. @> checks if left contains right.
Related Topics
Continue mastering data types:
- PostgreSQL Data Types Reference - Complete type guide
- JSON Functions & Operators - Working with JSON data
- Date Functions & Formatting - Temporal data
Arrays are powerful for semi-structured data. Choose them wisely and you can store related data flexibly.