PostgreSQL Deep Dive Part 9 — Complex Data Types — Network, JSON, Arrays, Text Search and more | Abhishek Kumar Singh

Access complete, unfiltered study notes maintained on GitHub for easily maintaing Second Brain Dir

PostgreSQL's rich type system includes several sophisticated data types that enable complex data modeling and efficient operations. This comprehensive guide explores these advanced data types, their implementations, and best practices.

Network Address Types

PostgreSQL provides specialized types for storing and manipulating network addresses, offering both IPv4 and IPv6 support through the inet and macaddr types.

INET and Network Operations

sql

CREATE TABLE inet_example (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    ip_address inet
);
 
INSERT INTO inet_example (ip_address)
VALUES
    ('192.168.1.10/24'), -- Host address with subnet mask
    ('10.0.0.1'), -- Host address without subnet
    ('::1/128'), -- IPv6 loopback address
    ('2001:db8::/32'), -- IPv6 network
    ('2001:db8:85a3:8d3:1319:8a2e:370:7348'); -- IPv6 host address

sql

CREATE TABLE inet_example (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    ip_address inet
);
 
INSERT INTO inet_example (ip_address)
VALUES
    ('192.168.1.10/24'), -- Host address with subnet mask
    ('10.0.0.1'), -- Host address without subnet
    ('::1/128'), -- IPv6 loopback address
    ('2001:db8::/32'), -- IPv6 network
    ('2001:db8:85a3:8d3:1319:8a2e:370:7348'); -- IPv6 host address

The INET type offers significant storage efficiency compared to text storage:

sql

SELECT
    pg_column_size('2001:db8:85a3:8d3:1319:8a2e:370:7348'::text) as text_size,
    pg_column_size('2001:db8:85a3:8d3:1319:8a2e:370:7348'::inet) as inet_size;
/*
| text_size | inet_size  |
|-----------|------------|
|    40     |    22      |
*/

sql

SELECT
    pg_column_size('2001:db8:85a3:8d3:1319:8a2e:370:7348'::text) as text_size,
    pg_column_size('2001:db8:85a3:8d3:1319:8a2e:370:7348'::inet) as inet_size;
/*
| text_size | inet_size  |
|-----------|------------|
|    40     |    22      |
*/

Network Address Functions

PostgreSQL provides several functions for working with network addresses:

sql

-- converted the query to json format for display constraint
SELECT json_agg(t)
FROM (
	-- main query
 SELECT
    ip_address,
    host(ip_address) as host_only, -- Extracts the host part
    masklen(ip_address) as mask_length, -- Extracts the prefix length
    network(ip_address) as network_only, -- Extracts the network portion
    abbrev(ip_address) as abbreviated_ip -- Abbreviates IPv6 address
FROM inet_example) t;

sql

-- converted the query to json format for display constraint
SELECT json_agg(t)
FROM (
	-- main query
 SELECT
    ip_address,
    host(ip_address) as host_only, -- Extracts the host part
    masklen(ip_address) as mask_length, -- Extracts the prefix length
    network(ip_address) as network_only, -- Extracts the network portion
    abbrev(ip_address) as abbreviated_ip -- Abbreviates IPv6 address
FROM inet_example) t;

Results:

json

[
  {
    "ip_address": "192.168.1.10/24",
    "host_only": "192.168.1.10",
    "mask_length": 24,
    "network_only": "192.168.1.0/24",
    "abbreviated_ip": "192.168.1.10/24"
  },
  {
    "ip_address": "10.0.0.1",
    "host_only": "10.0.0.1",
    "mask_length": 32,
    "network_only": "10.0.0.1/32",
    "abbreviated_ip": "10.0.0.1"
  },
  {
    "ip_address": "::1",
    "host_only": "::1",
    "mask_length": 128,
    "network_only": "::1/128",
    "abbreviated_ip": "::1"
  },
  {
    "ip_address": "2001:db8::/32",
    "host_only": "2001:db8::",
    "mask_length": 32,
    "network_only": "2001:db8::/32",
    "abbreviated_ip": "2001:db8::/32"
  },
  {
    "ip_address": "2001:db8:85a3:8d3:1319:8a2e:370:7348",
    "host_only": "2001:db8:85a3:8d3:1319:8a2e:370:7348",
    "mask_length": 128,
    "network_only": "2001:db8:85a3:8d3:1319:8a2e:370:7348/128",
    "abbreviated_ip": "2001:db8:85a3:8d3:1319:8a2e:370:7348"
  }
]

json

[
  {
    "ip_address": "192.168.1.10/24",
    "host_only": "192.168.1.10",
    "mask_length": 24,
    "network_only": "192.168.1.0/24",
    "abbreviated_ip": "192.168.1.10/24"
  },
  {
    "ip_address": "10.0.0.1",
    "host_only": "10.0.0.1",
    "mask_length": 32,
    "network_only": "10.0.0.1/32",
    "abbreviated_ip": "10.0.0.1"
  },
  {
    "ip_address": "::1",
    "host_only": "::1",
    "mask_length": 128,
    "network_only": "::1/128",
    "abbreviated_ip": "::1"
  },
  {
    "ip_address": "2001:db8::/32",
    "host_only": "2001:db8::",
    "mask_length": 32,
    "network_only": "2001:db8::/32",
    "abbreviated_ip": "2001:db8::/32"
  },
  {
    "ip_address": "2001:db8:85a3:8d3:1319:8a2e:370:7348",
    "host_only": "2001:db8:85a3:8d3:1319:8a2e:370:7348",
    "mask_length": 128,
    "network_only": "2001:db8:85a3:8d3:1319:8a2e:370:7348/128",
    "abbreviated_ip": "2001:db8:85a3:8d3:1319:8a2e:370:7348"
  }
]

MAC Addresses

PostgreSQL supports both 6-byte and 8-byte MAC addresses:

sql

SELECT '08:00:2b:01:02:03'::macaddr; -- 6-byte version
SELECT '08:00:2b:01:02:03'::macaddr8; -- 8-byte version
/*
| macaddr           | macaddr8                |
|-------------------|-------------------------|
| 08:00:2b:01:02:03 | 08:00:2b:ff:fe:01:02:03 |
*/

sql

SELECT '08:00:2b:01:02:03'::macaddr; -- 6-byte version
SELECT '08:00:2b:01:02:03'::macaddr8; -- 8-byte version
/*
| macaddr           | macaddr8                |
|-------------------|-------------------------|
| 08:00:2b:01:02:03 | 08:00:2b:ff:fe:01:02:03 |
*/

JSON and JSONB Types

PostgreSQL offers two JSON types: json and jsonb. While both store JSON data, jsonb offers significant advantages for most use cases.

JSON vs JSONB Comparison

sql

-- Size comparison showing JSONB's efficiency with larger content
SELECT
    pg_column_size('{"data": "small"}'::json) as json_small,
    pg_column_size('{"data": "small"}'::jsonb) as jsonb_small,
    pg_column_size(jsonb_build_object('data', repeat('x', 1000))::json)
      as json_large,
    pg_column_size(jsonb_build_object('data', repeat('x', 1000))::jsonb)
      as jsonb_large;
/*
| json_small | jsonb_small | json_large | jsonb_large |
|------------|-------------|------------|-------------|
|         21 |          25 |       1016 |        1020 |
 
While JSONB has more overhead for small values, it becomes more efficient
with larger content due to internal compression and binary storage
*/
 
-- Formatting comparison
SELECT
    '{"a" : "Hello world"}'::json,
    '{"a" : "Hello world"}'::jsonb;
/*
| json                  | jsonb                |
|-----------------------|----------------------|
| {"a" : "Hello world"} | {"a": "Hello world"} |
*/
 
-- Space handling demonstration
SELECT
    '{"a" :      "Hello world"}'::json,
    '{"a" :      "Hello world"}'::jsonb;
/*
| json                       | jsonb                |
|----------------------------|----------------------|
| {"a" :      "Hello world"} | {"a": "Hello world"} |
*/

sql

-- Size comparison showing JSONB's efficiency with larger content
SELECT
    pg_column_size('{"data": "small"}'::json) as json_small,
    pg_column_size('{"data": "small"}'::jsonb) as jsonb_small,
    pg_column_size(jsonb_build_object('data', repeat('x', 1000))::json)
      as json_large,
    pg_column_size(jsonb_build_object('data', repeat('x', 1000))::jsonb)
      as jsonb_large;
/*
| json_small | jsonb_small | json_large | jsonb_large |
|------------|-------------|------------|-------------|
|         21 |          25 |       1016 |        1020 |
 
While JSONB has more overhead for small values, it becomes more efficient
with larger content due to internal compression and binary storage
*/
 
-- Formatting comparison
SELECT
    '{"a" : "Hello world"}'::json,
    '{"a" : "Hello world"}'::jsonb;
/*
| json                  | jsonb                |
|-----------------------|----------------------|
| {"a" : "Hello world"} | {"a": "Hello world"} |
*/
 
-- Space handling demonstration
SELECT
    '{"a" :      "Hello world"}'::json,
    '{"a" :      "Hello world"}'::jsonb;
/*
| json                       | jsonb                |
|----------------------------|----------------------|
| {"a" :      "Hello world"} | {"a": "Hello world"} |
*/

Working with JSON Data

sql

SELECT '
{
    "string": "Hello World",
    "number": 42,
    "boolean": true,
    "null": null,
    "array": [1,2,3],
    "object": {
        "key": "value"
    }
}'::jsonb->'string';
-- Returns: "Hello World"
-- for unquoted text, use '->>'
 
-- Access nested values
SELECT '{...}'::json->'object'->'key';

sql

SELECT '
{
    "string": "Hello World",
    "number": 42,
    "boolean": true,
    "null": null,
    "array": [1,2,3],
    "object": {
        "key": "value"
    }
}'::jsonb->'string';
-- Returns: "Hello World"
-- for unquoted text, use '->>'
 
-- Access nested values
SELECT '{...}'::json->'object'->'key';

Key differences between JSON and JSONB:

JSONB is binary stored and parsed at insertion
JSONB enforces unique keys
JSONB removes whitespace
JSONB typically performs better for queries
JSONB supports indexing

Arrays

PostgreSQL's array support enables storing multiple values in a single column.

Array Creation and Manipulation

sql

CREATE TABLE array_example (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    int_array integer[],-- Using [] notation
    text_array text ARRAY,-- Using ARRAY keyword
    bool_array boolean[],
    nested_array integer[][]-- Multi-dimensional array
);
 
-- Inserting arrays using ARRAY constructor
INSERT INTO array_example
    (int_array, text_array, bool_array)
VALUES
    (
        ARRAY[1,2,3,4],
        ARRAY['marigold', 'daisy', 'poppy', 'sunflower'],
        ARRAY[true, false, true, false]
    );
 
-- Using curly brace notation for nested arrays
INSERT INTO array_example
    (nested_array)
VALUES
    ('{{1,2,3}, {4,5,6}, {7,8,9}}');
-- Note: curly brace notation works with normal array as well

sql

CREATE TABLE array_example (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    int_array integer[],-- Using [] notation
    text_array text ARRAY,-- Using ARRAY keyword
    bool_array boolean[],
    nested_array integer[][]-- Multi-dimensional array
);
 
-- Inserting arrays using ARRAY constructor
INSERT INTO array_example
    (int_array, text_array, bool_array)
VALUES
    (
        ARRAY[1,2,3,4],
        ARRAY['marigold', 'daisy', 'poppy', 'sunflower'],
        ARRAY[true, false, true, false]
    );
 
-- Using curly brace notation for nested arrays
INSERT INTO array_example
    (nested_array)
VALUES
    ('{{1,2,3}, {4,5,6}, {7,8,9}}');
-- Note: curly brace notation works with normal array as well

Array Operations

sql

-- Array indexing (1-based)
SELECT
    id, text_array[1]-- Returns: 'marigold'
FROM array_example;
 
-- Array slicing
SELECT
    id, text_array[1:3]-- Returns: {'marigold', 'daisy', 'poppy'}
FROM array_example;
 
-- Array contains operator (@>)
SELECT
    id, text_array
FROM array_example
WHERE
    text_array @> ARRAY['poppy']; -- same will work with {} - '{poppy}'
/*
| id | text_array                       |
|----|----------------------------------|
|  1 | {marigold,daisy,poppy,sunflower} |
*/
 
-- Array unnesting
SELECT id, unnest(text_array) -- divides the array into multiple rows
FROM array_example;
/*
| id | unnest    |
|----|-----------|
| 1  | marigold  |
| 1  | daisy     |
| 1  | poppy     |
| 1  | sunflower |
*/
 
-- Common Table Expression with unnested array
WITH flowers AS (
    SELECT id, unnest(text_array) as flower
    FROM array_example
)
SELECT * FROM flowers WHERE flower = 'poppy';

sql

-- Array indexing (1-based)
SELECT
    id, text_array[1]-- Returns: 'marigold'
FROM array_example;
 
-- Array slicing
SELECT
    id, text_array[1:3]-- Returns: {'marigold', 'daisy', 'poppy'}
FROM array_example;
 
-- Array contains operator (@>)
SELECT
    id, text_array
FROM array_example
WHERE
    text_array @> ARRAY['poppy']; -- same will work with {} - '{poppy}'
/*
| id | text_array                       |
|----|----------------------------------|
|  1 | {marigold,daisy,poppy,sunflower} |
*/
 
-- Array unnesting
SELECT id, unnest(text_array) -- divides the array into multiple rows
FROM array_example;
/*
| id | unnest    |
|----|-----------|
| 1  | marigold  |
| 1  | daisy     |
| 1  | poppy     |
| 1  | sunflower |
*/
 
-- Common Table Expression with unnested array
WITH flowers AS (
    SELECT id, unnest(text_array) as flower
    FROM array_example
)
SELECT * FROM flowers WHERE flower = 'poppy';

Text Search Types

PostgreSQL's full-text search capabilities are built around the tsvector and tsquery types.

Basic Text Search Operations

A tsvector is a specialized PostgreSQL type that represents a document in a form optimized for text search. It contains a sorted list of distinct lexemes - words that have been normalized to merge different variants of the same word (e.g., running → run). During creation, the text undergoes several transformations:

Breaking text into tokens
Converting tokens to lexemes
Removing stop words (common words like 'the', 'a', 'is')
Sorting and eliminating duplicates

sql

-- Creating a tsvector - notice how common words are removed
-- and words are normalized to their root form
SELECT to_tsvector('the quick brown fox jumps over the lazy dog');
/*
| to_tsvector                                           |
|-------------------------------------------------------|
| 'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2 |
*/
 
-- The numbers after each lexeme indicate their position in the original text
-- This helps with phrase searches and proximity queries
 
-- Type checking shows the specialized tsvector type
SELECT pg_typeof(to_tsvector('the quick brown fox jumps over the lazy dog'));
-- Returns: tsvector
 
-- Text search matching using @@ operator
-- tsquery converts search terms into a searchable format
SELECT
	to_tsvector('the quick brown fox jumps over the lazy dog')
	@@ to_tsquery('lazy');
-- Returns: TRUE
 
-- Language-specific text search demonstrates how different languages
-- handle word normalization and stop words
SELECT to_tsvector('english', 'yes');-- Returns: 'yes':1
SELECT to_tsvector('french', 'oui');-- Returns: 'oui':1

sql

-- Creating a tsvector - notice how common words are removed
-- and words are normalized to their root form
SELECT to_tsvector('the quick brown fox jumps over the lazy dog');
/*
| to_tsvector                                           |
|-------------------------------------------------------|
| 'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2 |
*/
 
-- The numbers after each lexeme indicate their position in the original text
-- This helps with phrase searches and proximity queries
 
-- Type checking shows the specialized tsvector type
SELECT pg_typeof(to_tsvector('the quick brown fox jumps over the lazy dog'));
-- Returns: tsvector
 
-- Text search matching using @@ operator
-- tsquery converts search terms into a searchable format
SELECT
	to_tsvector('the quick brown fox jumps over the lazy dog')
	@@ to_tsquery('lazy');
-- Returns: TRUE
 
-- Language-specific text search demonstrates how different languages
-- handle word normalization and stop words
SELECT to_tsvector('english', 'yes');-- Returns: 'yes':1
SELECT to_tsvector('french', 'oui');-- Returns: 'oui':1

Implementing Text Search in Tables

sql

CREATE TABLE ts_example (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    content text,
    search_vector_en TSVECTOR GENERATED ALWAYS AS
	    (to_tsvector('english', content)) STORED
);
 
INSERT INTO ts_example (content)
VALUES
    ('the quick brown fox jumps over the lazy dog'),
    ('the quick brown fox jumps over the cat');
 
-- Search using generated column
SELECT json_agg(t)
FROM (SELECT * FROM ts_example WHERE search_vector_en @@ to_tsquery('lazy')) t;
/*
[{
  "id": 1,
  "content": "the quick brown fox jumps over the lazy dog",
  "search_vector_en": "'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2"
}]
*/

sql

CREATE TABLE ts_example (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    content text,
    search_vector_en TSVECTOR GENERATED ALWAYS AS
	    (to_tsvector('english', content)) STORED
);
 
INSERT INTO ts_example (content)
VALUES
    ('the quick brown fox jumps over the lazy dog'),
    ('the quick brown fox jumps over the cat');
 
-- Search using generated column
SELECT json_agg(t)
FROM (SELECT * FROM ts_example WHERE search_vector_en @@ to_tsquery('lazy')) t;
/*
[{
  "id": 1,
  "content": "the quick brown fox jumps over the lazy dog",
  "search_vector_en": "'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2"
}]
*/

Bit Strings

Bit strings are useful for storing binary data or flags.

sql

-- Basic bit string operations
SELECT B'0001', '0001'::BIT(4);
/*
| ?column? | bit  |
|----------|------|
| 0001     | 0001 |
*/
 
-- Bit strings for efficient boolean flag storage
-- Example: Storing 32 boolean flags in a single column
CREATE TABLE user_permissions (
    user_id int PRIMARY KEY,
    flags BIT(32) DEFAULT B'00000000000000000000000000000000'
);
 
-- Setting specific permissions (e.g., can_edit = bit 1, can_delete = bit 2)
UPDATE user_permissions
SET flags = SET_BIT(flags, 0, 1)  -- Set can_edit to true
WHERE user_id = 1;
/*
| user_id | flags                            |
|---------|----------------------------------|
|       1 | 10000000000000000000000000000000 |
|       2 | 00000000000000000000000000000000 |
*/
 
UPDATE user_permissions
SET flags = SET_BIT(flags, 5, 1)  -- Set 5th bit to 1
WHERE user_id = 1;
/*
| user_id | flags                            |
|---------|----------------------------------|
|       1 | 10000100000000000000000000000000 |
|       2 | 00000000000000000000000000000000 |
*/
 
-- Performing bitwise operations
SELECT
    B'1100' & B'1010' as AND_result,  -- 1000
    B'1100' | B'1010' as OR_result,   -- 1110
    B'1100' # B'1010' as XOR_result;  -- 0110
 
-- This approach can save significant storage space:
select pg_column_size('00000000000000000000000000000000'::BIT(32)) as bit,
	   pg_column_size('00000000000000000000000000000000'::text) as text,
	   pg_column_size('1'::BOOLEAN) as bool;
/*
| bit | text | bool |
|-----|------|------|
|  12 |   36 |    1 |
 
for bool, it will be 1 * 32 = 32 bytes
*/

sql

-- Basic bit string operations
SELECT B'0001', '0001'::BIT(4);
/*
| ?column? | bit  |
|----------|------|
| 0001     | 0001 |
*/
 
-- Bit strings for efficient boolean flag storage
-- Example: Storing 32 boolean flags in a single column
CREATE TABLE user_permissions (
    user_id int PRIMARY KEY,
    flags BIT(32) DEFAULT B'00000000000000000000000000000000'
);
 
-- Setting specific permissions (e.g., can_edit = bit 1, can_delete = bit 2)
UPDATE user_permissions
SET flags = SET_BIT(flags, 0, 1)  -- Set can_edit to true
WHERE user_id = 1;
/*
| user_id | flags                            |
|---------|----------------------------------|
|       1 | 10000000000000000000000000000000 |
|       2 | 00000000000000000000000000000000 |
*/
 
UPDATE user_permissions
SET flags = SET_BIT(flags, 5, 1)  -- Set 5th bit to 1
WHERE user_id = 1;
/*
| user_id | flags                            |
|---------|----------------------------------|
|       1 | 10000100000000000000000000000000 |
|       2 | 00000000000000000000000000000000 |
*/
 
-- Performing bitwise operations
SELECT
    B'1100' & B'1010' as AND_result,  -- 1000
    B'1100' | B'1010' as OR_result,   -- 1110
    B'1100' # B'1010' as XOR_result;  -- 0110
 
-- This approach can save significant storage space:
select pg_column_size('00000000000000000000000000000000'::BIT(32)) as bit,
	   pg_column_size('00000000000000000000000000000000'::text) as text,
	   pg_column_size('1'::BOOLEAN) as bool;
/*
| bit | text | bool |
|-----|------|------|
|  12 |   36 |    1 |
 
for bool, it will be 1 * 32 = 32 bytes
*/

Ranges

PostgreSQL's range types support storing and manipulating ranges of values.

Range Types and Operations

PostgreSQL ranges are meta-types that allow you to define bounds for various data types like integers, numerics, dates, and timestamps. Each range has a lower and upper bound that can be either inclusive [] or exclusive ().

Understanding range behavior is crucial:

For discrete types like integers (int4range), PostgreSQL automatically adjusts exclusive upper bounds
- Example: [1,5] is stored as [1,6) because there are no values between 5 and 6 in integers
- This differs from continuous types like numeric (numrange) where [1,5] remains as-is since values like 4.999 are valid

sql

-- Integer ranges
SELECT '[1,5]'::int4range;-- Returns: [1,6)
SELECT '[1,5]'::numrange;-- Returns: [1,5]
 
-- Range constructors
SELECT numrange(1, 5);-- Returns: [1,5)
SELECT int4range(1, 5);-- Returns: [1,5)
SELECT numrange(1, 5, '[]');-- Returns: [1,5]
SELECT int4range(1, 5, '[]');-- Returns: [1,6)
 
-- Creating tables with range columns
CREATE TABLE range_example (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    int_range int4range,
    num_range numrange,
    date_range daterange,
    ts_range tsrange,
    ts_range tstzrange
);
 
-- Range operations
SELECT * FROM range_example WHERE int_range @> 5;-- Contains
SELECT * FROM range_example WHERE int_range && '[11, 20)';-- Overlaps
SELECT int4range(10, 20, '[]') * int4range(15, 25);-- Intersection
 
-- Multirange support
SELECT '{[3,7), [8,9)}'::int4multirange;
 
-- Testing multirange containment
SELECT '{[3,7), [8,9)}'::int4multirange @> 4;-- true
SELECT '{[3,7), [8,9)}'::int4multirange @> 7;-- false
SELECT '{[3,7), [8,9)}'::int4multirange @> 8;-- true
SELECT '{[3,7), [8,9)}'::int4multirange @> 9;-- false
 
-- Helpful range boundary functions
-- upper(): Get the upper bound of a range
SELECT upper(int4range(1, 5));-- Returns: 5
 
-- upper_inc(): Check if upper bound is inclusive
SELECT upper_inc(int4range(1, 5, '[]'));-- Returns: true
SELECT upper_inc(int4range(1, 5));-- Returns: false
 
-- lower(): Get the lower bound of a range
SELECT lower(int4range(1, 5));-- Returns: 1
 
-- lower_inc(): Check if lower bound is inclusive
SELECT lower_inc(int4range(1, 5));-- Returns: true
SELECT lower_inc('(1,5]'::int4range);-- Returns: false
 
-- Example combining multiple functions
SELECT
    r.range,
    lower(r.range) as lower_bound,
    lower_inc(r.range) as is_lower_inclusive,
    upper(r.range) as upper_bound,
    upper_inc(r.range) as is_upper_inclusive
FROM (VALUES
    ('[1,5]'::int4range),
    ('(2,6)'::int4range),
    ('[3,7)'::int4range)
) as r(range);
/*
| range  | lower_bound | is_lower_inclusive | upper_bound | is_upper_inclusive |
|--------|-------------|--------------------|-------------|--------------------|
| [1,6)  | 1           | true               | 6           | false              |
| (2,6)  | 3           | false              | 6           | false              |
| [3,7)  | 3           | true               | 7           | false              |
*/

sql

-- Integer ranges
SELECT '[1,5]'::int4range;-- Returns: [1,6)
SELECT '[1,5]'::numrange;-- Returns: [1,5]
 
-- Range constructors
SELECT numrange(1, 5);-- Returns: [1,5)
SELECT int4range(1, 5);-- Returns: [1,5)
SELECT numrange(1, 5, '[]');-- Returns: [1,5]
SELECT int4range(1, 5, '[]');-- Returns: [1,6)
 
-- Creating tables with range columns
CREATE TABLE range_example (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    int_range int4range,
    num_range numrange,
    date_range daterange,
    ts_range tsrange,
    ts_range tstzrange
);
 
-- Range operations
SELECT * FROM range_example WHERE int_range @> 5;-- Contains
SELECT * FROM range_example WHERE int_range && '[11, 20)';-- Overlaps
SELECT int4range(10, 20, '[]') * int4range(15, 25);-- Intersection
 
-- Multirange support
SELECT '{[3,7), [8,9)}'::int4multirange;
 
-- Testing multirange containment
SELECT '{[3,7), [8,9)}'::int4multirange @> 4;-- true
SELECT '{[3,7), [8,9)}'::int4multirange @> 7;-- false
SELECT '{[3,7), [8,9)}'::int4multirange @> 8;-- true
SELECT '{[3,7), [8,9)}'::int4multirange @> 9;-- false
 
-- Helpful range boundary functions
-- upper(): Get the upper bound of a range
SELECT upper(int4range(1, 5));-- Returns: 5
 
-- upper_inc(): Check if upper bound is inclusive
SELECT upper_inc(int4range(1, 5, '[]'));-- Returns: true
SELECT upper_inc(int4range(1, 5));-- Returns: false
 
-- lower(): Get the lower bound of a range
SELECT lower(int4range(1, 5));-- Returns: 1
 
-- lower_inc(): Check if lower bound is inclusive
SELECT lower_inc(int4range(1, 5));-- Returns: true
SELECT lower_inc('(1,5]'::int4range);-- Returns: false
 
-- Example combining multiple functions
SELECT
    r.range,
    lower(r.range) as lower_bound,
    lower_inc(r.range) as is_lower_inclusive,
    upper(r.range) as upper_bound,
    upper_inc(r.range) as is_upper_inclusive
FROM (VALUES
    ('[1,5]'::int4range),
    ('(2,6)'::int4range),
    ('[3,7)'::int4range)
) as r(range);
/*
| range  | lower_bound | is_lower_inclusive | upper_bound | is_upper_inclusive |
|--------|-------------|--------------------|-------------|--------------------|
| [1,6)  | 1           | true               | 6           | false              |
| (2,6)  | 3           | false              | 6           | false              |
| [3,7)  | 3           | true               | 7           | false              |
*/

Composite Types

Composite types allow grouping multiple fields into a single column.

sql

-- Creating a composite type
CREATE TYPE address AS (
    number text,
    street text,
    city text,
    state text,
    postal text
);
 
-- Using composite types
SELECT ROW('123', 'Main st', 'Sometown', 'ST', '12345')::address;
/*
| row                               |
|-----------------------------------|
| (123,"Main st",Sometown,ST,12345) |
*/
 
-- Creating tables with composite types
CREATE TABLE addresses (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    addr address
);
 
INSERT INTO addresses
    (addr)
VALUES
    (('123', 'Main st', 'Sometown', 'ST', '12345'));
 
-- Accessing composite type fields
SELECT id, (addr).number FROM addresses;
 
select id, addr.number from addresses; -- throws ERROR

sql

-- Creating a composite type
CREATE TYPE address AS (
    number text,
    street text,
    city text,
    state text,
    postal text
);
 
-- Using composite types
SELECT ROW('123', 'Main st', 'Sometown', 'ST', '12345')::address;
/*
| row                               |
|-----------------------------------|
| (123,"Main st",Sometown,ST,12345) |
*/
 
-- Creating tables with composite types
CREATE TABLE addresses (
    id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    addr address
);
 
INSERT INTO addresses
    (addr)
VALUES
    (('123', 'Main st', 'Sometown', 'ST', '12345'));
 
-- Accessing composite type fields
SELECT id, (addr).number FROM addresses;
 
select id, addr.number from addresses; -- throws ERROR

Best Practices and Performance Considerations

JSON/JSONB Usage:
- Prefer JSONB over JSON for better performance and functionality
- Use JSONB when you need to query the JSON content
- Consider normalized tables for frequently queried structured data
Array Implementation:
- Use arrays for small, fixed-size collections
- Consider junction tables for large or variable-size collections
- Remember that array operations can be slower than normalized tables
Text Search Optimization:
- Create indexes on tsvector columns for better search performance
- Use generated columns for automatic tsvector maintenance
- Consider language-specific text search configurations
Network Address Storage:
- Use inet type instead of text for IP addresses
- Leverage network-specific operators for subnet calculations
Range Type Selection:
- Choose appropriate range types based on your data
- Consider exclusion constraints for overlapping ranges
- Use multi-range types for complex range relationships
Composite Type Design:
- Use composite types to group logically related fields
- Consider the impact on querying and indexing
- Remember that constraints can't be added to nested types

Conclusion

PostgreSQL's complex data types provide powerful tools for modeling sophisticated data structures. Understanding these types and their appropriate use cases is crucial for building efficient and maintainable database systems. Remember to:

Choose the right data type for your specific use case
Consider performance implications of each type
Use appropriate indexes and constraints
Monitor storage and query performance
Balance between normalization and denormalization

These advanced data types make PostgreSQL a versatile choice for complex applications, but they should be used judiciously based on your specific requirements and performance needs.

🦌

PostgreSQL Deep Dive Part 9 — Complex Data Types — Network, JSON, Arrays, Text Search and more