PostgreSQL Deep Dive Part 8 - Identity Management — Serial Types, Sequences, and Identity Columns
Master PostgreSQL's identity management with serial types, sequences, and identity columns. Learn the evolution of auto-incrementing columns and best practices for unique identifiers.
Published At
12/31/2024
Reading Time
~ 7 min read
Access complete, unfiltered study notes maintained on GitHub for easily maintaing Second Brain Dir
Understanding identity management in PostgreSQL is crucial for database design. This deep dive explores three related but distinct approaches: serial types, sequences, and identity columns. We'll examine how these features evolved and their specific use cases.
Serial Types: The Legacy Approach
While serial types are still widely used, it's important to note that since PostgreSQL 10, they're not the recommended way to create primary keys. Let's understand how they work under the hood.
Understanding Serial Implementation
When you create a serial column, PostgreSQL automatically creates several underlying objects:
-- This simple serial declaration
CREATE TABLE serial_example (
id serial
);
-- Actually expands to these statements under the hood:
CREATE SEQUENCE serial_example_id_seq AS integer;
CREATE TABLE serial_example (
id integer NOT NULL DEFAULT nextval('serial_example_id_seq')
);
ALTER SEQUENCE serial_example_id_seq OWNED BY serial_example.id;
-- This simple serial declaration
CREATE TABLE serial_example (
id serial
);
-- Actually expands to these statements under the hood:
CREATE SEQUENCE serial_example_id_seq AS integer;
CREATE TABLE serial_example (
id integer NOT NULL DEFAULT nextval('serial_example_id_seq')
);
ALTER SEQUENCE serial_example_id_seq OWNED BY serial_example.id;
The OWNED BY
clause is particularly important - it ensures that if you drop the table or column, the associated sequence is also destroyed.
Best Practices with Serial
If you must use serial types, always prefer BIGSERIAL
:
CREATE TABLE serial_example (
id bigserial PRIMARY KEY
);
CREATE TABLE serial_example (
id bigserial PRIMARY KEY
);
This is because BIGSERIAL
uses bigint
, providing a much larger range of values and better future-proofing your application.
Practical Example: Order Number Generation
Here's a practical example using serial for order numbers:
CREATE TABLE orders (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
order_number serial,
customer_name varchar(100),
order_date date,
total_amount numeric(10, 2)
);
-- Insert orders without specifying id or order_number
INSERT INTO orders
(customer_name, order_date, total_amount)
VALUES
('John Doe', '2024-09-24', 150.00),
('Jane Smith', '2024-09-24', 200.50),
('Bob Johnson', '2024-09-25', 75.25);
-- Using RETURNING to get the generated values immediately
INSERT INTO orders
(customer_name, order_date, total_amount)
VALUES
('John Doe', '2024-09-24', 150.00)
RETURNING id, order_number;
/*
| id | order_number |
|----|--------------|
| 4 | 4 |
*/
CREATE TABLE orders (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
order_number serial,
customer_name varchar(100),
order_date date,
total_amount numeric(10, 2)
);
-- Insert orders without specifying id or order_number
INSERT INTO orders
(customer_name, order_date, total_amount)
VALUES
('John Doe', '2024-09-24', 150.00),
('Jane Smith', '2024-09-24', 200.50),
('Bob Johnson', '2024-09-25', 75.25);
-- Using RETURNING to get the generated values immediately
INSERT INTO orders
(customer_name, order_date, total_amount)
VALUES
('John Doe', '2024-09-24', 150.00)
RETURNING id, order_number;
/*
| id | order_number |
|----|--------------|
| 4 | 4 |
*/
Sequences: The Building Blocks
Sequences are the foundation of auto-incrementing columns in PostgreSQL. Understanding them is crucial for advanced identity management.
Creating and Managing Sequences
-- Create a sequence with specific parameters
CREATE SEQUENCE seq
AS bigint
INCREMENT 1
START 1
MINVALUE 1
MAXVALUE 9223372036854775807;
-- Get next value (can be called from multiple sessions)
-- nextval() is thread-safe and advances the sequence
-- Returns a new value and increments the sequence
SELECT nextval('seq');
-- Get current value (session-specific)
-- currval() returns the value most recently obtained by nextval()
-- for this session, without advancing the sequence
-- Throws error if nextval() hasn't been called in this session
SELECT currval('seq');
-- Set sequence to specific value (use with caution)
-- setval() sets the current value and optionally the 'is_called' flag
-- setval('seq', value) -> sets value and marks as called (next nextval returns value+1)
-- setval('seq', value, false) -> sets value as not called (next nextval returns value)
SELECT setval('seq', 1);
-- Example showing sequence behavior
SELECT nextval('seq'); -- Returns 1
SELECT currval('seq'); -- Returns 1
SELECT setval('seq', 5); -- Returns 5
SELECT nextval('seq'); -- Returns 6
SELECT currval('seq'); -- Returns 6
-- Create a sequence with specific parameters
CREATE SEQUENCE seq
AS bigint
INCREMENT 1
START 1
MINVALUE 1
MAXVALUE 9223372036854775807;
-- Get next value (can be called from multiple sessions)
-- nextval() is thread-safe and advances the sequence
-- Returns a new value and increments the sequence
SELECT nextval('seq');
-- Get current value (session-specific)
-- currval() returns the value most recently obtained by nextval()
-- for this session, without advancing the sequence
-- Throws error if nextval() hasn't been called in this session
SELECT currval('seq');
-- Set sequence to specific value (use with caution)
-- setval() sets the current value and optionally the 'is_called' flag
-- setval('seq', value) -> sets value and marks as called (next nextval returns value+1)
-- setval('seq', value, false) -> sets value as not called (next nextval returns value)
SELECT setval('seq', 1);
-- Example showing sequence behavior
SELECT nextval('seq'); -- Returns 1
SELECT currval('seq'); -- Returns 1
SELECT setval('seq', 5); -- Returns 5
SELECT nextval('seq'); -- Returns 6
SELECT currval('seq'); -- Returns 6
Important characteristics of sequences:
- They are thread-safe and support concurrent access
currval()
is session-specificnextval()
is globally incremented- Sequence values are cached in memory for performance
Identity Columns: The Modern Approach
Identity columns, introduced in PostgreSQL 10, are the recommended way to handle auto-incrementing columns. They provide better standards compliance and clearer semantics.
Basic Identity Column Usage
CREATE TABLE id_example (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name text
);
-- Insert works without specifying id
INSERT INTO id_example (name) VALUES ('John');
-- Attempting manual id insert fails
INSERT INTO id_example (id, name) VALUES (12, 'John'); -- Error
CREATE TABLE id_example (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name text
);
-- Insert works without specifying id
INSERT INTO id_example (name) VALUES ('John');
-- Attempting manual id insert fails
INSERT INTO id_example (id, name) VALUES (12, 'John'); -- Error
Overriding Identity Values
Sometimes you need to insert specific ID values. There are two approaches:
- Using OVERRIDING SYSTEM VALUE:
-- Override system-generated values (use with caution)
-- This will create a row with ID 16, but can cause sequence misalignment
INSERT INTO id_example (id, name)
OVERRIDING SYSTEM VALUE
VALUES (16, 'Abhi');
-- After manual inserts, the sequence continues from its last value
-- This can lead to unique constraint violations when the sequence
-- catches up to manually inserted values
-- To prevent conflicts, always realign the sequence after manual inserts to max id.
-- Override system-generated values (use with caution)
-- This will create a row with ID 16, but can cause sequence misalignment
INSERT INTO id_example (id, name)
OVERRIDING SYSTEM VALUE
VALUES (16, 'Abhi');
-- After manual inserts, the sequence continues from its last value
-- This can lead to unique constraint violations when the sequence
-- catches up to manually inserted values
-- To prevent conflicts, always realign the sequence after manual inserts to max id.
- Using GENERATED BY DEFAULT: A More Flexible Alternative, while GENERATED ALWAYS enforces strict system control over ID generation, GENERATED BY DEFAULT provides more flexibility. It allows manual ID insertion when needed while still automatically generating values when none are specified. This is particularly useful in scenarios like:
- Data migration where you need to preserve existing IDs
- Integration with legacy systems that require specific ID values
- Testing scenarios where predetermined IDs are necessary
However, this flexibility comes with responsibility - you must carefully manage ID assignments to avoid conflicts. Here's how it works:
CREATE TABLE id_example (
id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
name text
);
-- Now manual inserts work without override
INSERT INTO id_example (id, name) VALUES (12, 'John');
CREATE TABLE id_example (
id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
name text
);
-- Now manual inserts work without override
INSERT INTO id_example (id, name) VALUES (12, 'John');
Fixing Sequence Gaps
When manually inserting IDs, you might need to realign the sequence:
-- Get the sequence name
SELECT pg_get_serial_sequence('id_example', 'id');
-- Reset sequence to maximum ID
SELECT setval(
'public.id_example_id_seq',
(SELECT max(id) FROM id_example)
);
-- Get the sequence name
SELECT pg_get_serial_sequence('id_example', 'id');
-- Reset sequence to maximum ID
SELECT setval(
'public.id_example_id_seq',
(SELECT max(id) FROM id_example)
);
Best Practices and Recommendations
When implementing identity management in PostgreSQL, use IDENTITY columns for new tables while maintaining consistency with existing approaches in legacy systems. Always opt for BIGINT data type to future-proof your applications, unless storage constraints dictate otherwise.
Common Pitfalls to Avoid
- Don't Reset Production Sequences:
- Resetting sequences in production can lead to unique constraint violations
- Always use
setval()
with care
- Mind the Gaps:
- Sequence gaps are normal and expected
- Don't rely on sequential numbering for business logic
- Transaction Handling:
- Remember that
nextval()
calls are never rolled back - Plan for gaps in sequences during transaction rollbacks
- Remember that
Conclusion
Identity management in PostgreSQL has evolved from serial types to identity columns, offering improved standards compliance and clearer semantics. While sequences remain the foundational building block, identity columns provide the best approach for new development. Understanding these mechanisms helps in designing robust and scalable database schemas.
Remember:
- Use IDENTITY columns for new development
- Understand sequences for maintenance and troubleshooting
- Consider BIGINT for future-proofing
- Monitor sequence usage in production
- Plan for gaps and concurrent access
🎫
Do you have any questions, or simply wish to contact me privately? Don't hesitate to shoot me a DM on Twitter.
Have a wonderful day.
Abhishek 🙏