PostgreSQL Deep Dive Part 14 — Understanding Index Storage — Files, Structure, and Management
Ever wondered what actually happens when you create an index in PostgreSQL? While we often think about indexes abstractly as "making queries faster," there's a fascinating world of file management, storage structures, and system catalogs working behind the scenes. Let's dive deep into how PostgreSQL manages index storage at the filesystem level.
Published At
1/6/2025
Reading Time
~ 2 min read
Access complete, unfiltered study notes maintained on GitHub for easily maintaing Second Brain Dir
The Anatomy of Index Storage 🗄️
When you create an index in PostgreSQL, you're not just making an entry in a system table — you're initiating a complex series of filesystem operations. Let's break this down step by step.
File Location and Structure
-- First, let's see where PostgreSQL stores its files
SHOW data_directory;
-- Create a sample table and index
CREATE TABLE products (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name TEXT,
price NUMERIC
);
CREATE INDEX idx_products_name ON products(name);
-- Find the physical location of our index
SELECT pg_relation_filepath('idx_products_name');
-- First, let's see where PostgreSQL stores its files
SHOW data_directory;
-- Create a sample table and index
CREATE TABLE products (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name TEXT,
price NUMERIC
);
CREATE INDEX idx_products_name ON products(name);
-- Find the physical location of our index
SELECT pg_relation_filepath('idx_products_name');
The storage hierarchy looks like this:
$PGDATA/
└── base/
└── $DATABASE_OID/
├── $TABLE_FILENODE
├── $INDEX1_FILENODE
└── $INDEX2_FILENODE
$PGDATA/
└── base/
└── $DATABASE_OID/
├── $TABLE_FILENODE
├── $INDEX1_FILENODE
└── $INDEX2_FILENODE
File Management and Object IDs 🔍
PostgreSQL assigns unique Object IDs (OIDs) to manage these files. Let's explore this system:
-- Examining object IDs and file relationships
SELECT
c.relname as object_name,
c.relkind as object_type,
c.relfilenode as file_node,
c.oid as object_id,
pg_relation_filepath(c.oid) as file_path,
pg_size_pretty(pg_relation_size(c.oid)) as file_size
FROM pg_class c
WHERE c.relname IN ('products', 'idx_products_name');
-- Examining object IDs and file relationships
SELECT
c.relname as object_name,
c.relkind as object_type,
c.relfilenode as file_node,
c.oid as object_id,
pg_relation_filepath(c.oid) as file_path,
pg_size_pretty(pg_relation_size(c.oid)) as file_size
FROM pg_class c
WHERE c.relname IN ('products', 'idx_products_name');
Understanding the output:
relname
: Object name (table or index)relkind
: Object type ('r' for table, 'i' for index)relfilenode
: Actual filename on diskoid
: Object identifier in the system catalog
This deep understanding of PostgreSQL's index storage mechanisms is crucial and can help you in many ways.
Remember: Every index you create has real storage implications. Understanding these helps make better decisions about index creation and maintenance strategies.
❖
Do you have any questions, or simply wish to contact me privately? Don't hesitate to shoot me a DM on Twitter.
Have a wonderful day.
Abhishek 🙏