👋
Welcome to my blog!

PostgreSQL Deep Dive Part 15 — EXPLAIN - Understanding Query Execution Plans

Query performance optimization is a critical aspect of database management, and PostgreSQL provides powerful tools to understand how queries are executed. We'll explore PostgreSQL's EXPLAIN command, its output structure, various scan methods, and how to interpret performance metrics.

PostgreSQL Deep Dive Part 15 — EXPLAIN - Understanding Query Execution Plans
Database
Postgres
Development

Published At

1/18/2025

Reading Time

~ 5 min read

📚 Raw Postgres Study Notes

Access complete, unfiltered study notes maintained on GitHub for easily maintaing Second Brain Dir

Understanding EXPLAIN Command Structure

The EXPLAIN command in PostgreSQL reveals the execution plan that the PostgreSQL planner generates for a given query. Let's start with basic examples and progressively analyze more complex scenarios:

sql
EXPLAIN SELECT * FROM users;
sql
EXPLAIN SELECT * FROM users;

This basic query will produce an output containing several key components:

  • The scan method chosen
  • Estimated startup and total cost
  • Estimated number of rows
  • Estimated average width of each row

Let's look at how the plan changes with different query conditions:

sql
EXPLAIN SELECT * FROM users LIMIT 10;
 
EXPLAIN SELECT * FROM users
WHERE first_name = 'Abhi'
LIMIT 10;
sql
EXPLAIN SELECT * FROM users LIMIT 10;
 
EXPLAIN SELECT * FROM users
WHERE first_name = 'Abhi'
LIMIT 10;

The execution plan shows how PostgreSQL decides to retrieve the data, whether through sequential scans, index scans, or other methods. Each node in the plan represents an operation, with costs and row estimates provided for each step.

Understanding Different Scan Methods

PostgreSQL employs various scanning methods to retrieve data efficiently. Let's explore each type in detail:

1. Index Only Scan (Most Efficient)

This is the optimal scanning method where PostgreSQL can satisfy the query using only the information stored in the index, without accessing the table heap.

When it occurs:

  • All required columns are present in the index
  • No additional columns need to be fetched from the table
  • The WHERE clause references only indexed columns

Example scenario:

sql
CREATE INDEX idx_email ON users(email);
EXPLAIN SELECT email FROM users WHERE email = 'user@example.com';
sql
CREATE INDEX idx_email ON users(email);
EXPLAIN SELECT email FROM users WHERE email = 'user@example.com';

2. Index Scan

A standard index scan occurs when PostgreSQL uses an index to locate specific rows but needs to fetch additional columns from the table.

Characteristics:

  • Uses the index to find relevant rows
  • Requires additional heap fetches for non-indexed columns
  • Generally efficient for selective queries

3. Bitmap Index Scan

This method is used when PostgreSQL needs to access a larger number of rows but not the entire table.

Process:

  1. Scans the index and creates a bitmap in memory
  2. Sorts the heap accesses in physical order
  3. Retrieves the actual rows

This method is particularly efficient when:

  • Multiple indexes need to be combined
  • A moderate number of rows need to be retrieved
  • The rows are scattered across the table

4. Sequential Scan

A sequential scan reads the entire table in physical order.

When it's used:

  • No suitable indexes are available
  • A large portion of the table needs to be read
  • The table is small enough that sequential scanning is faster than index usage

Understanding Cost and Row Metrics

PostgreSQL's EXPLAIN provides several key metrics to understand query performance:

1. Startup Cost

  • Initial cost before the first row can be returned
  • Includes index traversal time
  • Important for queries that need to sort or aggregate data

2. Total Cost

  • Estimated total cost to complete the query
  • Includes all I/O and processing costs
  • Used by the planner to compare different execution strategies

3. Rows

  • Estimated number of rows that will be output
  • Based on table statistics
  • Critical for understanding query selectivity

4. Width

  • Estimated average width of rows in bytes
  • Helps understand memory usage
  • Important for optimizing data retrieval

EXPLAIN ANALYZE: Getting Real Execution Statistics

While EXPLAIN shows estimated costs and row counts, EXPLAIN ANALYZE actually executes the query and provides real timing and row counts.

sql
EXPLAIN ANALYZE SELECT * FROM users WHERE email < 'b';
sql
EXPLAIN ANALYZE SELECT * FROM users WHERE email < 'b';

Important considerations:

  • EXPLAIN ANALYZE actually executes the query
  • Use with caution on UPDATE, DELETE, or INSERT statements
  • Provides actual timing and row counts
  • Shows buffer usage and I/O statistics

The output includes:

  • Actual time spent at each node
  • Actual number of rows processed
  • Number of loops executed
  • Memory usage statistics

Example output analysis:

txt
Seq Scan on users  (cost=0.00..1638.00 rows=50000 width=90)
  (actual time=0.016..8.571 rows=49999 loops=1)
  Filter: (email < 'b'::text)
  Rows Removed by Filter: 1
Planning Time: 0.083 ms
Execution Time: 11.752 ms
txt
Seq Scan on users  (cost=0.00..1638.00 rows=50000 width=90)
  (actual time=0.016..8.571 rows=49999 loops=1)
  Filter: (email < 'b'::text)
  Rows Removed by Filter: 1
Planning Time: 0.083 ms
Execution Time: 11.752 ms

This shows:

  • Estimated vs actual row counts
  • Time spent planning vs executing
  • Filter effectiveness
  • Actual execution statistics

Understanding these metrics helps in:

  • Identifying query bottlenecks
  • Optimizing index usage
  • Improving query performance
  • Making informed decisions about database design

The EXPLAIN command and its various components provide invaluable insights into query execution and performance. By understanding these elements, developers and database administrators can make informed decisions about query optimization, index creation, and database design.

Remember that while EXPLAIN provides estimates based on statistics, EXPLAIN ANALYZE gives actual execution metrics but comes with the cost of actually running the query. Use these tools judiciously based on your specific needs and circumstances.

🏃‍♂️‍➡️

Do you have any questions, or simply wish to contact me privately? Don't hesitate to shoot me a DM on Twitter.

Have a wonderful day.
Abhishek 🙏

Join My Exclusive Newsletter Community

Step into a world where creativity intersects with technology. By subscribing, you'll get a front-row seat to my latest musings, full-stack development resources, and exclusive previews of future posts. Each email is a crafted experience that includes:

  • In-depth looks at my covert projects and musings to ignite your imagination.
  • Handpicked frontend development resources and current explorations, aimed at expanding your developer toolkit.
  • A monthly infusion of inspiration with my personal selection of quotes, books, and music.

Embrace the confluence of words and wonder, curated thoughtfully and sent straight to your inbox.

No fluff. Just the highest caliber of ideas.