Overview Phase 1 — Schema & Load Phase 2 — Diagnostics Phase 3 — Optimization Phase 4 — Operations Phase 5 — Security
Phase 02 — Baseline Measurement & Diagnostics

Measuring Before Optimizing

Before touching a single configuration setting or adding a single index, I measured everything. Table sizes, index inventory, cache hit ratios, sequential scan patterns, and execution plans for five representative queries. These numbers are the baseline that every Phase 3 improvement gets measured against.

pg_stat_statements EXPLAIN ANALYZE Sequential Scan Analysis Cache Hit Ratios Index Inventory

Database & Table Sizes

What I Measured First

The first thing I did was establish storage baselines — total database size, individual table sizes, and the index inventory. This tells me what I'm working with before any optimization decisions are made.

Table Total Size Table Size Index Size
staging_raw3,305 MB3,304 MB0 bytes
provider_services1,772 MB1,564 MB207 MB
providers298 MB263 MB35 MB

The key observation here: both production tables have only their primary key indexes. No analytical indexes exist on any filter, join, or group-by columns. Every analytical query is going to do a full sequential scan.

Screenshot — Index inventory before optimization
Index inventory showing only two primary key indexes exist
Only two indexes exist — both primary keys. No analytical indexes anywhere in the cms schema.

Sequential Scan Analysis

The Numbers Don't Lie

I queried pg_stat_user_tables to see how many sequential scans each table had accumulated and how many rows those scans read. The results confirmed the missing index problem.

Screenshot — Sequential scan counts per table
Sequential scan analysis showing provider_services with 106 million rows read by sequential scans
provider_services had 20 sequential scans reading over 106 million rows — a clear signal of missing indexes
106M
Rows read by sequential scan on provider_services. 20 sequential scans reading 106 million rows with only 2 index scans. Every analytical query was doing a full table scan through 9.6 million rows.
88.2%
Cache hit ratio on provider_services. Acceptable but 2.6 million disk reads is significant. The providers table was at 98.9% — almost entirely served from memory.

Five Queries, Unoptimized

Measuring the Problem

I ran five representative analytical queries using EXPLAIN ANALYZE to capture actual execution plans and timings. I also enabled pg_stat_statements to track cumulative query statistics across sessions.

Screenshot — pg_stat_statements baseline results
pg_stat_statements showing slowest queries before optimization
pg_stat_statements capturing the five baseline queries — Q4 at 25,190ms is the worst performer
Query 1
Provider lookup by NPI
1 ms
Fast — hits the primary key index on NPI. The only query currently supported by an index.
Query 2
Services for a specific provider
455 ms
Full parallel sequential scan — 2 workers scanning 3.2M rows to return 7 records. No index on rndrng_npi in provider_services.
Query 3
Top 10 providers by payments
10,701 ms
Full sequential scans on both tables. Hash aggregation spilling 228–257 MB per worker to disk.
Query 4
Services by state & provider type
25,190 ms
The slowest query. External merge sorts spilling 149–163 MB per worker to disk. All 9.6M rows processed.
Query 5
High value procedures
9,821 ms
Full sequential scan with external merge sorts spilling 252–295 MB per worker to disk for the HCPCS aggregation.

Root Causes Identified

What the Diagnostics Tell Me

The baseline measurements tell a clear story. Two root causes are responsible for all the performance problems I measured. Both are addressable — and both will be fixed in Phase 3.

Root 1
Missing indexes on filter, join, and group-by columns. No index exists on rndrng_npi in provider_services, on state_abbr, on provider_type, or on hcpcs_cd. Every analytical query is doing a full sequential scan regardless of how selective the filter is.
Root 2
Insufficient work_mem causing disk spill. The default 4MB work_mem is far too low for aggregations over 9.6 million rows. Sort and hash operations are spilling up to 307 MB per worker to disk — turning in-memory operations into disk I/O operations.

These two findings directly inform the optimization strategy in Phase 3 — targeted indexes first, then memory configuration tuning.