Phase 02 — Baseline Measurement & Diagnostics

Measuring Before Optimizing

Before touching a single configuration setting or adding a single index, I measured everything. Table sizes, index inventory, cache hit ratios, sequential scan patterns, and execution plans for five representative queries. These numbers are the baseline that every Phase 3 improvement gets measured against.

pg_stat_statements EXPLAIN ANALYZE Sequential Scan Analysis Cache Hit Ratios Index Inventory

Storage Baseline

Database & Table Sizes

What I Measured First

The first thing I did was establish storage baselines — total database size, individual table sizes, and the index inventory. This tells me what I'm working with before any optimization decisions are made.

Table	Total Size	Table Size	Index Size
staging_raw	3,305 MB	3,304 MB	0 bytes
provider_services	1,772 MB	1,564 MB	207 MB
providers	298 MB	263 MB	35 MB

The key observation here: both production tables have only their primary key indexes. No analytical indexes exist on any filter, join, or group-by columns. Every analytical query is going to do a full sequential scan.

Screenshot — Index inventory before optimization

Index inventory showing only two primary key indexes exist

Only two indexes exist — both primary keys. No analytical indexes anywhere in the cms schema.

Diagnostics

Sequential Scan Analysis

The Numbers Don't Lie

I queried pg_stat_user_tables to see how many sequential scans each table had accumulated and how many rows those scans read. The results confirmed the missing index problem.

Screenshot — Sequential scan counts per table

provider_services had 20 sequential scans reading over 106 million rows — a clear signal of missing indexes

106M

Rows read by sequential scan on provider_services. 20 sequential scans reading 106 million rows with only 2 index scans. Every analytical query was doing a full table scan through 9.6 million rows.

88.2%

Cache hit ratio on provider_services. Acceptable but 2.6 million disk reads is significant. The providers table was at 98.9% — almost entirely served from memory.

Baseline Query Performance

Five Queries, Unoptimized

Measuring the Problem

I ran five representative analytical queries using EXPLAIN ANALYZE to capture actual execution plans and timings. I also enabled pg_stat_statements to track cumulative query statistics across sessions.

Screenshot — pg_stat_statements baseline results

pg_stat_statements showing slowest queries before optimization

pg_stat_statements capturing the five baseline queries — Q4 at 25,190ms is the worst performer

Query 1

Provider lookup by NPI

1 ms

Fast — hits the primary key index on NPI. The only query currently supported by an index.

Query 2

Services for a specific provider

455 ms

Full parallel sequential scan — 2 workers scanning 3.2M rows to return 7 records. No index on rndrng_npi in provider_services.

Query 3

Top 10 providers by payments

10,701 ms

Full sequential scans on both tables. Hash aggregation spilling 228–257 MB per worker to disk.

Query 4

Services by state & provider type

25,190 ms

The slowest query. External merge sorts spilling 149–163 MB per worker to disk. All 9.6M rows processed.

Query 5

High value procedures

9,821 ms

Full sequential scan with external merge sorts spilling 252–295 MB per worker to disk for the HCPCS aggregation.

Diagnostic Summary

Root Causes Identified

What the Diagnostics Tell Me

The baseline measurements tell a clear story. Two root causes are responsible for all the performance problems I measured. Both are addressable — and both will be fixed in Phase 3.

Root 1

Missing indexes on filter, join, and group-by columns. No index exists on rndrng_npi in provider_services, on state_abbr, on provider_type, or on hcpcs_cd. Every analytical query is doing a full sequential scan regardless of how selective the filter is.

Root 2

Insufficient work_mem causing disk spill. The default 4MB work_mem is far too low for aggregations over 9.6 million rows. Sort and hash operations are spilling up to 307 MB per worker to disk — turning in-memory operations into disk I/O operations.

These two findings directly inform the optimization strategy in Phase 3 — targeted indexes first, then memory configuration tuning.

Phase 1 Phase 3 — Optimization