Database Administration Portfolio Project

PostgreSQL DBA
End-to-End Project

An end-to-end PostgreSQL database administration project I built using real-world public healthcare data from CMS. I designed the schema, loaded 9.6 million records, diagnosed performance problems, optimized queries by up to 4,136×, implemented a backup and recovery strategy, and secured the database with role-based access control, row-level security, and audit logging.

PostgreSQL 18.2 Schema Design Performance Tuning Backup & Recovery Security & RBAC CMS 2023 Data

Explore the Project Source Code

Project Overview

What This Project Demonstrates

The Goal

I wanted to build a PostgreSQL project that goes beyond writing queries — one that demonstrates the full scope of what a Database Administrator does in a real production environment. Schema design, bulk data loading, performance baselining, index optimization, backup and recovery, monitoring, and security aren't separate skills. They're a discipline, and this project treats them that way.

The Dataset

I used the CMS Medicare Physician & Other Practitioners dataset for 2023 — the same family of public data I worked with in my Medicare pipeline project. The dataset contains one row per provider per procedure code, capturing utilization and payment information for Medicare fee-for-service claims. At 3 GB raw and 9.6 million rows, it's large enough to make performance decisions matter.

9.6M

Service Rows Loaded

1.17M

Unique Providers

5.5 GB

Database Size

Project Phases

Performance Results

Before & After Optimization

I ran the same five queries before and after adding indexes and tuning configuration. The results below are measured execution times from EXPLAIN ANALYZE — real numbers, not estimates.

Query	Description	Before	After	Result
Q1	Provider lookup by NPI	1 ms	0.053 ms	19× faster
Q2	Services for a specific provider	455 ms	0.110 ms	4,136× faster
Q3	Top 10 providers by payments	10,701 ms	8,795 ms	18% faster
Q4	Services by state & provider type	25,190 ms	25,082 ms	Marginal
Q5	High value procedures	9,821 ms	5,527 ms	44% faster

Key Findings

What the Project Actually Shows

4,136×

A single missing index turned a 455ms sequential scan into a 0.11ms index lookup. PostgreSQL was launching 2 parallel workers and scanning 3.2 million rows to return 7 records — because no index existed on rndrng_npi in the services table. One CREATE INDEX statement fixed it.

85%

pg_dump compressed a 5.5 GB database down to 845 MB. The custom format backup restored cleanly with all 9.6M rows and all 7 indexes intact. A backup that has never been tested is not a backup — the restore test is documented in full.

54,351

Row-level security restricted the application user to 54,351 of 1,175,281 providers. The RLS policy limits app_user to Virginia and Maryland only — enforced transparently by the database engine regardless of how the query is written.

Not every query can be dramatically improved — and documenting that honestly matters. The state/provider-type aggregation processes all 9.6M rows with no selective filter. Indexes cannot eliminate the cost of reading every row. Setting work_mem too high actually made this query slower by suppressing parallel execution.

Technologies & Tools

Built with PostgreSQL 18

PostgreSQL 18.2 Core database engine

DataGrip Primary query tool

pg_dump Logical backup

pg_basebackup Physical cluster backup

pg_stat_statements Query statistics

EXPLAIN ANALYZE Execution plan analysis

Row-Level Security Data visibility control

Git & GitHub Version control

PowerShell CLI & automation

VS Code Script authoring

PostgreSQL DBA
End-to-End Project

What This Project Demonstrates

The Goal

The Dataset

Five Phases, One Complete Project

Before & After Optimization

What the Project Actually Shows

Built with PostgreSQL 18

Two Ways In

PostgreSQL DBAEnd-to-End Project

What This Project Demonstrates

The Goal

The Dataset

Five Phases, One Complete Project

Before & After Optimization

What the Project Actually Shows

Built with PostgreSQL 18

Two Ways In

PostgreSQL DBA
End-to-End Project