Case Study
Recursive Hierarchy Builder
PostgreSQL recursive CTE with jsonb_agg — transforms a flat adjacency-list table into a nested JSON tree in a single query, no ORM round-trips.
- PostgreSQL
- Recursive CTE
- jsonb_build_object
- Window Functions
- sql
- postgresql
- cte
- json
- database
- enterprise
Overview
This project demonstrates senior-level relational database expertise. Instead of relying on an application-level ORM (like Prisma or Hibernate) to fetch flat data and sort it via heavy CPU operations in memory, this implementation performs the entire hierarchical compilation natively within the database engine.
It uses a Recursive Common Table Expression (CTE) paired with PostgreSQL’s modern JSON aggregation functions (jsonb_build_object and jsonb_agg) to transform a flat, adjacency-list table into a deeply nested, infinite-depth JSON tree structure in a single query.
Part of the DIY build series: Project 1 · Project 2 · Project 3.
What it implements
- Adjacency-list schema — standard enterprise pattern for org charts, CRM folders, and inventory trees
- Bottom-up recursive CTE — leaf nodes first, then progressive roll-up into parents
- Native JSON assembly —
jsonb_build_object+jsonb_aggwithout application-layer serialization - Window-function deduplication —
ROW_NUMBER() OVER (...)isolates the highest-depth state per node - Indexed foreign key —
parent_idindex to avoid sequential scans during recursion
Project setup & execution
Run this script inside any standard PostgreSQL instance — local Docker, Supabase, or an online workspace like db-fiddle.
Create hierarchy_demolition.sql or paste the entire block into your database terminal:
-- ============================================================================
-- 1. SCHEMA ISOLATION & TABLE CREATION
-- ============================================================================
-- We use a flat "Adjacency List" model, which is the standard enterprise way
-- to store hierarchies (inventories, org charts, or multi-tenant CRM folders).
-- ============================================================================
DROP TABLE IF EXISTS categories CASCADE;
CREATE TABLE categories (
id SERIAL PRIMARY KEY,
name VARCHAR(50) NOT NULL,
parent_id INT REFERENCES categories(id) ON DELETE CASCADE
);
-- Index the foreign key to prevent heavy sequential scans during recursive joins
CREATE INDEX idx_categories_parent_id ON categories(parent_id);
-- ============================================================================
-- 2. MOCK DATA INSERTION (Deeply Nested Hierarchy)
-- ============================================================================
-- Structure:
-- Hardware
-- ├── Components
-- │ └── Processors
-- └── Peripherals
-- └── Input Devices
-- ├── Keyboards
-- └── Mice
-- ============================================================================
INSERT INTO categories (id, name, parent_id) VALUES
(1, 'Hardware', NULL), -- Root Node
(2, 'Components', 1), -- Level 1
(3, 'Peripherals', 1), -- Level 1
(4, 'Processors', 2), -- Level 2
(5, 'Input Devices', 3), -- Level 2
(6, 'Keyboards', 5), -- Level 3
(7, 'Mice', 5); -- Level 3
SELECT setval('categories_id_seq', (SELECT MAX(id) FROM categories));
-- ============================================================================
-- 3. THE MASTER RECURSIVE JSON CTE QUERY
-- ============================================================================
WITH RECURSIVE tree_builder AS (
-- ANCHOR MEMBER: Find the lowest leaf nodes (nodes that are never parents)
SELECT
c.id,
c.name,
c.parent_id,
1 AS max_depth,
jsonb_build_object(
'id', c.id,
'name', c.name,
'children', '[]'::jsonb
) AS json_node
FROM categories c
LEFT JOIN categories child ON c.id = child.parent_id
WHERE child.id IS NULL
UNION ALL
-- RECURSIVE MEMBER: Roll up child JSON objects into parent objects
SELECT
parent.id,
parent.name,
parent.parent_id,
t.max_depth + 1,
jsonb_build_object(
'id', parent.id,
'name', parent.name,
'children', jsonb_agg(t.json_node)
)
FROM categories parent
JOIN tree_builder t ON parent.id = t.parent_id
GROUP BY parent.id, parent.name, parent.parent_id, t.max_depth
),
ranked_nodes AS (
SELECT
id,
parent_id,
json_node,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY max_depth DESC) as rn
FROM tree_builder
)
SELECT jsonb_agg(json_node) AS nested_category_tree
FROM ranked_nodes
WHERE rn = 1 AND parent_id IS NULL;
Expected query output
Running the query yields a single JSON document containing the entire multi-tiered hierarchy:
[
{
"id": 1,
"name": "Hardware",
"children": [
{
"id": 2,
"name": "Components",
"children": [
{
"id": 4,
"name": "Processors",
"children": []
}
]
},
{
"id": 3,
"name": "Peripherals",
"children": [
{
"id": 5,
"name": "Input Devices",
"children": [
{
"id": 6,
"name": "Keyboards",
"children": []
},
{
"id": 7,
"name": "Mice",
"children": []
}
]
}
]
}
]
}
]
Why this shines on a portfolio
- Massive compute optimization — most developers run N+1 recursive network queries back to the database. Doing this inside the engine demonstrates advanced database operations.
- Advanced language utilities — window functions, recursive CTE mechanics, and native JSON type manipulation in one query signal capable database engineering to interviewers and high-value clients.
- CRM-relevant pattern — same adjacency-list model used in Gnomad CRM multi-tenant folder and category structures.
Related work
- Project 3: Legacy Flat-File Bridge — enterprise data transformation in Java
- Gnomad CRM — multi-tenant architecture where hierarchical queries matter
- Project 1: Markdown API Router — JSON API layer that could consume this tree output directly
- Project 5: Concurrent Link Sweeper — validate links across the content this query serves
- Project 6: Local Context Server — agent tool gateway for homelab monitoring
- Project 7: Multi-Tenant Migration Engine — fleet-wide schema migrations for tenant isolation