# PostgreSQL Architect's Handbook

> A developer's code-time reference for PostgreSQL + Prisma on this boilerplate.
> Every section follows: **one sentence why → the pattern → code snippet**. Scan what you need.

**How to read this:**
- You're about to design a table → jump to **Part I** (mindset) and **Part III** (design patterns).
- You're writing a query → **Part IV** (query patterns) and **Part V** (performance).
- You're adding multi-tenancy or custom fields → **Part VI**.
- You're about to deploy a migration → **Part VII**.
- You're stuck on a tradeoff → **Part VIII** (decision trees).

---

## Table of contents

- [Part I — Mindset](#part-i--mindset)
- [Part II — PostgreSQL essentials](#part-ii--postgresql-essentials)
- [Part III — Design patterns (this boilerplate's conventions)](#part-iii--design-patterns-this-boilerplates-conventions)
- [Part IV — Query patterns](#part-iv--query-patterns)
- [Part V — Performance playbook](#part-v--performance-playbook)
- [Part VI — Multi-tenant + RBAC](#part-vi--multi-tenant--rbac)
- [Part VII — Production](#part-vii--production)
- [Part VIII — Decision trees](#part-viii--decision-trees)

---

## Part I — Mindset

### The 10 Designer Questions

**Why:** If you can answer these before touching `schema.prisma`, you won't ship a schema that needs rewriting in 6 months.

1. What data am I storing and why?
2. Who reads this data and how often?
3. Who writes this data and how often?
4. What are the relationships between entities?
5. What happens if this data is wrong?
6. How big will this grow in 1 year? 5 years?
7. What queries will run most often?
8. What data can never be lost?
9. Who should and should not see this data?
10. What business rules must the database enforce?

### The 10 Architect Rules (carry these forever)

**Why:** These are invariants. Violating any of them is a design smell worth a second look.

1. **Design before you code.** One hour of schema design prevents ten hours of production fixes.
2. **Every constraint protects the business.** `NOT NULL`, `CHECK`, `FK`, `UNIQUE` — don't skip them for convenience.
3. **Index for queries, not for tables.** Look at what queries you run; index what they filter on.
4. **Money lives in a transaction.** No exceptions, no "we'll be careful."
5. **A backup untested is not a backup.** It's optimism.
6. **Measure before you optimize.** `EXPLAIN ANALYZE` before any index, rewrite, or restructure.
7. **The database is the source of truth.** Apps have bugs. APIs get bypassed. The DB must enforce its own rules.
8. **`company_id` in every multi-tenant query is not optional.** One missing `WHERE` leaks everyone's data.
9. **Locks are short. Transactions are fast.** Long transactions block other users.
10. **The schema tells the story.** If it doesn't read like the business, you haven't modeled it yet.

---

## Part II — PostgreSQL essentials

### Data types — use these defaults

**Why:** The wrong column type causes silent bugs months later (timezone drift, precision loss, index skip).

| Use case | Type | Notes |
|---|---|---|
| Primary key (default) | `BIGSERIAL` / `BigInt @id @default(autoincrement())` | 64-bit, safe past 2 billion rows |
| Primary key (distributed / exposed in URLs) | `UUID` / `String @id @default(uuid())` | Use when IDs cross services or appear in public URLs |
| Money | `DECIMAL(p,s)` / `@db.Decimal(15,2)` | **Never `FLOAT` or `REAL`** — precision loss on arithmetic |
| Timestamps | `TIMESTAMPTZ` (Prisma `DateTime`) | Stores UTC, ignores any client-sent offset. **Never `TIMESTAMP` without TZ** |
| Calendar date (birthday, due date) | `DATE` / `@db.Date` | No time component; read Part IV on `@db.Date` pitfalls |
| Text (no cap needed) | `TEXT` | Use over `VARCHAR(n)` unless the cap is a business rule |
| Boolean | `BOOLEAN` | Not `SMALLINT` or `CHAR(1)` |
| JSON data | `JSONB` | **Never `JSON`** — no indexing support on plain JSON |
| IP address | `INET` | Validates format, supports subnet operators |
| Fixed enumeration | `ENUM` or `CHECK (x IN (...))` | Prefer `CHECK` for values that may evolve |

**Prisma example (this boilerplate's User model):**

```prisma
model User {
  id            BigInt    @id @default(autoincrement())
  email         String    @unique
  password_hash String
  is_active     Boolean   @default(true)
  avatar_key    String?                                // env-agnostic S3 key
  last_login_at DateTime?                              // TIMESTAMPTZ
  created_at    DateTime  @default(now())
  updated_at    DateTime  @updatedAt
  deleted_at    DateTime?

  @@index([deleted_at])
  @@map("users")
}
```

### Index types — decision table

**Why:** The wrong index type either doesn't help or makes writes slower for no benefit.

| Index | When to use | When NOT to use |
|---|---|---|
| **B-Tree** (default) | Equality (`=`), ranges (`<`, `>`, `BETWEEN`), sorting, `LIKE 'prefix%'` | Full-text search, JSONB keys, arrays |
| **GIN** | `JSONB`, `ARRAY`, full-text search, trigram (`pg_trgm`) | Pure equality on scalar columns — B-Tree is faster |
| **GiST** | Ranges (`daterange`, `tsrange`), geometric data, overlap queries | Standard equality/range on scalars |
| **BRIN** | Very large append-only tables (> 100M rows) where values correlate with physical order (e.g., `created_at`) | Small tables, unordered data |
| **Hash** | Pure equality only (no ranges, no sorting) | Almost never — B-Tree is comparable and more flexible |
| **Partial** | Index a subset (e.g., only `status = 'pending'`) | When you query across all values equally |
| **Covering** (`INCLUDE`) | High-frequency queries where the index can serve the whole row | Tables where writes dominate reads |

```sql
-- Defaults (B-Tree)
CREATE INDEX idx_users_email ON users(email);

-- JSONB — GIN on the whole document
CREATE INDEX idx_products_attrs ON products USING GIN (attributes);

-- JSONB — expression index on one key (faster for that key specifically)
CREATE INDEX idx_products_color ON products ((attributes->>'color'));

-- Partial — index only the rows you query most
CREATE INDEX idx_orders_pending ON orders(created_at) WHERE status = 'pending';

-- Composite — leftmost column must be in WHERE
CREATE INDEX idx_orders_status_date ON orders(status, order_date);

-- Covering — include extra columns so the table isn't read
CREATE INDEX idx_users_email_cover ON users(email) INCLUDE (full_name, is_active);

-- BRIN — massive log table
CREATE INDEX idx_audit_created_brin ON audit_log USING BRIN (created_at);
```

### JSONB patterns

**Why:** JSONB is how you escape schema rigidity without dropping into NoSQL. But queries against it need the right operator + index or they silently scan the whole table.

**Operators:**

| Operator | Returns | Example |
|---|---|---|
| `->` | JSONB (nested object) | `attributes->'color'` → `"D"` (as JSONB) |
| `->>` | TEXT (cast from scalar) | `attributes->>'color'` → `D` (as text) |
| `@>` | Contains (can use GIN) | `attributes @> '{"color":"D"}'` |
| `?` | Key exists | `attributes ? 'cert'` |
| `#>` / `#>>` | Deep path | `data#>>'{addr,city}'` |

**Prisma:**

```ts
// Write
await prisma.product.update({
  where: { id },
  data: {
    attributes: { color: "D", clarity: "VVS1", cert: "GIA" },
  },
});

// Read — Prisma's JSON filter uses @> under the hood
await prisma.product.findMany({
  where: {
    attributes: { path: ["color"], equals: "D" },
  },
});
```

**Index choice:**
- Equality on **one specific key** only → expression index: `CREATE INDEX ... ON products ((attributes->>'color'))`
- Containment queries (`@>`) across many keys → GIN: `CREATE INDEX ... ON products USING GIN (attributes)`

**When to move a JSONB key out to a column:** the moment you filter or sort by it in > 2 places — it's a first-class attribute, not metadata.

### Constraints — the database is the guardian

**Why:** Apps have bugs, APIs get bypassed, direct DB access happens. The constraint catches it regardless.

```prisma
model Diamond {
  id             BigInt   @id @default(autoincrement())
  lot_number     String   @unique                                    // UNIQUE
  weight_carat   Decimal  @db.Decimal(8, 3)                          // precision
  color          String                                              // required (NOT NULL)
  purchase_price Decimal  @db.Decimal(15, 2)
  status         String   @default("in_stock")

  @@map("diamonds")
}
```

```sql
-- CHECK constraints (add via raw migration — Prisma supports them via raw SQL)
ALTER TABLE diamonds
  ADD CONSTRAINT chk_weight_positive CHECK (weight_carat > 0),
  ADD CONSTRAINT chk_color_valid    CHECK (color IN ('D','E','F','G','H','I','J')),
  ADD CONSTRAINT chk_status_valid   CHECK (status IN ('in_stock','on_memo','sold','returned'));
```

**Rule of thumb:**
- `NOT NULL` — always add for required fields. Optional fields stay nullable.
- `UNIQUE` — every field where duplicates are a business bug (email, invoice number, license plate).
- `CHECK` — status enums, positive-only numbers, bounded ranges.
- `FK` — every reference to another table, with a deliberate `ON DELETE` choice (see Part III).

### Transactions + isolation levels

**Why:** Multi-step writes must be atomic or you lose money, orphan orders, double-charge cards.

**Prisma `$transaction()`:**

```ts
// Sequential (each statement in order; all or nothing)
await prisma.$transaction(async (tx) => {
  await tx.account.update({ where: { id: from }, data: { balance: { decrement: 100 } } });
  await tx.account.update({ where: { id: to   }, data: { balance: { increment: 100 } } });
  await tx.transfer.create({ data: { from_id: from, to_id: to, amount: 100 } });
});

// Parallel array (same all-or-nothing, but statements run concurrently)
await prisma.$transaction([
  prisma.userRole.deleteMany({ where: { user_id } }),
  prisma.userRole.create({ data: { user_id, role_id } }),
]);
```

**Isolation levels (PostgreSQL default: `READ COMMITTED`):**

| Level | Use when | Risk |
|---|---|---|
| `READ COMMITTED` | Default — most business reads/writes | Non-repeatable reads possible |
| `REPEATABLE READ` | Multi-step reads that must see a consistent snapshot | Slight concurrency cost |
| `SERIALIZABLE` | Money transfers, inventory decrements, anything where double-spending would be catastrophic | Lowest concurrency — plan for retries |

```ts
await prisma.$transaction(
  async (tx) => { /* ... */ },
  { isolationLevel: Prisma.TransactionIsolationLevel.Serializable },
);
```

### Locks + deadlock prevention

**Why:** Two transactions grabbing the same rows in opposite orders = deadlock → one dies with an error → your API 500s if you don't retry.

**Locking row(s) you intend to update:**

```sql
BEGIN;
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;
```

```ts
// Prisma
await prisma.$queryRaw`SELECT balance FROM accounts WHERE id = ${id} FOR UPDATE`;
```

**Deadlock prevention rule:** Always lock rows **in a deterministic order** across every transaction that touches them.

```ts
// Always lock the lower ID first — never lock in "source then destination" order
const [first, second] = from < to ? [from, to] : [to, from];
await prisma.$transaction(async (tx) => {
  await tx.$queryRaw`SELECT id FROM accounts WHERE id = ${first}  FOR UPDATE`;
  await tx.$queryRaw`SELECT id FROM accounts WHERE id = ${second} FOR UPDATE`;
  // ... then do the transfer
});
```

**Golden rule:** transactions are **short**. Acquire the lock, do the work, commit. Long transactions hold locks → other users wait → cascading slowdowns.

### Views vs materialized views

**Why:** Views abstract complex queries; materialized views cache expensive ones at the cost of staleness.

| Use | View | Materialized view |
|---|---|---|
| Performance | Same as underlying query | Pre-computed on disk — fast |
| Data freshness | Always live | Stale until refreshed |
| Refresh cost | N/A | Full table rebuild (use `CONCURRENTLY`) |
| When to use | Security (expose subset), simplify complex JOINs across codebase | Dashboards, reports where a few minutes stale is fine |
| When NOT to use | Performance-critical large aggregations | Real-time accuracy (balances, live inventory) |

```sql
-- View (query runs every time)
CREATE VIEW active_users AS
  SELECT id, email, full_name FROM users
  WHERE deleted_at IS NULL AND is_active = true;

-- Materialized view (cached; refresh on schedule)
CREATE MATERIALIZED VIEW monthly_revenue AS
  SELECT date_trunc('month', order_date) AS month,
         SUM(amount) AS revenue,
         COUNT(*)    AS orders
  FROM orders
  GROUP BY month;

-- Refresh without blocking reads (requires a UNIQUE index on the mat view)
CREATE UNIQUE INDEX idx_mv_monthly_month ON monthly_revenue(month);
REFRESH MATERIALIZED VIEW CONCURRENTLY monthly_revenue;
```

---

## Part III — Design patterns (this boilerplate's conventions)

### Primary key default: `BigInt`

**Why:** 64-bit integers are fast, compact in indexes, and safe well past any realistic row count. UUIDs add 4× storage and slow down joins.

```prisma
model User {
  id BigInt @id @default(autoincrement())
  // ...
}
```

**Switch to UUID when:** IDs cross service boundaries (public API, distributed write sources, ID-in-URL for privacy).

```prisma
model ExternalRequest {
  id         String   @id @default(uuid()) @db.Uuid
  received_from String
  // ...
}
```

### Audit block — on every mutable model

**Why:** You'll want to know "when was this row last changed?" on every table, eventually. Add it upfront.

```prisma
model Anything {
  // ... fields ...
  created_at DateTime  @default(now())
  updated_at DateTime  @updatedAt
  deleted_at DateTime?

  @@index([deleted_at])
}
```

**Read the index as:** "every list query in this table filters by `deleted_at IS NULL`, so make sure that filter is fast."

### Soft-delete — enforced in every list query

**Why:** If you ever need to "undelete," audit, or restore, soft-delete is the difference between a 2-minute fix and a DB restore from backup.

```ts
// EVERY findMany/count/aggregate — include this
const users = await prisma.user.findMany({
  where: { deleted_at: null, /* other filters */ },
});
```

**Delete = set the timestamp:**

```ts
await prisma.user.update({
  where: { id },
  data: { deleted_at: new Date(), is_active: false },
});
```

**Actually-delete only when:** legal requirement (GDPR right to erasure), or the row is junk (failed background job row, expired token).

### Foreign key `ON DELETE` — decision table

**Why:** The wrong choice either silently loses data or blocks a legitimate user action.

| Relationship | Choice | Example |
|---|---|---|
| Child is owned by parent; dies with parent | `Cascade` | `UserRole` rows when a `User` is deleted |
| Parent is a lookup / reference | `Restrict` | `Order.customer_id` — never delete a customer with orders |
| Child should survive parent, with the ref becoming null | `SetNull` | `Task.assigned_to` when a user is deleted (task stays; assignee becomes null) |

```prisma
model UserRole {
  user_id BigInt
  role_id BigInt

  user User @relation(fields: [user_id], references: [id], onDelete: Cascade)
  role Role @relation(fields: [role_id], references: [id], onDelete: Restrict)

  @@unique([user_id, role_id])
  @@map("user_roles")
}
```

**Financial data = always `Restrict`.** You never accidentally delete a customer with real transactions.

### Naming conventions (reinforcing CLAUDE.md §5)

**Why:** Consistency makes every query readable in 2 years, by anyone on the team.

| Element | Convention | Example |
|---|---|---|
| DB table | `snake_case_plural` | `user_roles`, `order_items` |
| DB column | `snake_case` | `created_at`, `purchase_price` |
| Prisma model | `PascalCase` + `@@map("snake_case_table")` | `UserRole` → `@@map("user_roles")` |
| Prisma field | `camelCase` with `@map("snake_case")` **when they differ** (prefer matching exactly) | `userId Int @map("user_id")` OR just `user_id Int` |
| Primary key | `id` | `id BigInt @id` |
| Foreign key | `{referenced_table_singular}_id` | `customer_id`, `role_id` |
| Timestamps | `created_at`, `updated_at`, `deleted_at` | (suffix `_at`) |
| Booleans | `is_` or `has_` prefix | `is_active`, `has_discount` |

**This boilerplate uses snake_case for Prisma fields too** (not camelCase + `@map`) — fewer mappings, simpler mental model. See existing models in `prisma/schema.prisma`.

### Normalize first, denormalize with evidence

**Why:** Normalization prevents update anomalies. Denormalization is a performance treatment, not a design style.

**Rule:**
1. Start fully normalized (3NF). Every attribute depends only on the primary key.
2. Ship it. Measure.
3. If reads are slow, **add an index first**.
4. If still slow and you've measured the JOIN cost is the real issue, then denormalize (store a snapshot, add a derived column, or introduce a materialized view).
5. Document the denormalization and the staleness budget.

**Never denormalize "just in case" before measuring.**

---

## Part IV — Query patterns

### `SELECT *` is banned in production

**Why:** More data over the wire, breaks if a column is added, prevents Index-Only Scans.

```ts
// BAD
await prisma.user.findMany();  // selects everything

// GOOD — explicit select
await prisma.user.findMany({
  select: { id: true, email: true, full_name: true },
});
```

### UPDATE / DELETE safety — `SELECT` first

**Why:** One missing `WHERE` and you've updated every row. This is the most common "I broke production" story.

```sql
-- Step 1: confirm scope
SELECT id, email FROM users WHERE email LIKE '%test%';

-- Step 2: if the set looks right, then
DELETE FROM users WHERE email LIKE '%test%';
```

**In code:** Prisma's `deleteMany` / `updateMany` require an explicit `where`. Don't defeat it by passing an empty object.

### Pagination — keyset > OFFSET past page ~100

**Why:** `OFFSET 10000` reads and discards 10,000 rows every time.

**OFFSET pagination** (fine for < 100 pages):

```ts
await prisma.user.findMany({
  where: { deleted_at: null },
  orderBy: { id: "asc" },
  skip: (page - 1) * pageSize,
  take: pageSize,
});
```

**Keyset pagination** (constant-time regardless of depth):

```ts
// First page
const first = await prisma.user.findMany({
  where: { deleted_at: null },
  orderBy: { id: "asc" },
  take: 20,
});

// Next page — pass the last id from the previous page
const nextCursor = first.at(-1)?.id;
const next = await prisma.user.findMany({
  where: { deleted_at: null },
  orderBy: { id: "asc" },
  cursor: { id: nextCursor },
  skip: 1,         // skip the cursor row itself
  take: 20,
});
```

**Tradeoff:** keyset can't jump to page 500 directly — only forward/backward. For infinite scroll / APIs, this is actually the correct UX.

### JOIN indexing rule

**Why:** Unindexed foreign keys = sequential scans on every JOIN = linear slowdown.

**Rule:** Every FK column gets an index. Always. No exceptions.

```prisma
model Order {
  id          BigInt @id @default(autoincrement())
  customer_id BigInt
  customer    Customer @relation(fields: [customer_id], references: [id])

  @@index([customer_id])    // <-- always
}
```

### `WHERE` vs `HAVING` — the most-confused rule

**Why:** `WHERE` filters rows; `HAVING` filters groups. Using one where the other belongs changes the result — or silently hurts performance.

```sql
-- WHERE filters BEFORE grouping (uses indexes)
-- HAVING filters AFTER grouping (works on aggregate results)
SELECT customer_id, COUNT(*) AS total_orders
FROM orders
WHERE status = 'completed'     -- row filter BEFORE GROUP BY
GROUP BY customer_id
HAVING COUNT(*) > 5;           -- group filter AFTER
```

**Rule:** If the filter is on a raw column, use `WHERE`. If it's on an aggregate (`COUNT`, `SUM`, etc.), use `HAVING`.

### Window functions — three patterns

**Why:** GROUP BY collapses rows. Window functions keep every row and add an aggregate context alongside.

```sql
-- 1. Rank within a group
SELECT
  customer_id,
  amount,
  RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rank_in_customer
FROM orders;

-- 2. Running total
SELECT
  order_date,
  amount,
  SUM(amount) OVER (ORDER BY order_date) AS running_total
FROM orders;

-- 3. Percentile / nth_value
SELECT
  customer_id,
  amount,
  NTILE(4) OVER (ORDER BY amount) AS quartile
FROM orders;
```

**Gotcha:** Window functions are evaluated after `WHERE`/`GROUP BY`. You can't filter on the window result in the same query — wrap in a CTE or subquery.

---

## Part V — Performance playbook

### Reading `EXPLAIN ANALYZE`

**Why:** Every optimization decision should start here. Guessing wastes hours.

```sql
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 5 AND status = 'pending';
```

**What to look for:**

| Output | Meaning | Action |
|---|---|---|
| `Seq Scan` | Reading every row | Bad on tables > 10k rows → add an index |
| `Index Scan` | Using an index to find rows | Good |
| `Index Only Scan` | Answer found entirely in the index | Best — consider a covering index if not already |
| `Bitmap Heap Scan` | Index used to narrow, then table read | OK — usually fine |
| `Nested Loop` | Small outer set JOINed to indexed inner | Good for small joins |
| `Hash Join` | One side hashed in memory | Good when join keys are unindexed but sets are small |
| `Merge Join` | Both sides sorted, merged | Good on large, already-sorted inputs |
| `cost=X..Y` | Optimizer's estimate | Higher = worse; compare before/after |
| `actual time=X..Y` | Real time in ms | The number that matters |
| `rows=X vs actual rows=Y` | Plan estimate vs reality | Large mismatch = stale statistics → `ANALYZE <table>` |

### Composite index leftmost rule

**Why:** PostgreSQL can only use a composite index if the leftmost column(s) appear in the `WHERE` clause.

```sql
CREATE INDEX idx_orders_status_date ON orders(status, order_date);

-- Uses the index
WHERE status = 'pending';
WHERE status = 'pending' AND order_date > '2024-01-01';

-- Does NOT use the index (status is leftmost; order_date alone doesn't help)
WHERE order_date > '2024-01-01';
```

**Design rule:** Order composite-index columns by selectivity — the most-filtered column first.

### Partial + covering indexes

**Partial** — index a subset for queries that always target that subset:

```sql
-- 95% of queries on orders target 'pending' — index only those
CREATE INDEX idx_orders_pending_created ON orders(created_at)
  WHERE status = 'pending';
```

**Covering** — include all columns a query needs so the table is never touched:

```sql
-- Query: SELECT email, full_name FROM users WHERE id = ?
CREATE INDEX idx_users_id_cover ON users(id) INCLUDE (email, full_name);
-- EXPLAIN will show "Index Only Scan" — fastest possible read
```

### N+1 detection and fix (Prisma)

**Why:** The #1 real-world performance killer. One page load → one parent query + N child queries.

```ts
// BAD — Prisma's default lazy loading (or similar app patterns)
const orders = await prisma.order.findMany({ where: { customer_id: 5 } });
for (const order of orders) {
  const items = await prisma.orderItem.findMany({ where: { order_id: order.id } });  // N queries
}

// GOOD — one query with include
const orders = await prisma.order.findMany({
  where: { customer_id: 5 },
  include: { items: true },        // single joined query
});
```

**How to detect:** enable Prisma query logging in dev:

```ts
new PrismaClient({ log: ["query"] });
// Watch for many similar queries in a tight loop on a single request
```

### Anti-patterns — the big four

**Why:** Each of these quietly defeats your indexes.

```sql
-- 1. Function on indexed column — index NOT used
WHERE EXTRACT(YEAR FROM order_date) = 2024       -- BAD
WHERE order_date >= '2024-01-01'
  AND order_date <  '2025-01-01'                 -- GOOD

-- 2. Implicit type conversion — index MAY not be used
WHERE order_id = '12345'    -- BAD (order_id is INT, '12345' is TEXT)
WHERE order_id = 12345      -- GOOD

-- 3. OFFSET past page ~100 — linear scan every time
ORDER BY id LIMIT 20 OFFSET 10000      -- BAD
WHERE id > :last_seen_id ORDER BY id LIMIT 20   -- GOOD (keyset)

-- 4. N+1 in application code — see above
```

---

## Part VI — Multi-tenant + RBAC

### The `company_id` 4-point checklist

**Why:** In a shared-DB multi-tenant system, one missing `WHERE company_id = ?` leaks everyone's data. Paranoia is the baseline.

1. **Every table** (except truly global lookup tables) has a `company_id` column, NOT NULL, indexed.
2. **Every query** includes `company_id` in the `WHERE` clause.
3. **Every composite index** has `company_id` as the **leftmost** column.
4. **Every API endpoint** derives `company_id` from the authenticated session (JWT claim, not request body).

```prisma
model Order {
  id          BigInt @id @default(autoincrement())
  company_id  BigInt                                    // 1. required
  customer_id BigInt
  // ...

  @@index([company_id, customer_id])                    // 3. leftmost
  @@index([company_id, status])
  @@map("orders")
}
```

```ts
// 2. every query
const orders = await prisma.order.findMany({
  where: {
    company_id: req.user.company_id,                   // 4. from auth, NOT req.body
    deleted_at: null,
  },
});
```

**This boilerplate is single-tenant by default** (no `company_id`). Add the column + checklist the moment you introduce the second tenant.

### Custom fields — JSONB first, EAV last

**Why:** Users always ask for "one more field we can add ourselves." JSONB handles 90% of it without a schema change.

**JSONB approach (recommended):**

```prisma
model Customer {
  id             BigInt @id @default(autoincrement())
  company_id     BigInt
  custom_fields  Json   @default("{}")       // JSONB

  @@index([company_id])
  @@map("customers")
}
```

```ts
// Write
await prisma.customer.update({
  where: { id },
  data: { custom_fields: { gst_number: "27AAAAA0000A1Z5", priority: "high" } },
});

// Read — Prisma JSON filter
await prisma.customer.findMany({
  where: {
    company_id,
    custom_fields: { path: ["priority"], equals: "high" },
  },
});
```

```sql
-- Optional: GIN index for @> queries across many keys
CREATE INDEX idx_customers_custom ON customers USING GIN (custom_fields);
```

**EAV (Entity-Attribute-Value)** — use only when you need per-field type validation, required flags, and reporting across custom fields at scale. Reach for this last; it's complex to query and slow to report on.

### RBAC — this boilerplate's exact model

**Why:** Centralize permissions in the DB, let app code check them, invalidate cache on mutation.

Already implemented in this boilerplate:

```
User ↔ UserRole ↔ Role ↔ RolePermission ↔ Permission
```

- Add a permission: `.claude/skills/add-permission` (the skill covers constant, seed, middleware wiring, cache invalidation).
- Permissions live in `src/constant/permissions.constant.ts` as `P.USER_READ`, `P.USER_WRITE`, etc.
- Middleware: `requirePermission(P.X)` from `@app/middleware` — resolves the user's set from `user_roles → role_permissions → permission.key`, cached in Redis for 60s.
- Invalidate on any write that mutates `user_roles` or `role_permissions`:

```ts
import { delKey } from "@app/db/redis";
import { userPermissionsKey } from "@app/utils/cacheKeys";

await delKey(userPermissionsKey(userId));
```

See `src/services/user.service.ts::update()` for the canonical pattern.

---

## Part VII — Production

### Safe migration patterns

**Why:** A wrong migration takes down production. These three patterns cover the common risky cases without downtime.

**Add a column** (3-step, zero downtime):

```sql
-- Step 1: add as nullable (instant)
ALTER TABLE customers ADD COLUMN loyalty_points INT;

-- Step 2: backfill in batches (no table lock)
UPDATE customers SET loyalty_points = 0
WHERE loyalty_points IS NULL
  AND id BETWEEN 1 AND 10000;  -- repeat in batches

-- Step 3: add NOT NULL once backfilled
ALTER TABLE customers ALTER COLUMN loyalty_points SET NOT NULL;
```

**Drop a column** (deferred — never in one deploy):

```
Deploy N:    stop writing to / reading from the column (application change only)
Deploy N+1:  ALTER TABLE customers DROP COLUMN old_field;  (once you're sure nothing uses it)
```

**Create an index without locking writes** — the single most important PSQL keyword in production:

```sql
CREATE INDEX CONCURRENTLY idx_orders_status_date ON orders(status, order_date);
```

**Notes:** `CONCURRENTLY` takes longer (it does two passes) but doesn't block writes. **Cannot run inside a transaction** — make it its own migration file. If the command fails partway, you'll have an `INVALID` index left behind — drop it with `DROP INDEX CONCURRENTLY idx_name;` and retry.

### Audit trigger pattern (ready to paste)

**Why:** "Who changed this row, from what, to what, when?" is a question every compliance review asks.

```sql
-- 1. Audit table
CREATE TABLE audit_log (
  id         BIGSERIAL PRIMARY KEY,
  table_name TEXT      NOT NULL,
  record_id  TEXT      NOT NULL,
  action     TEXT      NOT NULL CHECK (action IN ('INSERT','UPDATE','DELETE')),
  old_data   JSONB,
  new_data   JSONB,
  changed_by BIGINT,
  changed_at TIMESTAMPTZ DEFAULT NOW(),
  ip_address INET
);
CREATE INDEX idx_audit_table_record ON audit_log(table_name, record_id);
CREATE INDEX idx_audit_changed_at   ON audit_log(changed_at);

-- 2. Generic trigger function
CREATE OR REPLACE FUNCTION log_audit_changes()
RETURNS TRIGGER AS $$
BEGIN
  INSERT INTO audit_log (table_name, record_id, action, old_data, new_data)
  VALUES (
    TG_TABLE_NAME,
    COALESCE((NEW->>'id')::text, (OLD->>'id')::text),
    TG_OP,
    CASE WHEN TG_OP = 'INSERT' THEN NULL ELSE row_to_json(OLD)::jsonb END,
    CASE WHEN TG_OP = 'DELETE' THEN NULL ELSE row_to_json(NEW)::jsonb END
  );
  RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;

-- 3. Attach to any table you want audited
CREATE TRIGGER audit_orders
  AFTER INSERT OR UPDATE OR DELETE ON orders
  FOR EACH ROW EXECUTE FUNCTION log_audit_changes();
```

**Who changed it?** The trigger doesn't know `req.user.id`. Two options:
- App code sets `SET LOCAL app.current_user_id = 42;` at transaction start; the trigger reads via `current_setting('app.current_user_id', true)`.
- Audit is for the *system*; pair with application-level logging for "who."

### Backup — the 3-2-1 rule

**Why:** Every production system has an incident eventually. Backups determine if your business survives.

- **3** copies of the data (prod + 2 backups)
- **2** different storage types (disk + cloud, or local + S3)
- **1** offsite (not in the same AZ / region as prod)

**Logical backup** (good for smaller DBs, portable, restore piece-by-piece):

```bash
pg_dump -U postgres -d mydb --format=custom --compress=9 --file=backup.dump

# Restore
pg_restore -U postgres -d mydb_new backup.dump
```

**Physical backup + WAL archiving** (production-scale, point-in-time recovery): use `pg_basebackup` + WAL archive. Managed Postgres (RDS, Cloud SQL, Neon) typically handles this for you.

**The rule that matters most:** **restore to a test server monthly and verify the data.** A backup you've never tested is hope, not engineering.

### Connection pooling — PgBouncer

**Why:** Opening a Postgres connection costs ~50–100 ms. A busy app makes thousands of reqs/sec → you'll exhaust the DB's `max_connections` just on handshakes.

**Rule of thumb:** Put PgBouncer (or equivalent) in front of Postgres in any production environment with > 1 API server.

**Pool sizing formula:**

```
pool_size_per_app_server = (postgres.max_connections * 0.8) / number_of_app_servers
```

Example: `max_connections = 200`, 4 API servers → each app server pool size = 40.

**Prisma + PgBouncer:** Use transaction-mode pooling and append `?pgbouncer=true&connection_limit=1` to `DATABASE_URL` if using Prisma in serverless. For long-lived servers, session-mode works too.

### Read replicas + replication lag

**Why:** Scales read capacity; offloads reporting/analytics from the primary.

**Pattern:**
- Primary handles all writes.
- One or more replicas continuously receive WAL and stay ~milliseconds behind.
- App code routes reads to replicas, writes to primary.

**Lag gotchas:**
- "Read your own writes" fails if you write to primary then immediately read from replica. Either read from primary after a write, or use sticky routing.
- Long-running transactions on the replica can delay replication (vacuum issues). Keep replica-side queries short.

**When NOT to use:** small systems, or workloads where replica lag breaks UX (live balances, real-time inventory after a sale).

---

## Part VIII — Decision trees

### Normalize vs JSONB?

```
Will you filter / sort / JOIN on this field in > 2 places?
├─ YES → Normalize it out to its own column (or its own table if it has attributes)
└─ NO  → Does it vary by tenant / record type?
         ├─ YES → JSONB
         └─ NO  → Normalize it (it's a first-class attribute)
```

### Should I index this column?

```
Is the table > 10k rows?
├─ NO  → Don't bother. Full scans are cheap on small tables.
└─ YES → Does a query filter, sort, or JOIN on this column?
         ├─ NO  → Don't index.
         └─ YES → Is the column highly selective (many distinct values)?
                  ├─ YES → Create a B-Tree index.
                  └─ NO  → Are you always filtering with another column alongside?
                           ├─ YES → Composite index, selective column first.
                           └─ NO  → Partial index on the subset you actually query.
```

### Partition this table?

```
Will the table exceed 50M rows in 2 years?
├─ NO  → Don't partition. Index properly.
└─ YES → Do you always filter by a natural key (date, tenant)?
         ├─ NO  → Partitioning won't help — fix indexes instead.
         └─ YES → Range-partition (dates) or list-partition (tenants).
                  Can you drop old partitions for retention?
                  ├─ YES → Perfect use case.
                  └─ NO  → Still a good fit for query pruning.
```

### Regular vs materialized view?

```
Is the underlying query fast already (< 100ms)?
├─ YES → Regular view (abstraction for readability / security).
└─ NO  → Can the results be a few minutes stale?
         ├─ NO  → Fix the query with indexes or denormalization.
         └─ YES → Materialized view + scheduled `REFRESH CONCURRENTLY`.
```

### Which isolation level?

```
Does this transaction involve money / inventory / double-spend risk?
├─ YES → SERIALIZABLE (plan for retry on serialization errors).
└─ NO  → Does it do a multi-step read that must see one consistent snapshot?
         ├─ YES → REPEATABLE READ.
         └─ NO  → READ COMMITTED (default — use this).
```

---

## Cross-references

- **Prisma conventions** → [CLAUDE.md §5](../CLAUDE.md)
- **Timezone rules** → [CLAUDE.md §9](../CLAUDE.md) + `.claude/skills/timezone-safe-query`
- **Adding a permission** → `.claude/skills/add-permission`
- **Migration how-to** → `.claude/skills/create-migration`
- **Query optimization step-by-step** → `.claude/skills/optimize-prisma-query`
- **Redis caching patterns** → `.claude/skills/redis-cache-pattern`
- **System architect review** → `.claude/agents/system-architect-agent` (invoke for schema/architecture review grounded in this handbook)

---

*This handbook is the code-time reference. For the "why it matters" philosophy and long-form explanations, pair it with a team discussion — it's deliberately terse.*