What percentage of engineering documentation is actually accurate?

In audits across 12 companies, 40-60% of docs were outdated. 70%+ of engineers reported finding incorrect information monthly. 90%+ of 'comprehensive' architecture docs hadn't been updated in 6+ months.

What are the only four documentation types engineers actually use?

Architecture Decision Records (ADRs), runbooks, onboarding guides, and API contracts. Everything else has a half-life of roughly 6 weeks before it's outdated and actively misleading.

Why are wrong docs worse than missing docs?

When an engineer follows outdated instructions and breaks something, they lose trust in all documentation. They ask a colleague instead, defeating the purpose of writing docs at all.

Documentation That Engineers Actually Read

TL;DR

Most engineering documentation is written once and never read. Not because engineers don't value docs... because the docs don't match how engineers actually look for information. Engineers don't read documentation top to bottom like a textbook. They search for answers to specific questions at specific moments: "How do I deploy?" "What does this error mean?" "Why was this architecture decision made?" The four documentation types that engineers actually use: decision records (ADRs), runbooks, onboarding guides, and API contracts. Everything else... comprehensive architecture overviews, process documents, team wikis... has a half-life of about 6 weeks before it's outdated and actively misleading. Write less documentation, but keep what you write accurate.

Part of the Engineering Leadership: Founder to CTO ... a comprehensive guide to scaling engineering teams and practices.

Why Documentation Fails

I've audited engineering documentation at 12 companies in the past 2 years. These numbers are anecdotal ... drawn from my advisory work, not a formal study ... but the pattern is consistent:

40-60% of docs were outdated (didn't match current codebase or processes)
70%+ of engineers reported finding incorrect information in docs at least monthly
90%+ of "comprehensive" architecture documents hadn't been updated in 6+ months

The problem isn't motivation. Every team I've worked with wants good documentation. The problem is that most documentation strategies optimize for comprehensiveness instead of accuracy... and comprehensive docs that are wrong are worse than no docs at all.

Wrong documentation costs more than missing documentation. When an engineer follows outdated instructions and breaks something, they lose trust in all documentation. From that point forward, they ask a colleague instead of reading the docs... which defeats the entire purpose.

The Four Types That Work

Type 1: Architecture Decision Records (ADRs)

ADRs capture the why behind technical decisions. They're the single most valuable documentation type because they answer a question that source code never answers: "Why did we choose this approach over the alternatives?"


# ADR-007: Use PostgreSQL Row-Level Security for Tenant Isolation

## Status: Accepted (2026-01-15)

## Context

Our SaaS application needs to isolate tenant data. We evaluated three approaches:

1. Separate databases per tenant
2. Shared database with application-level filtering
3. Shared database with PostgreSQL Row-Level Security (RLS)

## Decision

We chose PostgreSQL RLS (option 3).

## Rationale

- **Separate databases** (option 1) provides strong isolation but makes schema migrations
  O(N) where N = number of tenants. At 500+ tenants, migrations take 30+ minutes.
- **Application filtering** (option 2) requires every query to include a WHERE clause.
  A single missed filter exposes cross-tenant data. This has caused data breaches at
  companies I've advised.
- **RLS** (option 3) enforces isolation at the database layer. A missing WHERE clause
  returns zero rows instead of another tenant's data. Fail-safe by default.

## Trade-offs

- RLS adds 2-5% query overhead (measured on our workload)
- Requires SET app.tenant_id on every connection (handled via Prisma middleware)
- Debugging is harder... queries return empty results instead of errors when tenant context is wrong

## Consequences

- All database queries automatically filtered by tenant
- New engineers can't accidentally write cross-tenant queries
- Monitoring needed for the Prisma middleware setting tenant context

Why ADRs work: They're written once, at the moment of decision, when context is fresh. They don't need updating because the decision itself doesn't change... even if the implementation evolves.

Maintenance cost: Near zero. ADRs are immutable records. If a decision is reversed, write a new ADR that supersedes the old one.

Type 2: Runbooks

Runbooks are step-by-step procedures for operational tasks. They answer: "How do I do this specific thing right now?"


# Runbook: Deploy Hotfix to Production

## When to Use

Production issue requiring an immediate fix that can't wait for the normal release cycle.

## Prerequisites

- [ ] GitHub access to main branch
- [ ] Cloudflare dashboard access (for rollback if needed)

## Steps

1. Create hotfix branch from main:
   ```bash
   git checkout main && git pull && git checkout -b hotfix/description
   ```

Apply fix, commit, push:


git add <files> && git commit -m "fix: description" && git push -u origin hotfix/description

Create PR targeting main. Add hotfix label.
Get one approval (skip normal two-reviewer requirement for hotfixes).
Merge to main. GitHub Actions deploys automatically.

Verify:


curl -s https://alexmayhew.dev/api/health | jq

If deployment fails, rollback via Cloudflare Dashboard: Workers & Pages > alexmayhew-dev > Deployments > Rollback

Escalation

If steps above don't resolve the issue, contact [on-call IC] via PagerDuty.



**Why runbooks work:** They're designed for the exact moment when an engineer needs them... under stress, with limited context. Every step is explicit. No assumed knowledge.

**Maintenance strategy:** Every time a runbook is used and a step is wrong, fix the runbook immediately. Treat runbook inaccuracy as a bug with the same priority as a code bug.

### Type 3: Onboarding Guides

New engineer onboarding documentation has the highest ROI of any documentation type. Every hour invested in onboarding docs saves that hour multiplied by every future engineer who joins.

The effective onboarding guide isn't a wiki page... it's a checklist with day-by-day tasks:

```markdown
# Engineering Onboarding: Week 1

## Day 1: Environment Setup

### Morning
- [ ] Clone the monorepo: `git clone git@github.com:org/repo.git`
- [ ] Install dependencies: `npm install` (requires Node 20+)
- [ ] Set up local database: `docker compose up -d`
- [ ] Run the dev server: `npm run dev` ... verify http://localhost:3001 loads
- [ ] Run tests: `npm test` ... all should pass

### Afternoon
- [ ] Read ADR-001 through ADR-010 (architecture context)
- [ ] Read the incident postmortem archive (last 5 postmortems)
- [ ] Set up your IDE: TypeScript strict mode, ESLint plugin, Prettier

## Day 2: First Contribution

- [ ] Pick a "good first issue" from the backlog
- [ ] Open a PR. Your onboarding buddy will review within 2 hours.
- [ ] Submit your first PR by end of day (any size is fine)

## Day 3-5: Domain Context

- [ ] Shadow the on-call engineer for 2 hours
- [ ] Read the runbook for the service area you're joining
- [ ] Meet your team lead for architecture walkthrough (30 min)
- [ ] Complete second PR with a slightly larger scope

Why onboarding guides work: They remove decision-making from the new hire's first week. The checklist tells them exactly what to do, in what order, with expected outcomes at each step.

Maintenance strategy: Every new engineer adds one improvement to the onboarding guide as their second week task. The guide improves with every hire.

Type 4: API Contracts

API documentation that's generated from code stays accurate because it changes when the code changes.


// OpenAPI spec generated from code (e.g., tRPC, Zod, or decorators)
// The contract IS the code ... there's no separate document to maintain

const createOrderSchema = z.object({
	customerId: z.string().uuid().describe("The customer placing the order"),
	items: z
		.array(
			z.object({
				productId: z.string().uuid(),
				quantity: z.number().int().positive(),
			})
		)
		.min(1)
		.describe("Order line items (at least one required)"),
	currency: z.enum(["USD", "EUR", "GBP"]).default("USD"),
});

Why API contracts work: They're enforced by the compiler or runtime. If the contract says the field is required and the code doesn't send it, the build breaks.

Maintenance cost: Zero. The documentation is the code.

What to Stop Documenting

These documentation types consistently fail:

Type	Why It Fails	Alternative
Comprehensive architecture overview	Outdated within 2 months	ADRs for decisions, code for current state
Process documents	Never read, frequently wrong	Automated checks (CI) that enforce process
Meeting notes	Written for the writer, not the reader	Decision records (ADRs) for outcomes
"How it works" deep-dives	Outdated when code changes	Well-named code + inline comments
Style guides	Replaced by automated formatters	Prettier/ESLint/gofmt configs

The honest truth: if documentation can be replaced by automation, replace it. A linter rule that enforces a naming convention is more reliable than a style guide that describes it.

Documentation Decay and Prevention

Documentation decays at a predictable rate. The half-life depends on the type:

Documentation Type	Half-Life	Decay Cause
API contracts (code-generated)	Infinite	Changes when code changes
ADRs	2-5 years	Decisions rarely change
Runbooks	3-6 months	Tooling and processes evolve
Onboarding guides	2-4 months	Dependencies and setup change
Architecture overviews	4-8 weeks	Code changes faster than docs

Prevention Strategies

1. Docs-as-code. Store documentation in the same repository as the code. PRs that change code should include documentation updates. Code review catches doc drift.

2. Automated freshness checks.


# GitHub Action: flag stale docs
name: Doc Freshness Check
on:
  schedule:
    - cron: "0 9 * * 1" # Every Monday

jobs:
  check-freshness:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Find stale docs
        run: |
          find docs/ -name "*.md" -mtime +90 | while read file; do
            echo "::warning file=$file::This document hasn't been updated in 90+ days. Verify accuracy or archive."
          done

3. Owner assignment. Every document has an owner. When the owner leaves the team, ownership transfers. Unowned docs get archived.

When to Apply This

Your team is growing past 5 engineers and knowledge transfer is becoming a bottleneck
New engineers take more than 2 weeks to become productive
Engineers frequently encounter outdated documentation
On-call engineers spend significant time figuring out procedures during incidents

When NOT to Apply This

Solo developer or two-person team... verbal communication is faster
Prototype or throwaway project... documentation outliving the code is waste
First week of a startup... ship first, document later

Building an engineering culture where documentation actually works? I help teams implement documentation strategies that scale with their growth.

Technical Advisor for Startups ... Engineering culture and process guidance
Next.js Development for SaaS ... Teams that ship with documentation built in
Technical Due Diligence ... Engineering process maturity assessment

Continue Reading

This post is part of the Engineering Leadership: Founder to CTO ... covering hiring, team scaling, technical strategy, and operational excellence.

TL;DR

Part of the Engineering Leadership: Founder to CTO ... a comprehensive guide to scaling engineering teams and practices.

Why Documentation Fails

I've audited engineering documentation at 12 companies in the past 2 years. These numbers are anecdotal ... drawn from my advisory work, not a formal study ... but the pattern is consistent:

40-60% of docs were outdated (didn't match current codebase or processes)
70%+ of engineers reported finding incorrect information in docs at least monthly
90%+ of "comprehensive" architecture documents hadn't been updated in 6+ months

The Four Types That Work

Type 1: Architecture Decision Records (ADRs)


# ADR-007: Use PostgreSQL Row-Level Security for Tenant Isolation

## Status: Accepted (2026-01-15)

## Context

Our SaaS application needs to isolate tenant data. We evaluated three approaches:

1. Separate databases per tenant
2. Shared database with application-level filtering
3. Shared database with PostgreSQL Row-Level Security (RLS)

## Decision

We chose PostgreSQL RLS (option 3).

## Rationale

- **Separate databases** (option 1) provides strong isolation but makes schema migrations
  O(N) where N = number of tenants. At 500+ tenants, migrations take 30+ minutes.
- **Application filtering** (option 2) requires every query to include a WHERE clause.
  A single missed filter exposes cross-tenant data. This has caused data breaches at
  companies I've advised.
- **RLS** (option 3) enforces isolation at the database layer. A missing WHERE clause
  returns zero rows instead of another tenant's data. Fail-safe by default.

## Trade-offs

- RLS adds 2-5% query overhead (measured on our workload)
- Requires SET app.tenant_id on every connection (handled via Prisma middleware)
- Debugging is harder... queries return empty results instead of errors when tenant context is wrong

## Consequences

- All database queries automatically filtered by tenant
- New engineers can't accidentally write cross-tenant queries
- Monitoring needed for the Prisma middleware setting tenant context

Why ADRs work: They're written once, at the moment of decision, when context is fresh. They don't need updating because the decision itself doesn't change... even if the implementation evolves.

Maintenance cost: Near zero. ADRs are immutable records. If a decision is reversed, write a new ADR that supersedes the old one.

Type 2: Runbooks

Runbooks are step-by-step procedures for operational tasks. They answer: "How do I do this specific thing right now?"


# Runbook: Deploy Hotfix to Production

## When to Use

Production issue requiring an immediate fix that can't wait for the normal release cycle.

## Prerequisites

- [ ] GitHub access to main branch
- [ ] Cloudflare dashboard access (for rollback if needed)

## Steps

1. Create hotfix branch from main:
   ```bash
   git checkout main && git pull && git checkout -b hotfix/description
   ```

Apply fix, commit, push:


git add <files> && git commit -m "fix: description" && git push -u origin hotfix/description

Create PR targeting main. Add hotfix label.
Get one approval (skip normal two-reviewer requirement for hotfixes).
Merge to main. GitHub Actions deploys automatically.

Verify:


curl -s https://alexmayhew.dev/api/health | jq

If deployment fails, rollback via Cloudflare Dashboard: Workers & Pages > alexmayhew-dev > Deployments > Rollback

Escalation

If steps above don't resolve the issue, contact [on-call IC] via PagerDuty.



**Why runbooks work:** They're designed for the exact moment when an engineer needs them... under stress, with limited context. Every step is explicit. No assumed knowledge.

**Maintenance strategy:** Every time a runbook is used and a step is wrong, fix the runbook immediately. Treat runbook inaccuracy as a bug with the same priority as a code bug.

### Type 3: Onboarding Guides

New engineer onboarding documentation has the highest ROI of any documentation type. Every hour invested in onboarding docs saves that hour multiplied by every future engineer who joins.

The effective onboarding guide isn't a wiki page... it's a checklist with day-by-day tasks:

```markdown
# Engineering Onboarding: Week 1

## Day 1: Environment Setup

### Morning
- [ ] Clone the monorepo: `git clone git@github.com:org/repo.git`
- [ ] Install dependencies: `npm install` (requires Node 20+)
- [ ] Set up local database: `docker compose up -d`
- [ ] Run the dev server: `npm run dev` ... verify http://localhost:3001 loads
- [ ] Run tests: `npm test` ... all should pass

### Afternoon
- [ ] Read ADR-001 through ADR-010 (architecture context)
- [ ] Read the incident postmortem archive (last 5 postmortems)
- [ ] Set up your IDE: TypeScript strict mode, ESLint plugin, Prettier

## Day 2: First Contribution

- [ ] Pick a "good first issue" from the backlog
- [ ] Open a PR. Your onboarding buddy will review within 2 hours.
- [ ] Submit your first PR by end of day (any size is fine)

## Day 3-5: Domain Context

- [ ] Shadow the on-call engineer for 2 hours
- [ ] Read the runbook for the service area you're joining
- [ ] Meet your team lead for architecture walkthrough (30 min)
- [ ] Complete second PR with a slightly larger scope

Why onboarding guides work: They remove decision-making from the new hire's first week. The checklist tells them exactly what to do, in what order, with expected outcomes at each step.

Maintenance strategy: Every new engineer adds one improvement to the onboarding guide as their second week task. The guide improves with every hire.

Type 4: API Contracts

API documentation that's generated from code stays accurate because it changes when the code changes.


// OpenAPI spec generated from code (e.g., tRPC, Zod, or decorators)
// The contract IS the code ... there's no separate document to maintain

const createOrderSchema = z.object({
	customerId: z.string().uuid().describe("The customer placing the order"),
	items: z
		.array(
			z.object({
				productId: z.string().uuid(),
				quantity: z.number().int().positive(),
			})
		)
		.min(1)
		.describe("Order line items (at least one required)"),
	currency: z.enum(["USD", "EUR", "GBP"]).default("USD"),
});

Why API contracts work: They're enforced by the compiler or runtime. If the contract says the field is required and the code doesn't send it, the build breaks.

Maintenance cost: Zero. The documentation is the code.

What to Stop Documenting

These documentation types consistently fail:

Type	Why It Fails	Alternative
Comprehensive architecture overview	Outdated within 2 months	ADRs for decisions, code for current state
Process documents	Never read, frequently wrong	Automated checks (CI) that enforce process
Meeting notes	Written for the writer, not the reader	Decision records (ADRs) for outcomes
"How it works" deep-dives	Outdated when code changes	Well-named code + inline comments
Style guides	Replaced by automated formatters	Prettier/ESLint/gofmt configs

The honest truth: if documentation can be replaced by automation, replace it. A linter rule that enforces a naming convention is more reliable than a style guide that describes it.

Documentation Decay and Prevention

Documentation decays at a predictable rate. The half-life depends on the type:

Documentation Type	Half-Life	Decay Cause
API contracts (code-generated)	Infinite	Changes when code changes
ADRs	2-5 years	Decisions rarely change
Runbooks	3-6 months	Tooling and processes evolve
Onboarding guides	2-4 months	Dependencies and setup change
Architecture overviews	4-8 weeks	Code changes faster than docs

Prevention Strategies

1. Docs-as-code. Store documentation in the same repository as the code. PRs that change code should include documentation updates. Code review catches doc drift.

2. Automated freshness checks.


# GitHub Action: flag stale docs
name: Doc Freshness Check
on:
  schedule:
    - cron: "0 9 * * 1" # Every Monday

jobs:
  check-freshness:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Find stale docs
        run: |
          find docs/ -name "*.md" -mtime +90 | while read file; do
            echo "::warning file=$file::This document hasn't been updated in 90+ days. Verify accuracy or archive."
          done

3. Owner assignment. Every document has an owner. When the owner leaves the team, ownership transfers. Unowned docs get archived.

When to Apply This

Your team is growing past 5 engineers and knowledge transfer is becoming a bottleneck
New engineers take more than 2 weeks to become productive
Engineers frequently encounter outdated documentation
On-call engineers spend significant time figuring out procedures during incidents

When NOT to Apply This

Solo developer or two-person team... verbal communication is faster
Prototype or throwaway project... documentation outliving the code is waste
First week of a startup... ship first, document later

Building an engineering culture where documentation actually works? I help teams implement documentation strategies that scale with their growth.

Technical Advisor for Startups ... Engineering culture and process guidance
Next.js Development for SaaS ... Teams that ship with documentation built in
Technical Due Diligence ... Engineering process maturity assessment

Continue Reading

This post is part of the Engineering Leadership: Founder to CTO ... covering hiring, team scaling, technical strategy, and operational excellence.

Documentation That Engineers Actually Read

TL;DR

Why Documentation Fails

The Four Types That Work

Type 1: Architecture Decision Records (ADRs)

Type 2: Runbooks

Escalation

Type 4: API Contracts

What to Stop Documenting

Documentation Decay and Prevention

Prevention Strategies

When to Apply This

When NOT to Apply This

Continue Reading

More in This Series

Related Insights

Get insights like this weekly

Documentation That Engineers Actually Read

TL;DR

Why Documentation Fails

The Four Types That Work

Type 1: Architecture Decision Records (ADRs)

Type 2: Runbooks

Escalation

Type 4: API Contracts

What to Stop Documenting

Documentation Decay and Prevention

Prevention Strategies

When to Apply This

When NOT to Apply This

Continue Reading

More in This Series

Related Insights

Get insights like this weekly

●TL;DR

●Why Documentation Fails

●The Four Types That Work

Type 1: Architecture Decision Records (ADRs)

Type 2: Runbooks

●Escalation

Type 4: API Contracts

●What to Stop Documenting

●Documentation Decay and Prevention

Prevention Strategies

●When to Apply This

●When NOT to Apply This

●Continue Reading

More in This Series

Related Guides

●Related Insights

Get insights like this weekly

●TL;DR

●Why Documentation Fails

●The Four Types That Work

Type 1: Architecture Decision Records (ADRs)

Type 2: Runbooks

●Escalation

Type 4: API Contracts

●What to Stop Documenting

●Documentation Decay and Prevention

Prevention Strategies

●When to Apply This

●When NOT to Apply This

●Continue Reading

More in This Series

Related Guides

●Related Insights

Get insights like this weekly

TL;DR

Why Documentation Fails

The Four Types That Work

Escalation

What to Stop Documenting

Documentation Decay and Prevention

When to Apply This

When NOT to Apply This

Continue Reading

Related Insights

TL;DR

Why Documentation Fails

The Four Types That Work

Escalation

What to Stop Documenting

Documentation Decay and Prevention

When to Apply This

When NOT to Apply This

Continue Reading

Related Insights