TL;DR
Billing is the only part of your SaaS where a bug directly costs you money. An incorrect proration calculation, a missed webhook, a race condition in subscription upgrades: each one leaks revenue. I've helped teams recover from billing bugs that cost $10K-100K before detection. The architecture that prevents this has four parts: Stripe as the source of truth for subscription state (never your database), a webhook-driven state machine that handles every lifecycle event, idempotent payment processing that survives duplicate webhooks, and a reconciliation job that catches drift between Stripe and your application. The most common mistake is polling Stripe's API to check subscription status instead of reacting to webhooks. Polling misses state transitions, creates race conditions, and burns your rate limit.
Part of the SaaS Architecture Decision Framework, a comprehensive guide to architecture decisions from MVP to scale.
Stripe as Source of Truth
The first architectural decision: where does subscription state live? The answer is Stripe. Your database stores a cache of Stripe state for fast reads, but Stripe is authoritative.
Why this matters:
| Approach | Risk |
|---|---|
| Your DB is authoritative | Drift between your DB and Stripe. Customer pays but your app doesn't reflect the plan. Or worse: customer's payment fails but your app still grants access. |
| Stripe is authoritative | Your DB might be stale by seconds, but Stripe's state is always correct. Webhooks keep you in sync. |
-- Database schema: cache of Stripe state
CREATE TABLE subscriptions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id),
stripe_subscription_id TEXT UNIQUE NOT NULL,
stripe_customer_id TEXT NOT NULL,
status TEXT NOT NULL, -- mirrors Stripe's subscription status; allowed values are enforced by valid_status below
plan_id TEXT NOT NULL,
current_period_start TIMESTAMPTZ NOT NULL,
current_period_end TIMESTAMPTZ NOT NULL,
cancel_at_period_end BOOLEAN DEFAULT false,
synced_at TIMESTAMPTZ DEFAULT NOW(),
CONSTRAINT valid_status CHECK (status IN (
'active', 'past_due', 'canceled', 'trialing',
'incomplete', 'incomplete_expired', 'unpaid', 'paused'
))
);
The synced_at column is critical. It tells you when this row was last updated from Stripe. Any query that needs authoritative state should check synced_at and re-sync if it's stale.
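A minimal sketch of that staleness check, assuming a five-minute threshold and a caller-supplied refetch function. The names `CachedSubscription`, `isStale`, `readAuthoritative`, and `MAX_CACHE_AGE_MS` are hypothetical and not part of the schema above.

```typescript
// Hypothetical shape of a cached row; field names are illustrative.
interface CachedSubscription {
  stripeSubscriptionId: string;
  status: string;
  syncedAt: Date;
}

// Assumed threshold; tune it to how quickly your webhooks normally land.
const MAX_CACHE_AGE_MS = 5 * 60 * 1000;

// True when the row was last synced more than maxAgeMs before `now`.
function isStale(
  row: CachedSubscription,
  now: Date = new Date(),
  maxAgeMs: number = MAX_CACHE_AGE_MS
): boolean {
  return now.getTime() - row.syncedAt.getTime() > maxAgeMs;
}

// Read-through helper: return the cached row when fresh, otherwise ask the
// caller-supplied refetch function (for example one that calls Stripe and
// re-syncs the row) for a new copy.
async function readAuthoritative(
  row: CachedSubscription,
  refetch: (stripeSubscriptionId: string) => Promise<CachedSubscription>,
  now: Date = new Date()
): Promise<CachedSubscription> {
  return isStale(row, now) ? refetch(row.stripeSubscriptionId) : row;
}
```

Reserve this for decisions that really need authoritative state, such as granting a paid upgrade. Ordinary reads can keep using the cache.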
The Webhook State Machine
Stripe communicates subscription lifecycle events through webhooks. Your application must handle every event type correctly and idempotently, because Stripe may send the same event multiple times.
Event Flow
customer.subscription.created
│
▼
customer.subscription.updated (status: trialing)
│
├── Trial converts → invoice.paid → customer.subscription.updated (status: active)
│
└── Trial expires → customer.subscription.updated (status: past_due)
│
├── Payment retried → invoice.paid → status: active
│
└── All retries fail → customer.subscription.deleted (if your retry settings cancel the subscription)
Webhook Handler
import Stripe from "stripe";
// `db` is assumed to be a PostgreSQL client already in scope (for example a pg Pool)
const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
export async function handleWebhook(request: Request): Promise<Response> {
const body = await request.text();
const signature = request.headers.get("stripe-signature");
let event: Stripe.Event;
try {
event = stripe.webhooks.constructEvent(body, signature!, process.env.STRIPE_WEBHOOK_SECRET!);
} catch (err) {
return new Response("Invalid signature", { status: 400 });
}
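// Idempotency: skip events we've already processed.
// Note: this check and the insert further down are not atomic, so two
// concurrent deliveries of the same event can both pass it.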
const processed = await db.query("SELECT id FROM processed_events WHERE stripe_event_id = $1", [
event.id,
]);
if (processed.rowCount > 0) {
return new Response("Already processed", { status: 200 });
}
try {
await processEvent(event);
// Record successful processing
await db.query(
"INSERT INTO processed_events (stripe_event_id, event_type, processed_at) VALUES ($1, $2, NOW())",
[event.id, event.type]
);
return new Response("OK", { status: 200 });
} catch (err) {
// Return 500 so Stripe retries
console.error("Webhook processing failed:", err);
return new Response("Processing failed", { status: 500 });
}
}
async function processEvent(event: Stripe.Event) {
switch (event.type) {
case "customer.subscription.created":
case "customer.subscription.updated":
await syncSubscription(event.data.object as Stripe.Subscription);
break;
case "customer.subscription.deleted":
await handleCancellation(event.data.object as Stripe.Subscription);
break;
case "invoice.paid":
await handleSuccessfulPayment(event.data.object as Stripe.Invoice);
break;
case "invoice.payment_failed":
await handleFailedPayment(event.data.object as Stripe.Invoice);
break;
case "customer.subscription.trial_will_end":
await sendTrialEndingNotification(event.data.object as Stripe.Subscription);
break;
}
}
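One way to close the gap between checking for a processed event and recording it is to claim the event ID first and release the claim if processing fails. The sketch below uses an in-memory set so it runs on its own. `EventLedger` and `handleOnce` are hypothetical names, and in production the claim would more likely be a single `INSERT ... ON CONFLICT DO NOTHING` against a unique `stripe_event_id` column so the database arbitrates between concurrent deliveries.

```typescript
// In-memory stand-in for the processed_events table (illustration only).
class EventLedger {
  private readonly claimed = new Set<string>();

  // Returns true only for the first caller to claim a given event ID.
  claim(eventId: string): boolean {
    if (this.claimed.has(eventId)) return false;
    this.claimed.add(eventId);
    return true;
  }

  // Undo a claim so that a later retry can process the event again.
  release(eventId: string): void {
    this.claimed.delete(eventId);
  }
}

type HandleResult = "processed" | "duplicate";

// Claim first, then do the work. A duplicate delivery returns early without
// running `work`; a failure releases the claim and rethrows so the caller can
// answer with a 500 and let Stripe retry.
async function handleOnce(
  ledger: EventLedger,
  eventId: string,
  work: () => Promise<void>
): Promise<HandleResult> {
  if (!ledger.claim(eventId)) return "duplicate";
  try {
    await work();
    return "processed";
  } catch (err) {
    ledger.release(eventId);
    throw err;
  }
}
```

With a real database, wrapping the claim and the processing in one transaction avoids a stuck claim if the process dies between the two steps.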
Sync Function
async function syncSubscription(subscription: Stripe.Subscription) {
await db.query(
`
INSERT INTO subscriptions (
stripe_subscription_id, stripe_customer_id, tenant_id,
status, plan_id, current_period_start, current_period_end,
cancel_at_period_end, synced_at
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, NOW())
ON CONFLICT (stripe_subscription_id) DO UPDATE SET
status = EXCLUDED.status,
plan_id = EXCLUDED.plan_id,
current_period_start = EXCLUDED.current_period_start,
current_period_end = EXCLUDED.current_period_end,
cancel_at_period_end = EXCLUDED.cancel_at_period_end,
synced_at = NOW()
`,
[
subscription.id,
subscription.customer,
await getTenantByStripeCustomer(subscription.customer as string),
subscription.status,
subscription.items.data[0].price.id,
new Date(subscription.current_period_start * 1000),
new Date(subscription.current_period_end * 1000),
subscription.cancel_at_period_end,
]
);
// Update tenant's feature flags based on plan
await updateTenantEntitlements(
await getTenantByStripeCustomer(subscription.customer as string),
subscription.items.data[0].price.id,
subscription.status
);
}
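Stripe does not guarantee that events arrive in the order they were generated, so an older `customer.subscription.updated` payload can land after a newer one and overwrite it. A small guard, sketched here under the assumption that you store the `created` timestamp of the last applied event in a hypothetical `last_event_created` column, can skip stale payloads. `shouldApplyEvent` is an illustrative name.

```typescript
// Decide whether an incoming event may overwrite the cached row.
// Both timestamps are Unix seconds, as in Stripe's event.created field.
function shouldApplyEvent(
  lastAppliedCreated: number | null,
  incomingCreated: number
): boolean {
  // Nothing applied yet: always accept.
  if (lastAppliedCreated === null) return true;
  // Accept newer events, and ties, since timestamps have one-second resolution.
  return incomingCreated >= lastAppliedCreated;
}
```

Because ties within the same second are ambiguous, a stricter alternative is to ignore the payload and retrieve the subscription from Stripe before syncing.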
Entitlement Management
The subscription plan determines what features the customer can access. This mapping, from plan to features, is the most business-critical code in your application.
// Plan → entitlements mapping
const PLAN_ENTITLEMENTS: Record<string, Entitlements> = {
price_starter_monthly: {
maxUsers: 5,
maxProjects: 10,
features: ["basic_analytics", "email_support"],
apiRateLimit: 100, // requests per minute
storageGB: 5,
},
price_growth_monthly: {
maxUsers: 25,
maxProjects: 50,
features: ["basic_analytics", "advanced_analytics", "priority_support", "api_access"],
apiRateLimit: 1000,
storageGB: 50,
},
price_enterprise_monthly: {
maxUsers: -1, // unlimited
maxProjects: -1,
features: [
"basic_analytics",
"advanced_analytics",
"priority_support",
"api_access",
"sso",
"audit_log",
"custom_branding",
],
apiRateLimit: 10000,
storageGB: 500,
},
};
// Check entitlement at the feature boundary
async function checkEntitlement(tenantId: string, feature: string): Promise<boolean> {
const subscription = await db.query(
"SELECT plan_id, status FROM subscriptions WHERE tenant_id = $1",
[tenantId]
);
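// Treat trialing like active so trial customers can use their plan's features
const status = subscription.rows[0]?.status;
if (status !== "active" && status !== "trialing") {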
return false;
}
const entitlements = PLAN_ENTITLEMENTS[subscription.rows[0].plan_id];
return entitlements?.features.includes(feature) ?? false;
}
Never hardcode plan checks. if (plan === 'enterprise') scattered through your codebase becomes unmaintainable when you add a new plan. Use an entitlement service that maps plans to capabilities.
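The same idea applies to numeric limits. The following sketch shows one way to check limits and features through a single pair of helpers, including the `-1` convention for unlimited used in the mapping above. `withinLimit`, `hasFeature`, and `UNLIMITED` are hypothetical names, and the `Entitlements` shape here is a trimmed-down assumption.

```typescript
// Trimmed-down entitlement shape for illustration.
interface Entitlements {
  maxUsers: number;
  maxProjects: number;
  features: string[];
}

// Sentinel for "no limit", matching the -1 convention in the plan mapping.
const UNLIMITED = -1;

// True when one more item can be created without exceeding the limit.
function withinLimit(limit: number, currentCount: number): boolean {
  return limit === UNLIMITED || currentCount < limit;
}

// True when the plan exists and includes the feature.
function hasFeature(entitlements: Entitlements | undefined, feature: string): boolean {
  return entitlements !== undefined && entitlements.features.includes(feature);
}
```

Call sites then ask "can this tenant add a user?" rather than "is this tenant on enterprise?", so a new plan only changes the mapping.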
Handling Edge Cases That Leak Revenue
Edge Case 1: Subscription Upgrade Mid-Cycle
Customer upgrades from Starter ($29/mo) to Growth ($99/mo) on day 15 of a 30-day billing cycle. Stripe prorates automatically if configured, but your application needs to grant the new features immediately.
async function upgradeSubscription(
tenantId: string,
newPriceId: string
): Promise<Stripe.Subscription> {
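const local = await getActiveSubscription(tenantId);
// The local row caches no subscription items, so fetch them from Stripe
const current = await stripe.subscriptions.retrieve(local.stripe_subscription_id);
const updated = await stripe.subscriptions.update(current.id, {
items: [
{
id: current.items.data[0].id,
price: newPriceId,
},
],
// Invoice the prorated difference now instead of deferring it to the next invoice
proration_behavior: "always_invoice",
// Reject the update if that immediate payment fails
payment_behavior: "error_if_incomplete",
});
// Don't wait for the webhook; update entitlements immediately.
// The webhook will confirm the state later.
await updateTenantEntitlements(tenantId, newPriceId, "active");
return updated;
}
payment_behavior: 'error_if_incomplete' makes the update fail if the payment it triggers is declined. It only protects you when the upgrade actually attempts a charge: proration_behavior: 'create_prorations' adds the prorated amount to the next invoice, while 'always_invoice' bills it immediately. Without an immediate, enforced charge, a customer with a declined card gets upgraded to a higher plan for free until the next billing cycle.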
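To make the numbers in this example concrete, here is a rough estimate of the prorated charge for the Starter to Growth upgrade. This is a day-based approximation with a hypothetical function name. Stripe computes proration from exact timestamps, so real invoices can differ by small amounts.

```typescript
// Estimate the net charge (in cents) for switching prices mid-cycle.
// Credit for unused time on the old price is subtracted from the cost of the
// remaining time on the new price.
function estimateUpgradeProrationCents(
  oldPriceCents: number,
  newPriceCents: number,
  daysInCycle: number,
  daysUsed: number
): number {
  const remainingFraction = (daysInCycle - daysUsed) / daysInCycle;
  const unusedCredit = Math.round(oldPriceCents * remainingFraction);
  const remainingCharge = Math.round(newPriceCents * remainingFraction);
  return remainingCharge - unusedCredit;
}

// Starter ($29) to Growth ($99) on day 15 of a 30-day cycle:
// credit 1450, charge 4950, net 3500 cents ($35.00).
const upgradeEstimate = estimateUpgradeProrationCents(2900, 9900, 30, 15);
```

An estimate like this is useful for showing the customer a preview and for sanity-checking invoice amounts in tests. It should not replace Stripe's own calculation.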
Edge Case 2: Failed Payment Retry
When a payment fails, Stripe retries it according to the retry settings in your Dashboard. With Smart Retries, Stripe's recommended default is 8 attempts within 2 weeks. During this window, the subscription status is past_due. Should the customer keep access?
The answer depends on your business model:
| Strategy | Behavior | Revenue Impact |
|---|---|---|
| Immediate restriction | Lock features on first failure | Low revenue leakage, higher churn |
| Grace period (7 days) | Full access for 7 days, then restrict | Balanced |
| Full access until canceled | No restriction during retries | Higher revenue leakage, lower churn |
Most B2B SaaS companies use a 7-day grace period:
import { differenceInDays } from "date-fns";
async function checkAccess(tenantId: string): Promise<AccessLevel> {
const sub = await getSubscription(tenantId);
if (sub.status === "active" || sub.status === "trialing") {
return "full";
}
if (sub.status === "past_due") {
const daysPastDue = differenceInDays(new Date(), sub.current_period_end);
if (daysPastDue <= 7) return "full";
if (daysPastDue <= 14) return "read_only";
return "locked";
}
return "locked";
}
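The tiering logic is easier to test as a pure function that takes the clock and the anchor date as arguments. The sketch below assumes you record when the subscription first became past_due, for example from the first invoice.payment_failed event, in a hypothetical `pastDueSince` value. That anchor is an assumption worth checking against your own Stripe data, because current_period_end may already have rolled forward to the next period by the time a renewal payment fails.

```typescript
type AccessLevel = "full" | "read_only" | "locked";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Map subscription status plus time spent past due to an access level.
// Tiers follow the 7-day and 14-day thresholds used in this post.
function accessFor(status: string, pastDueSince: Date | null, now: Date): AccessLevel {
  if (status === "active" || status === "trialing") return "full";
  // Any other status, or past_due with no recorded start, is locked.
  if (status !== "past_due" || pastDueSince === null) return "locked";

  const daysPastDue = Math.floor((now.getTime() - pastDueSince.getTime()) / MS_PER_DAY);
  if (daysPastDue <= 7) return "full";
  if (daysPastDue <= 14) return "read_only";
  return "locked";
}
```

Passing `now` explicitly keeps the function deterministic, which makes the grace-period boundaries straightforward to unit test.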
Edge Case 3: The Reconciliation Gap
Despite webhooks, state can drift. A webhook fails to deliver. Your handler has a bug that partially processes an event. The customer updates their payment method on Stripe's hosted page.
Run a reconciliation job daily:
// Daily reconciliation: compare your DB with Stripe's truth
async function reconcileSubscriptions() {
const localSubs = await db.query(
"SELECT stripe_subscription_id, status, plan_id FROM subscriptions WHERE status != 'canceled'"
);
for (const local of localSubs.rows) {
const stripeSub = await stripe.subscriptions.retrieve(local.stripe_subscription_id);
if (stripeSub.status !== local.status || stripeSub.items.data[0]?.price.id !== local.plan_id) {
console.warn("Subscription drift detected", {
subscriptionId: local.stripe_subscription_id,
localStatus: local.status,
stripeStatus: stripeSub.status,
localPlan: local.plan_id,
stripePlan: stripeSub.items.data[0]?.price.id,
});
await syncSubscription(stripeSub);
}
}
}
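This catches status and plan drift for every subscription your database already knows about. It does not catch subscriptions that exist in Stripe but never reached your database, such as a missed customer.subscription.created event, and it makes one API call per subscription. Schedule it for off-peak hours and alert if drift is detected, because drift indicates a webhook processing bug that needs fixing.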
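A two-way comparison also surfaces subscriptions that are missing on either side. The sketch below is a pure diff over two snapshots with hypothetical type and function names. In practice the Stripe side could come from paging through `stripe.subscriptions.list` with `status: "all"`, which also avoids retrieving subscriptions one by one.

```typescript
// Minimal snapshot of a subscription on either side (illustrative shape).
interface SubscriptionSnapshot {
  id: string;
  status: string;
  priceId: string;
}

type DriftKind = "missing_locally" | "missing_in_stripe" | "mismatch";

interface Drift {
  id: string;
  kind: DriftKind;
}

// Compare Stripe's view with the local cache and report every difference.
function diffSubscriptions(
  stripeSubs: SubscriptionSnapshot[],
  localSubs: SubscriptionSnapshot[]
): Drift[] {
  const localById = new Map<string, SubscriptionSnapshot>();
  for (const local of localSubs) localById.set(local.id, local);

  const drift: Drift[] = [];
  for (const remote of stripeSubs) {
    const local = localById.get(remote.id);
    if (local === undefined) {
      drift.push({ id: remote.id, kind: "missing_locally" });
    } else if (local.status !== remote.status || local.priceId !== remote.priceId) {
      drift.push({ id: remote.id, kind: "mismatch" });
    }
    // Whatever remains in the map afterwards exists only locally.
    localById.delete(remote.id);
  }
  localById.forEach((_local, id) => drift.push({ id, kind: "missing_in_stripe" }));
  return drift;
}
```

Each drift kind suggests a different action: re-sync on `mismatch`, insert on `missing_locally`, and investigate on `missing_in_stripe`.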
Metered Billing and Usage Tracking
For usage-based pricing (API calls, storage, compute time), report usage to Stripe and let Stripe handle the billing math:
// Report usage to Stripe
async function reportUsage(subscriptionItemId: string, quantity: number, timestamp: number) {
await stripe.subscriptionItems.createUsageRecord(subscriptionItemId, {
quantity,
timestamp,
action: "increment", // or 'set' for gauge metrics
});
}
// Batch usage reporting (more efficient)
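// Run once per day: re-running with action "increment" double-counts usage.
async function reportDailyUsage(tenantId: string) {
const usage = await db.query(
`
SELECT COUNT(*) AS api_calls, COALESCE(SUM(storage_bytes), 0) AS storage
FROM usage_events
WHERE tenant_id = $1 AND created_at >= CURRENT_DATE
`,
[tenantId]
);
// The aggregates live on the first result row; node-postgres returns
// COUNT and SUM as strings, so convert them explicitly.
const apiCalls = Number(usage.rows[0].api_calls);
const storageBytes = Number(usage.rows[0].storage);
const now = Math.floor(Date.now() / 1000);
const subscription = await getSubscription(tenantId);
const apiItem = subscription.items.find((i) => i.price.id.includes("api"));
const storageItem = subscription.items.find((i) => i.price.id.includes("storage"));
if (apiItem) {
await reportUsage(apiItem.id, apiCalls, now);
}
if (storageItem) {
// Bill storage in whole gigabytes, rounded up
await reportUsage(storageItem.id, Math.ceil(storageBytes / (1024 * 1024 * 1024)), now);
}
}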
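Because "increment" adds to the running total, a retried or re-run report can bill the same usage twice. One mitigation, sketched here with a hypothetical helper name, is to derive a deterministic idempotency key per subscription item and day and pass it in the request options of the Stripe call as `{ idempotencyKey }`. Stripe keeps idempotency keys for a limited time (at least 24 hours) and rejects a reused key whose parameters differ, so this fits a job that reports one finalized total per day.

```typescript
// Build a stable key so that retries of the same daily report are deduplicated.
// The "usage:" prefix and key layout are illustrative choices.
function usageIdempotencyKey(subscriptionItemId: string, day: Date): string {
  // toISOString is always UTC, so the key does not depend on server timezone.
  const isoDay = day.toISOString().slice(0, 10);
  return `usage:${subscriptionItemId}:${isoDay}`;
}

// Any time on the same UTC day yields the same key.
const exampleUsageKey = usageIdempotencyKey("si_123", new Date("2024-05-01T13:45:00Z"));
```

The key would be supplied as the final options argument of the usage-reporting call. Treat the exact call site as an assumption to verify against the Stripe SDK version you run.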
Testing Billing Code
Billing code needs more test coverage than any other part of your application. A bug in a dashboard component shows incorrect data. A bug in billing code costs money.
// Test every state transition
describe("Subscription Lifecycle", () => {
it("grants access when subscription is active", async () => {
await createSubscription({ status: "active", plan: "growth" });
expect(await checkAccess(tenantId)).toBe("full");
expect(await checkEntitlement(tenantId, "advanced_analytics")).toBe(true);
});
it("maintains access during 7-day grace period", async () => {
await createSubscription({ status: "past_due", periodEnd: daysAgo(5) });
expect(await checkAccess(tenantId)).toBe("full");
});
it("restricts to read-only after 7-day grace period", async () => {
await createSubscription({ status: "past_due", periodEnd: daysAgo(10) });
expect(await checkAccess(tenantId)).toBe("read_only");
});
it("locks account after 14 days past due", async () => {
await createSubscription({ status: "past_due", periodEnd: daysAgo(15) });
expect(await checkAccess(tenantId)).toBe("locked");
});
it("grants new entitlements immediately after an upgrade", async () => {
await createSubscription({ status: "active", plan: "starter" });
expect(await checkEntitlement(tenantId, "api_access")).toBe(false);
await upgradeSubscription(tenantId, "price_growth_monthly");
expect(await checkEntitlement(tenantId, "api_access")).toBe(true);
});
});
Use Stripe's test mode and test clocks for end-to-end billing tests. Test clocks let you simulate the passage of time: advance a subscription by 30 days and verify renewal behavior without waiting.
When to Apply This
- You're building a SaaS with paid subscriptions and need production-grade billing
- Your current billing implementation has known bugs or revenue leakage
- You're migrating from a custom payment integration to Stripe
- You need usage-based or hybrid pricing models
When NOT to Apply This
- Pre-revenue MVP validating product-market fit: use Stripe Checkout with minimal code
- One-time payment products (not subscriptions): Stripe Checkout handles this without custom architecture
- Marketplace with complex payouts: consider Stripe Connect instead
