TL;DR
A fintech startup with 8 engineers and $3M ARR was planning a 6-month microservices migration. The "problems" they were solving... deployment coupling, scaling issues, team autonomy... had simpler solutions. We implemented those instead: feature flags, database read replicas, and team-based code ownership. Total time: 3 weeks. They're at $12M ARR now, still on the monolith, shipping features 4x faster than competitors who went microservices.
Part of the SaaS Architecture Decision Framework ... a comprehensive guide to architecture decisions from MVP to scale.
The Call That Started It
The CTO reached out after reading my piece on boring technology. His engineering team had convinced the board that microservices were necessary for the next phase of growth.
"We're planning a 6-month architecture overhaul," he said. "Before we commit, I want a second opinion."
The plan: decompose their Python/Django monolith into 12 services. Payment processing. User management. Notifications. Analytics. The works.
The justification:
- "Deployments are too risky... one change affects everything"
- "We can't scale specific parts of the system"
- "Teams step on each other's toes"
- "It's industry best practice"
I asked one question: "What's the actual problem you're solving?"
Silence. Then: "All of the above?"
The Real Problems
After two days of code review and team interviews, the picture was clearer.
Problem 1: Deployment Fear
They deployed weekly because deployments were scary. One bad merge had caused a 4-hour outage three months prior. The team was traumatized.
Root cause: No feature flags. No gradual rollout. All-or-nothing deployments.
Microservices solution: Isolate services so a bad deploy only affects one domain.
Simpler solution: Feature flags + deployment automation. Deploy daily, roll back in seconds.
Problem 2: Database Bottleneck
Their analytics queries were slow. The main PostgreSQL database was hitting CPU limits during business hours.
Root cause: Heavy reporting queries running against the transactional database.
Microservices solution: Separate analytics service with its own database.
Simpler solution: Read replica for analytics. Takes an afternoon to set up.
Problem 3: Team Conflicts
Two teams kept breaking each other's code. The payments team would change a shared model, and the onboarding team's tests would fail.
Root cause: No clear ownership boundaries. Shared models with unclear contracts.
Microservices solution: Separate services with explicit APIs.
Simpler solution: Module boundaries within the monolith. Interface contracts. Code ownership files.
Problem 4: "Best Practice"
They'd read about how Netflix, Uber, and Amazon use microservices.
Root cause: Pattern matching to companies 1000x their size.
Reality: Netflix has 2000+ engineers. They have 8.
The Math That Changed Their Mind
I walked through the real cost of the microservices migration.
Engineering Time
| Task | Estimated Weeks | Engineers |
|---|---|---|
| Service decomposition | 8 | 4 |
| API design and implementation | 4 | 3 |
| Data migration and sync | 6 | 2 |
| Infrastructure (K8s, service mesh) | 6 | 2 |
| Testing and validation | 4 | 4 |
| Total | 28 | |
At their burn rate, 28 weeks of multi-engineer work is roughly $250-300K in salary alone. Plus opportunity cost... 6 months of features not shipped.
Ongoing Overhead
Microservices aren't free to maintain:
| New Requirement | Monthly Cost |
|---|---|
| Kubernetes cluster | $2-5K |
| Service mesh (Istio/Linkerd) | Engineering time |
| Distributed tracing | $500-2K |
| Log aggregation (12 services) | $1-3K |
| On-call complexity | Burnout |
They'd be spending $5-10K/month on infrastructure they didn't need, plus significant engineering overhead.
The Hidden Cost: Velocity
Here's what microservices advocates don't mention: cross-service features are slower to ship.
Monolith feature: Change database schema, update model, update API, deploy. One PR, one review, one deploy.
Microservices feature: Change schema in service A, update service A API, update service B to call new API, update service C to consume event, deploy all three in order, hope nothing breaks in between.
For a team of 8, this overhead dominates. Every feature touching multiple domains takes 2-3x longer.
What We Did Instead
Week 1: Feature Flags and Deployment Safety
We integrated LaunchDarkly (a self-hosted flag service would have worked too, but speed mattered).
```python
# Before: all-or-nothing feature deployment
def process_payment(user, amount):
    # New payment flow - deployed to everyone or no one
    return new_payment_processor.charge(user, amount)


# After: gradual rollout with instant rollback
def process_payment(user, amount):
    if feature_flags.is_enabled('new_payment_flow', user_id=user.id):
        return new_payment_processor.charge(user, amount)
    return legacy_payment_processor.charge(user, amount)
```
Result: Deploy daily. Roll back a feature in seconds. No more deployment fear.
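Under the hood, a percentage rollout is just a stable hash of the user ID bucketed against a threshold. Here's a minimal sketch of that idea... a hand-rolled `FeatureFlags` helper, not LaunchDarkly's actual API:

```python
import hashlib


class FeatureFlags:
    """Sketch of percentage-based rollout (illustrative, not LaunchDarkly)."""

    def __init__(self, rollout_percentages: dict[str, int]):
        # Maps flag name -> rollout percentage in 0..100
        self._rollouts = rollout_percentages

    def is_enabled(self, flag: str, user_id: str) -> bool:
        pct = self._rollouts.get(flag, 0)  # unknown flags default to off
        # Hash flag + user so each user gets a stable on/off decision
        # that stays consistent as the percentage ramps up.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100  # stable bucket in 0..99
        return bucket < pct


flags = FeatureFlags({'new_payment_flow': 10})  # 10% rollout
flags.is_enabled('new_payment_flow', user_id='user-42')
```

The key property is determinism: a user who sees the new flow at 10% keeps seeing it at 25%, so ramping up never flip-flops anyone's experience.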
Week 1: Read Replica for Analytics
We spun up a PostgreSQL read replica and pointed all reporting queries at it.
```python
# Database router for Django
class AnalyticsRouter:
    def db_for_read(self, model, **hints):
        if model._meta.app_label == 'analytics':
            return 'replica'
        return 'default'

    def db_for_write(self, model, **hints):
        return 'default'
```
Cost: $200/month for the replica.
Result: Analytics queries no longer impacted transactional performance. P99 latency on the main database dropped 40%.
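Wiring this up uses Django's standard multi-database settings: the replica becomes a second `DATABASES` entry, and `DATABASE_ROUTERS` tells Django to consult the router. A sketch, with placeholder hostnames and an assumed `app/routers.py` location for the router class:

```python
# settings.py (sketch): register the replica alongside the primary and
# point Django at the router. Hostnames and the dotted router path are
# placeholders for this example.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'app',
        'HOST': 'primary.db.internal',  # transactional primary
    },
    'replica': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'app',
        'HOST': 'replica.db.internal',  # read-only analytics replica
    },
}

DATABASE_ROUTERS = ['app.routers.AnalyticsRouter']
```

Replication lag is the one caveat: analytics tolerates data that's seconds stale, which is exactly why this split is safe.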
Week 2: Module Boundaries
We drew boundaries within the monolith. No code changes... just documentation and ownership.
```
app/
├── payments/           # Team: Payments
│   ├── models.py
│   ├── services.py     # Public interface
│   └── internal/       # Don't import from outside
├── onboarding/         # Team: Growth
│   ├── models.py
│   ├── services.py
│   └── internal/
├── shared/             # Explicit shared code
│   ├── models.py       # Shared models (minimal)
│   └── interfaces.py   # Contracts between modules
└── CODEOWNERS          # GitHub ownership file
```
Rules:
- Import only from `services.py` or `shared/`
- Never import from another module's `internal/`
- Shared models require approval from both teams
- `CODEOWNERS` enforces reviews
```
# CODEOWNERS
/app/payments/   @payments-team
/app/onboarding/ @growth-team
/app/shared/     @payments-team @growth-team
```
Result: Team conflicts dropped to near-zero. Clear ownership. PR reviews enforced by GitHub.
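Documentation-only boundaries can also be backed by a cheap automated check. Here's a rough CI-style script that flags imports of another module's `internal/`... the module names mirror the layout above, and a dedicated tool like import-linter does this more robustly:

```python
import pathlib
import re

# Matches "from app.<module>.internal ..." or "import app.<module>.internal..."
INTERNAL_IMPORT = re.compile(
    r'from\s+app\.(\w+)\.internal|import\s+app\.(\w+)\.internal'
)


def find_violations(app_root: str) -> list[str]:
    """Return lines that import another module's internal/ package."""
    violations = []
    for path in pathlib.Path(app_root).rglob('*.py'):
        # The module a file belongs to is its top-level directory
        owner = path.relative_to(app_root).parts[0]
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            match = INTERNAL_IMPORT.search(line)
            if match and (match.group(1) or match.group(2)) != owner:
                violations.append(f'{path}:{lineno}: {line.strip()}')
    return violations
```

Run it in CI and fail the build on any violation; the boundary then costs nothing to police.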
Week 3: Interface Contracts
For the few places where modules truly needed to communicate, we defined explicit contracts.
```python
# app/shared/interfaces.py
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class PaymentResult:
    success: bool
    transaction_id: str | None
    error_message: str | None


class PaymentServiceInterface(ABC):
    @abstractmethod
    def charge(self, user_id: str, amount_cents: int) -> PaymentResult:
        """Charge a user. Returns a PaymentResult."""
```

```python
# app/payments/services.py
class PaymentService(PaymentServiceInterface):
    def charge(self, user_id: str, amount_cents: int) -> PaymentResult:
        # Implementation
        ...
```

```python
# app/onboarding/services.py
from app.shared.interfaces import PaymentServiceInterface


class PaymentFailedError(Exception):
    pass


class OnboardingService:
    def __init__(self, payment_service: PaymentServiceInterface):
        self._payment = payment_service

    def complete_signup(self, user_id: str, plan):
        # `plan` is a plan object carrying pricing.
        # Use the interface, not the implementation.
        result = self._payment.charge(user_id, plan.price_cents)
        if not result.success:
            raise PaymentFailedError(result.error_message)
```
This is the microservices benefit (explicit contracts, independent development) without the overhead (network calls, deployment coordination, infrastructure).
Six Months Later
The results speak for themselves.
Deployment Frequency
- Before: Weekly (scared)
- After: Daily (confident)
Incident Rate
- Before: 1-2 outages/month
- After: 1 outage in 6 months (unrelated to architecture)
Feature Velocity
- Before: 2-3 features/sprint
- After: 5-6 features/sprint
Infrastructure Cost
- Before: $8K/month
- After: $8.5K/month (+$500 for read replica and feature flags)
Engineering Headcount
- Before: 8 engineers
- After: 8 engineers (no infrastructure team needed)
Revenue
- Before: $3M ARR
- After: $12M ARR (12 months later)
They didn't need microservices. They needed discipline.
When Microservices Actually Make Sense
I'm not anti-microservices. I'm anti-premature-microservices.
Microservices make sense when:
1. You have 50+ engineers
At that scale, communication overhead dominates. Microservices let teams work independently. Below 50, the overhead isn't worth it.
2. Services have genuinely different scaling requirements
If your payment processing needs 10x the compute of user management, separate services make sense. But "might need to scale differently someday" isn't a reason.
3. You need different technology stacks
If your ML team needs Python and your API team needs Go, microservices let them coexist. But if everyone uses Python, a monolith is simpler.
4. You have dedicated platform engineering
Microservices require infrastructure: service discovery, distributed tracing, log aggregation, deployment orchestration. Someone has to build and maintain that. If you don't have a platform team, you're signing your product engineers up for infrastructure work.
5. You're breaking up a genuinely problematic monolith
Sometimes monoliths become unmaintainable. But "unmaintainable" means: deploy takes hours, tests take hours, no one understands the full system. Not: "we have some merge conflicts."
Signs you don't need microservices:
- Team size under 30
- Deploy takes under 30 minutes
- Single business domain
- No dedicated platform engineering
- Scaling is handled by vertical scaling or read replicas
- Problems can be solved with feature flags, better testing, or code organization
The Conversation I Have Too Often
Here's how the microservices conversation usually goes:
Startup: "We need to migrate to microservices."
Me: "Why?"
Startup: "Deployments are risky."
Me: "Have you tried feature flags?"
Startup: "No, but microservices would..."
Me: "Have you tried feature flags?"
Startup: "... No."
Me: "Let's try feature flags."
Two weeks later, the "problem" is solved. Six months of engineering time saved.
The same conversation happens with:
- "We need Kubernetes" → Have you tried a managed container service?
- "We need event sourcing" → Have you tried a transaction log?
- "We need GraphQL" → Have you tried REST with sparse fieldsets?
The pattern: complex solutions to simple problems.
The $500K They Didn't Spend
Let's total it up:
| Avoided Cost | Amount |
|---|---|
| Engineering time (6 months × 4 engineers) | $300K |
| Infrastructure overhead (12 months) | $60-120K |
| Opportunity cost (features not shipped) | $100K+ |
| Total saved | $460-520K+ |
And that's conservative. The real cost of shipping 4x slower for a year is incalculable in a competitive market.
The Lesson
The best architecture is the simplest one that solves your actual problems.
Not the problems you might have at 10x scale. Not the problems Netflix has. Not the problems the conference speaker had.
Your actual problems. Today.
When I work with startups, the first question is always: "What problem are we actually solving?" The second question is: "What's the simplest solution that solves it?"
Usually, the answer isn't a 6-month migration. It's a 3-week improvement to what you already have.
Considering a major architecture change? Before you commit, let's talk. I've helped startups avoid expensive migrations and find simpler solutions to their scaling challenges. Sometimes you need microservices. Usually, you need better use of what you have.
- Technical Advisor for Startups ... Architecture review and guidance
- Full-Stack Development for Startups ... Building scalable monoliths
Continue Reading
This post is part of the SaaS Architecture Decision Framework ... covering multi-tenancy, deployment models, database scaling, and cost optimization from MVP to $1M ARR.
More in This Series
- Multi-Tenancy with Prisma & RLS ... Database isolation patterns
- Zero to 10K MRR SaaS Playbook ... Early-stage architecture
- Boring Technology Wins ... Technology selection philosophy
- Tech Stack as Capital Allocation ... Making stack decisions like investments
Ready to make better architecture decisions? Work with me on your SaaS architecture.
