How I Think

Opinions from 28 years of production systems—backed by code, not slides. How I approach architecture, privacy, and building systems that operate at scale.

Constraint-Driven Design

Hard constraints don't limit architecture—they clarify it. Start with the constraint that scares you.

The best architectural decisions I've made started with a constraint that felt impossible. Not a preference, not a goal—a hard line that couldn't be crossed.

Constraints force clarity

When I built Hive, the constraint was clear: manage multiple Docker Swarm clusters without ever exposing their APIs externally. No inbound firewall rules. No VPNs. No exceptions.

That constraint killed the obvious approach (direct API calls) and forced something better: an agent architecture where connections are outbound-only, jobs are cryptographically signed, and a compromised control plane can't directly touch running containers.

The architecture didn't emerge despite the constraint. It emerged because of it.
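A minimal sketch of what an outbound-only agent checking a signed job might look like. It is not Hive's actual implementation: the function names, the job fields, and the `fetch_next_job` callable (standing in for an outbound request to the control plane) are all hypothetical, and HMAC-SHA256 is used only because it ships with the Python standard library.

```python
import hashlib
import hmac
import json


def sign_job(job: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of the job."""
    payload = json.dumps(job, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_job(job: dict, signature: str, key: bytes) -> bool:
    """Agent-side check: recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_job(job, key), signature)


def agent_poll(fetch_next_job, key: bytes):
    """Pull one job over an outbound call and accept it only if the signature holds.

    fetch_next_job is a hypothetical callable that returns (job, signature),
    or None when there is nothing to do. The agent never listens for inbound
    connections; it only asks.
    """
    item = fetch_next_job()
    if item is None:
        return None
    job, signature = item
    if not verify_job(job, signature, key):
        return "rejected"
    return f"accepted:{job['action']}"
```

HMAC is symmetric, so anyone holding the key can also sign. For a compromised control plane to be unable to forge jobs, an asymmetric scheme such as Ed25519 would be needed, with the private key held somewhere the control plane cannot reach. The verify-before-run shape stays the same.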

Three constraints that shaped three platforms

Hive: "Multi-cluster management without exposing Docker APIs" → Agent architecture with outbound-only connections. Blast radius limited by design.

Boditree: "Zero-knowledge encryption—server never sees plaintext" → Client-side encryption with device-owned keys. Privacy isn't a policy; it's a mathematical guarantee.

Locactus: "You don't scroll, you walk" → Boom moments (contextual notifications) over feeds. The app disappears until it matters.

The question to ask

Before sketching boxes and arrows, ask: "What's the constraint that would make the trivial approach impossible?"

If you don't have one, you probably haven't understood the problem deeply enough. Find the constraint. Let it force the architecture.

Observability is Infrastructure

You can't manage what you can't see. That's not a slogan—it's an architecture decision.

Observability isn't a feature you add after launch. It's infrastructure you build from day one—and it probably needs more than one database.

One database isn't enough

Hive runs five backing services. Four are data stores, each optimized for a different observability concern, and the fifth is a message broker:

- MongoDB for audit trails and deployment history
- Prometheus for time-series metrics
- Elasticsearch for searchable logs
- Neo4j for service dependency graphs
- RabbitMQ for job coordination

Could I have crammed everything into Postgres? Technically. But a graph query ("what breaks if this node dies?") and a time-series query ("what was CPU usage at 3am?") are fundamentally different operations. Using the right tool for each concern isn't over-engineering—it's acknowledging that observability has multiple dimensions.
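To show why the graph question has a different shape from a time-series lookup, here is a rough sketch of "what breaks if this node dies?" as a traversal over reversed dependency edges. In a graph database this would be a traversal query; the plain-Python version below, with its hypothetical `impacted_by` function and example service names, only illustrates the operation.

```python
from collections import deque


def impacted_by(failed: str, depends_on: dict[str, set[str]]) -> set[str]:
    """Return every service that depends, directly or transitively, on `failed`."""
    # Invert the edges: for each dependency, record who relies on it.
    dependents: dict[str, set[str]] = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk outward from the failed node.
    impacted: set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service != failed and service not in impacted:
                impacted.add(service)
                queue.append(service)
    return impacted


# Hypothetical topology: web -> api -> {db, cache}, worker -> db
depends_on = {
    "web": {"api"},
    "api": {"db", "cache"},
    "worker": {"db"},
    "db": set(),
    "cache": set(),
}
print(sorted(impacted_by("db", depends_on)))  # ['api', 'web', 'worker']
```

The answer depends on following relationships an unknown number of hops, which is awkward to express as a filter over timestamped rows.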

The three questions

Every system I build needs to answer:

1. What's happening right now? (Metrics)
2. What happened before? (Logs, audit trail)
3. What depends on what? (Relationships)

If you can't answer all three, you're not operating the system—you're hoping it keeps running.

Build visibility before you need to debug

The worst time to instrument your system is during an incident. The second worst time is "after launch." Plan for observability in the architecture phase, not the "hardening" phase.

Observability isn't overhead. It's the difference between "this service is running" and "I understand what this service is doing."

Make Invisible Things Visible

Systems fail silently. Build tools that surface what's actually happening—before someone asks.

The most dangerous state in any system is "silently wrong." Running but drifting. Deployed but misconfigured. Healthy according to the health check, broken according to users.

Drift detection changed how I think

Hive's drift detection compares desired state (what you said you wanted) against actual state (what's running). The gap between them is drift—and it's usually invisible until something breaks.

Before drift detection, I'd deploy a service and trust it stayed configured correctly. After drift detection, I could see when someone manually edited a container, when a secret expired, when a replica count drifted from spec. The system surfaced what I couldn't see by staring at logs.
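At its core, the comparison can be sketched as a field-by-field diff between two snapshots. This is a simplified illustration, not Hive's code: the flat dictionary layout, the field names, and the `ABSENT` marker are assumptions made for the example.

```python
ABSENT = "<absent>"  # marker for a field present on one side only


def detect_drift(desired: dict, actual: dict) -> dict:
    """Return {field: (desired_value, actual_value)} for every field that differs."""
    drift = {}
    for field in sorted(desired.keys() | actual.keys()):
        want = desired.get(field, ABSENT)
        have = actual.get(field, ABSENT)
        if want != have:
            drift[field] = (want, have)
    return drift


# Hypothetical service spec versus what is actually running.
desired = {"image": "web:1.4.2", "replicas": 3, "env.LOG_LEVEL": "info"}
actual = {
    "image": "web:1.4.2",
    "replicas": 2,
    "env.LOG_LEVEL": "debug",
    "env.DEBUG": "1",
}

for field, (want, have) in detect_drift(desired, actual).items():
    print(f"{field}: desired={want!r} actual={have!r}")
```

An empty result means the running state matches the spec. Anything else is a concrete, reportable gap rather than something to infer from logs.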

Make the implicit explicit

Boditree's Growth Tree turns private journal data into visible patterns. Users don't just write entries—they see roots (recurring themes), branches (categories), and leaves (individual entries). The metaphor makes the invisible visible.

Locactus makes interest vectors explicit. Users see why they're being recommended something—not a black box, but a transparent match between their interests and an event's tags.
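One way a transparent match could work is to return the reasons alongside the score, so the interface can show them. The sketch below assumes interests are stored as tag weights and that the score is the sum of weights for shared tags. Both are illustrative assumptions, as are the `Match` type and `explain_match` function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    score: float
    shared_tags: tuple[str, ...]  # the "why" that can be shown to the user


def explain_match(interests: dict[str, float], event_tags: set[str]) -> Match:
    """Score an event against a user's interest weights and keep the overlap."""
    shared = tuple(sorted(tag for tag in event_tags if tag in interests))
    score = sum(interests[tag] for tag in shared)
    return Match(score=score, shared_tags=shared)


# Hypothetical user and event.
interests = {"jazz": 0.5, "street-food": 0.25, "climbing": 0.25}
event_tags = {"jazz", "street-food", "late-night"}

match = explain_match(interests, event_tags)
print(match.shared_tags)  # ('jazz', 'street-food')
print(match.score)        # 0.75
```

Because the explanation is part of the return value, it cannot drift from the logic that produced the recommendation.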

The question I ask every project

"What's true about this system that no one can see without asking?"

Then I build a dashboard, a graph, a detection system—something that surfaces that truth automatically. Because if you can only see it by asking, you won't ask until it's too late.

The difference between "running" and "operating" is visibility into state.

Separation of Concerns at Scale

Work at the right level of abstraction. Delegate everything else ruthlessly.

The hardest architectural skill isn't building things. It's knowing which things your layer should own—and which things should belong to something else.

Work at the right level

Hive operates exclusively at the service level. It never touches individual containers. Why? Because Docker Swarm already handles container scheduling, replication, and health checks. If I duplicated that logic, I'd be fighting the tool instead of leveraging it.

The principle: "Hive defines desired state. Swarm enforces it."

That separation means Hive can't micromanage containers—which sounds limiting until you realize it also means Hive can't break Swarm's scheduling, can't conflict with its health checks, can't create edge cases that only appear when both systems disagree.

Each service owns one truth

In Locactus, the Location Engine knows nothing about interests. The Orchestration Service knows nothing about event storage. Each service owns exactly one concern:

- Location Engine: "Where is this user?"
- Interest Service: "What does this user care about?"
- Orchestration: "Does this user + this event = a match?"

When concerns bleed across boundaries, debugging becomes archaeology. When boundaries are clean, debugging becomes binary: either this service did its job or it didn't.
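The boundaries above can be sketched as narrow interfaces that the orchestration layer composes without knowing how either one works. Every name here (`current_cell`, `interests`, the `Event` fields, the idea of a coarse location "cell") is hypothetical and chosen only to keep the example small.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Event:
    event_id: str
    cell: str                 # coarse location identifier (assumption)
    tags: frozenset[str]


class LocationEngine(Protocol):
    """Answers only: where is this user?"""
    def current_cell(self, user_id: str) -> str: ...


class InterestService(Protocol):
    """Answers only: what does this user care about?"""
    def interests(self, user_id: str) -> frozenset[str]: ...


class Orchestrator:
    """Answers only: does this user plus this event equal a match?

    Events are passed in, so this class knows nothing about event storage.
    """

    def __init__(self, location: LocationEngine, interests: InterestService) -> None:
        self._location = location
        self._interests = interests

    def is_match(self, user_id: str, event: Event) -> bool:
        nearby = self._location.current_cell(user_id) == event.cell
        interested = bool(self._interests.interests(user_id) & event.tags)
        return nearby and interested
```

Because each dependency answers a single question, either one can be replaced with a stub in a test, which is what makes the "did this service do its job or not" check practical.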

The question

Before adding functionality to an existing service, ask: "Should this layer own this concern?"

If the answer involves "well, it's convenient because..." you're probably violating a boundary. Convenience now is complexity later.

Privacy as Architecture, Not Compliance

Build systems that CAN'T surveil, even if compromised. If privacy depends on policy, it's permission.

Most "privacy-first" systems are actually "privacy-by-policy" systems. The data exists, the access is possible, and privacy depends on someone choosing not to look.

That's not privacy. That's permission.

Build systems that can't surveil

Boditree uses zero-knowledge architecture. The server stores encrypted journal entries but never holds the decryption keys—those live on the user's device. If we're breached, attackers get encrypted blobs. If a rogue employee wants to read journals, they can't. The architecture makes surveillance impossible, not just prohibited.

The trade-off is real: we can't do server-side AI analysis on journal content (yet). But "privacy-first" means accepting constraints, not routing around them.

Privacy is also about dignity

Technical privacy is necessary but not sufficient. Boditree will never show "Your partner hasn't journaled in 5 days." That's engagement shaming disguised as a feature. Some couples journal asymmetrically—one partner writes daily, the other writes monthly. The system shouldn't weaponize that difference.

Locactus never resells raw location data. The business model doesn't depend on surveillance capitalism. Users choose their visibility level (Private/Friends/Public) explicitly—no dark patterns, no defaults that benefit the platform.

The test

Ask: "If our database leaked tomorrow, what would attackers learn about our users?"

If the answer is "everything they ever did," you've built surveillance infrastructure with a privacy policy draped over it. If the answer is "encrypted blobs and hashed identifiers," you've built privacy as architecture.
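As one illustration of the "hashed identifiers" half of that answer, a keyed hash can replace a raw identifier before it is stored. This is a sketch under assumptions, not a description of how Boditree or Locactus handle identifiers: the `pseudonymize` function, the normalization step, and the `pepper` secret (which would have to live outside the database for this to help in a leak) are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize(user_identifier: str, pepper: bytes) -> str:
    """Return a stable, keyed HMAC-SHA256 digest to store instead of the raw identifier.

    The pepper is a secret kept outside the database (assumption). Without it,
    a leaked digest cannot be checked against a list of guessed identifiers.
    """
    normalized = user_identifier.strip().lower().encode("utf-8")
    return hmac.new(pepper, normalized, hashlib.sha256).hexdigest()


# Hypothetical values for illustration only.
pepper = b"load-this-from-a-secret-store"
stored_id = pseudonymize("Ana@example.com", pepper)

# The same user always maps to the same stored value, so lookups still work.
assert stored_id == pseudonymize("ana@example.com ", pepper)
```

A plain unkeyed hash of something guessable like an email address offers much weaker protection, because an attacker can hash candidate addresses and compare.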

Policy can be changed. Architecture can't—at least not easily.

Scope as Strategy

What you choose NOT to build defines your architecture as much as what you build.

Every PRD I write has a "Non-Goals" section. It's not a formality—it's the most strategic part of the document.

Non-goals prevent drift

Hive v1 explicitly excluded: multi-region deployment, external message queue clustering, custom scheduler implementation, and guaranteed message ordering across nodes.

Were those features useful? Absolutely. Could we have built them? Probably. But listing them as non-goals did something important: it gave the team permission to say "no" without having the argument every sprint.

Non-goals aren't "things we don't want." They're "things we're deliberately not doing in this phase so we can do other things well."

Density before breadth

Locactus is launching in five UK cities, not globally. The explicit strategy: "Better to own Shoreditch than sprinkle across London."

Social platforms live or die on density. A thousand users spread across fifty cities is worthless. A thousand users concentrated in one neighborhood creates network effects. The non-goal (global launch) enables the goal (local density).

Phase plans build on telemetry, not wishes

Boditree's MVP launches with basic on-device NLP. Phase 2 adds Trusted Execution Environments for better AI analysis. But Phase 2 doesn't start until Phase 1 gives us telemetry on what users actually want.

Too many roadmaps are wish lists. Good roadmaps are hypotheses: "If Phase 1 shows X, we'll build Y in Phase 2." The scope of future phases depends on what you learn, not what you imagined.

The discipline

For every feature you add to a PRD, ask: "What are we NOT building so we can build this?"

If you can't answer, you haven't made a decision. You've just made a list.

Want to discuss these ideas?

I'm always happy to have conversations about platform engineering, architecture decisions, and how to build systems that last.
