Part 4 of 5: Organizational Structures for AI-Native Development
If you ask most CFOs whether three senior engineers should work on one feature together, the instinctive answer is no.
It sounds like waste.
That reaction makes sense if coding is the bottleneck. But in AI-native teams, code generation is the fastest part of the loop. Hidden variance is what actually slows delivery down—and mob sessions are one of the few tools that reduce it early.
Most teams still use a simple cost model:
cost = person-hours
That model misses the expensive part. A better one:
total feature cost = labor + coordination + delay + rework
Labor is visible. Coordination, delay, and rework are invisible until the end of the sprint.
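To make the difference concrete, here is a minimal sketch in Python. Every number is illustrative, including the hourly rate and the per-day cost of delay; nothing here comes from a real project.

# A minimal sketch of the two cost models; all numbers are illustrative.

def naive_cost(person_hours, hourly_rate=100):
    # The visible model: labor only.
    return person_hours * hourly_rate

def total_feature_cost(labor_hours, coordination_hours, delay_days, rework_hours,
                       hourly_rate=100, delay_cost_per_day=500):
    # Labor is visible; coordination, delay, and rework usually are not.
    labor = labor_hours * hourly_rate
    coordination = coordination_hours * hourly_rate
    delay = delay_days * delay_cost_per_day
    rework = rework_hours * hourly_rate
    return labor + coordination + delay + rework

# The same feature under the two lenses:
print(naive_cost(person_hours=6))                        # 600
print(total_feature_cost(labor_hours=6, coordination_hours=4,
                         delay_days=4, rework_hours=3))  # 3300

The labor line is identical in both models; the gap comes entirely from the terms that never show up on a timesheet.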
When a "simple" feature suddenly takes five days, it's rarely because a developer typed slowly. It's because the team discovered uncertainty too late.
AI made this worse, not better. One developer can produce implementation-grade code in minutes. That compresses the coding phase and exposes the real bottleneck: decision quality under uncertainty. The faster code appears, the faster hidden assumptions collide with reality—the data model doesn't support the use case, the external API behaves differently in production, two modules use different domain language for the same concept.
Those aren't code speed problems. They're variance problems.
Last quarter, one of our teams needed to build a reporting feature that pulled data from three different modules. Seemed straightforward. A senior developer took it solo.
Day 1: implementation started, architectural friction appeared—the data access patterns across modules were inconsistent.
Day 2: async Slack thread with another engineer, direction updated.
Day 3: integration issue appeared when the module interfaces didn't compose cleanly.
Day 4: another "quick sync" with the domain expert.
Day 5: final implementation plus cleanup of the rework from day 2's wrong turn.
Five calendar days. The actual coding? Maybe six hours total. The rest was waiting, coordinating, and redoing work that was built on assumptions that turned out to be wrong.
Two weeks later, a similar cross-module feature came up. This time the team mobbed it. Three developers, one session.
Hour 1: the same kind of architectural friction appeared—immediately. But the person who owned the other module was sitting right there. Decision made in ten minutes instead of a day.
Hours 2-5: implementation with real-time validation.
Hour 6: integration completed and tested.
Same person-hours. One calendar day instead of five. Zero rework.
The mob didn't make anyone code faster. It front-loaded coordination so the team paid the uncertainty cost once, not repeatedly.

Do not mob everything. The signal isn't "this is hard." It's "this is uncertain."
Solo is usually better when the problem is well understood, boundaries are clear, failure blast radius is low, and one person can explain the full solution quickly.
Mob is usually better when multiple systems must compose in a new way, requirements emerge during implementation, architectural choices affect multiple domains, or wrong-path cost is high.
This is not "simple vs complex." This is "low variance vs high variance."
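If it helps to make the heuristic explicit, here is a minimal sketch of the decision as a checklist function. The four flags restate the signals above; they are judgment calls made at planning time, not computed metrics.

# A sketch of the solo-vs-mob heuristic; the flags mirror the signals above.

def suggest_mode(new_composition: bool,
                 emergent_requirements: bool,
                 cross_domain_architecture: bool,
                 high_wrong_path_cost: bool) -> str:
    # Any high-variance signal is a reason to mob; none means solo is fine.
    signals = [new_composition, emergent_requirements,
               cross_domain_architecture, high_wrong_path_cost]
    return "mob" if any(signals) else "solo"

print(suggest_mode(False, False, False, False))  # solo: bounded, well-understood work
print(suggest_mode(True, True, False, False))    # mob: new composition, emergent requirements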
The bad stereotype: three people watching one person type for six hours.
Here's what actually happens in a well-run mob. The driver controls the keyboard and AI prompts. Two navigators challenge assumptions and catch risks in real time. Driver rotates every 45-60 minutes so nobody zones out. The session starts with a hard objective and exit criteria—"we're done when the payment reconciliation module talks to the reporting module through the public API."
The critical distinction: if you leave with only action items, you ran a meeting. If you leave with integrated code and resolved decisions, you ran a mob session.
I've seen teams confuse the two for months. They sit together, talk through the problem, write up next steps, and go back to solo implementation. That's a planning meeting with extra chairs. A mob session produces working code. That's the bar.
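One lightweight way to hold that bar is to write the objective and exit criteria down before anyone opens an editor. Here is a sketch of what that could look like; the field names and the example criteria are illustrative, not a prescribed format.

# A sketch of a session charter written before the session starts.
# Field names and the example criteria are illustrative.

mob_session = {
    "objective": "Payment reconciliation module talks to the reporting "
                 "module through the public API",
    "exit_criteria": [
        "Integration test passes against the public API",
        "Open architectural decisions recorded, none deferred",
    ],
    "driver_rotation_minutes": 45,          # rotate every 45-60 minutes
    "roles": {"driver": 1, "navigators": 2},
}

def session_outcome(integrated_code: bool, resolved_decisions: bool) -> str:
    # Integrated code plus resolved decisions is a mob session; anything less was a meeting.
    return "mob session" if integrated_code and resolved_decisions else "meeting"

print(session_outcome(integrated_code=True, resolved_decisions=True))   # mob session
print(session_outcome(integrated_code=False, resolved_decisions=True))  # meeting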
The strongest teams combine both modes:
Solo for bounded implementation
Mob for high-variance integration points
Weekly integration checkpoints to catch drift early
The checkpoint question that matters isn't "What did you do?" It's "What uncertainty did we discover while building?"
That one question surfaces the work that needs coordination before it becomes rework.

Mob sessions usually fail for cultural reasons, not technical ones.
Traditional incentives reward visible individual output. Mob sessions optimize team-level outcomes. So teams get stuck in a contradiction: they say they want faster delivery and fewer integration surprises, but they measure utilization and individual throughput. They get exactly what they measure—local optimization, global delay.
If you want mob sessions to work, measure what they actually improve:
Lead time — idea to production
Rework rate — changes needed after merge
Interrupt count — "quick questions" across the team per week
Post-integration defects
Architectural drift incidents
If lead time drops and rework drops, the model is working—even if utilization optics look less "efficient." The CFO who sees three engineers on one feature and calls it waste is looking at labor cost. The CFO who sees one-day delivery instead of five is looking at total cost. Same data, different lens.
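A minimal sketch of what tracking these signals per feature could look like. The field names and numbers are illustrative; pull the real values from your own issue tracker and CI history.

# A sketch of per-feature tracking for the metrics above.
# Field names and numbers are illustrative.

from dataclasses import dataclass

@dataclass
class FeatureOutcome:
    lead_time_days: float          # idea to production
    rework_changes: int            # changes needed after merge
    interrupts: int                # "quick questions" across the team that week
    post_integration_defects: int
    drift_incidents: int

solo_baseline = FeatureOutcome(5.0, 4, 7, 2, 1)
mob_experiment = FeatureOutcome(1.0, 0, 1, 0, 0)

# The comparison that matters: lead time and rework, not utilization.
print(mob_experiment.lead_time_days < solo_baseline.lead_time_days)    # True
print(mob_experiment.rework_changes < solo_baseline.rework_changes)    # True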
Do not roll this out as doctrine. Run an experiment.
Pick 2-3 features with obvious cross-system uncertainty.
Run focused mob sessions only at decision-heavy points.
Keep solo execution for bounded tasks.
Compare lead time and rework against your recent baseline.
Keep what works, drop what doesn't.
Treat mob sessions as a precision instrument, not a religion.
The wrong question is: "Is putting three people on one feature efficient?"
The right question is: "What are we optimizing for?"
If you optimize for visible individual utilization, mob looks expensive. If you optimize for elapsed time, decision quality, and architectural coherence, mob is often the cheaper path.
In AI-native development, code is abundant. Correct decisions under uncertainty are scarce.
That is exactly what mob sessions buy.
Where has hidden variance hurt your team most lately—architecture, integration, or requirements discovery?
Coming in Part 5: We've covered process debt, intent articulation bottlenecks, economic shifts, and coordination patterns. The final question: how do you actually transform an organization? Not with a reorganization or a new process framework—with pocket deployments, documented failures, and explicit opt-in. We'll close with a practical roadmap for leaders ready to make the shift.