The traffic light that didn't exist

This happened. I was there.

A Physical AI team I worked with needed to demonstrate autonomous driving through a traffic-light-controlled intersection. At the start of the sprint, the systems and safety team asked the autonomy team for a couple of days to jointly work through the system diagram, do hazard analysis, and derive key requirements. The autonomy team's response: "We know what the requirement is. Stop at red, go at green. Why do we need to spend days discussing this? We'll just get to work."

So they did. The safety team drafted requirements independently. The autonomy team wrote code independently. Two weeks later, the car ran a red light. Then it ran another one. Two separate incidents, two completely different failure modes, both traceable to the same upstream decision not to sit down together.

This is what organizational debt looks like when it comes due.


Organizational debt is the accumulated misalignment — of beliefs, priorities, working assumptions, and undisclosed decisions — that builds up silently across teams during rapid development. Unlike technical debt, which is at least partially visible (the refactoring backlog, the verification gaps, the interfaces hacked together under schedule pressure), organizational debt is invisible until it isn’t. Teams believe they are aligned because no one has surfaced the disagreement. Decisions are being made continuously, by individuals, without the people who have a stake in those decisions even knowing they were made. Artifacts that should be jointly owned are authored in isolation. The cost of all this accumulates silently, until the bill arrives without warning.


Two incidents, one root cause

The first incident: Someone had updated a configuration parameter — the modeled length of the vehicle — without informing other teams. When the car came to a full stop at a red light, its front bumper now extended a couple of feet beyond the white stop line. From the autonomous system’s perspective, the car had entered the intersection and stopped there. The system’s logic for that situation was to clear the intersection so as not to block cross-traffic. So the car — stationary at a red light — drove forward into the intersection.

Nobody had actually defined what it meant to stop at a red light. Nobody had asked: what does it mean to be stranded in an intersection, and under what circumstances should the car clear it? The autonomy team assumed "stop at red" is self-evident and coded accordingly. The safety team drafted requirements without knowing the behavioral logic that was actually being implemented. "Stop at red, go at green" conceals a web of behavioral decisions that tend to stay hidden when teams work in parallel. Joint analysis surfaces things that parallel analysis misses.

I want to be careful about what lesson to draw, though. The autonomy team would be right to point out that this specific edge case — a config parameter change causing the car to misidentify its position relative to a stop line — is precisely the kind of thing that is hard to anticipate in a requirements session. The failure here was not that the teams failed to imagine this scenario. The failure was that a configuration change with system-wide behavioral implications happened with no visibility across teams, and with no forum where the question would have been raised: does this affect any existing logic? That is an organizational and process failure, not a requirements failure.

The second incident: The car was waiting at a red light when a bus crossed the intersection laterally, moving slowly enough that for an extended period it blocked the car’s sensors from seeing the traffic light. After some time without a visible signal, the autonomous system concluded there was no traffic light. It began moving into the intersection.

There were no requirements around this behavior. The decision to treat a persistently occluded signal as a non-existent one had been made by a programmer, embedded in C++ code, and was entirely unknown to the rest of the organization. Not because that programmer was careless — they used their judgment to solve the problem in front of them. But the organizational processes had no mechanism to surface that decision, review it, or ask whether anyone else had a stake in it.

These are not questions a programmer should be resolving alone while writing the code. Even when requirements exist, writing the code that implements them surfaces ambiguities and edge cases. There need to be processes for harvesting and resolving them in the right forum — not leaving in-the-moment decisions buried silently in the code. The shift toward AI-heavy and end-to-end stacks changes the vocabulary of these decisions, not the underlying pattern: somewhere in the organization, a consequential choice is being made about system behavior, and the question is whether the right people know about it and have had a chance to influence it.

The case for moving fast

The autonomy team’s argument for skipping the joint requirements session is not foolish. If the aim is to rapidly develop a capability for an upcoming demo, the signal about what works, what doesn’t, and what needs improving can plausibly be obtained faster by deploying early and iterating than by spending days drafting requirements. The real world surfaces edge cases that no whiteboard session would have found. Why produce a list that will certainly be incomplete?

This argument has merit in software products where failures are recoverable. In Physical AI, the failures happen in the real world, with consequences that are not a bug report. But there is a subtler problem with the argument, beyond the obvious one — a form of survivorship bias that is easy to miss.

The edge cases that a team identifies and addresses during development may not manifest in the real world — because they were prevented. They are invisible. What is visible is only what got through: the incidents, the near-misses, the failures. For every edge case that made it through to a real-world incident, the team may have anticipated and resolved ten others during development. It is incorrect to look at the one incident that got through and conclude that the hazard analysis and requirements barely helped.

There is also a compounding dimension that the “just iterate” argument misses. The first time teams sit down to jointly identify hazards and develop requirements, they will not be good at it. The session will be imperfect, hazards will be missed, the requirements structuring may be sub-optimal, and some of what they produce will be wrong. This is expected. What matters is that by doing it, they build a shared methodology — and a shared understanding of how to improve it. When an edge case surfaces in the real world that the team had missed, a mature organization asks: why did we miss this? What in our analysis, requirements, testing, or validation processes would need to change to catch this class of problem earlier? That question is only available to teams that have processes to interrogate. Teams that skipped the session have nothing to iterate on.

If teams never start working together because they believe it won’t unearth the edge cases that immediately matter, they never develop the crucial capability to get better at working together.

Organizational debt accumulates rationally

I want to be precise about this, because the standard diagnosis gets it wrong. Organizational debt does not accumulate because teams are dysfunctional or because leaders are negligent. It accumulates because teams are doing exactly what they’re supposed to do, in the absence of processes that force them to do it together.

The autonomy team is chasing capability — that is their job. The systems team is trying to lock in architecture and system boundaries, because downstream analyses depend on them. The safety team is building a safety concept and working out what evidence will eventually be required. The legal team is asking what happens when the AI encounters the specific scenarios currently circulating in regulatory discussions. Every team is doing necessary work. Every team is under real pressure. And in the absence of deliberately designed inter-team processes, the rational default for each of them is to optimize locally — to solve what’s in front of them and give partial service to what other teams need.

This is not negligence. It is the predictable output of an organizational system with no forcing function for collaboration.

When leaders get this wrong — and right

Most leaders understand, in the abstract, that inter-team processes matter. What they consistently underestimate is the sustained commitment required to build and maintain those processes under delivery pressure.

I have watched two versions of this play out.

In the first version, the leader tells the systems and safety teams that their work is critically important. When those team leads ask for his support in getting the autonomy team to actually engage, he asks them to hold on — the autonomy team is heads-down on an investor demo, and once that's done, they'll engage. They never do. The next milestone arrives with the same urgency and the same logic. The leader was not dishonest. He was managing short-term pressure one decision at a time, each individually defensible. The cumulative effect was that safety became a best-effort practice rather than a rigorous one.

In the second version, the CEO holds that the most important outcome — more important than the quality of any individual artifact — is building the organizational muscle for how teams work together. He makes clear, before the first demo and for every subsequent deployment, that every milestone will be accompanied by a systems analysis and a safety case, however lightweight. He explicitly notes that those artifacts are not expected to be production-quality. Rather, the expectation is that the organization needs to get into the groove — teams working together on shared artifacts, reviewing them, iterating, improving the process itself. When the autonomy team pushes back ("safety and systems are slowing us down"), he acknowledges the friction and holds the line anyway. He explains what the company is actually building: not just a capable system, but an organization that knows how to produce a safe one.

That organization eventually moves fast. When genuine pressure arrives, the inter-team processes do not collapse — they are the thing that allows the team to move quickly without losing integrity.

The difference between these two leaders is not their understanding of the problem. It is their willingness to absorb short-term friction in service of long-term organizational integrity. That is a harder thing to do than it sounds, because the short-term friction is visible and the long-term benefit is not.

The superhero trap

When organizational debt finally surfaces as a crisis, most programs respond the same way: find a senior person with broad credibility and institutional authority, and ask them to sort it out. This works. The senior person does sort it out, because they have the relationships and standing to cut across team boundaries and force alignment.

But they almost never do it by fixing the processes. They do it by being exceptional. And once the crisis passes, the organization returns to its previous ways of working, because those ways were never actually changed. The next crisis is already accumulating.

This pattern is seductive because it feels like a solution. It resolves the immediate problem and produces a hero narrative. What it does not do is build the institutional capability to avoid the next crisis.

The CEO in the second story understands this. He is not trying to avoid all crises. He is trying to build an organization that can defuse them without requiring a superhero — because superheroes are not scalable, and at-scale deployment is the whole point.

What actually fixes it

The fix is not an offsite, or a workshop, or a new standing meeting between teams. It is designing inter-team ways of working with the same deliberateness that engineers bring to system interfaces.

This means specifying a development process down to the sprint level: what each team will produce, what dependencies exist, which collaborative analyses are required, and which questions need a joint answer before the sprint closes. It means treating the first version of those processes as a draft, expected to be imperfect and revised. And it means that when the processes fail, as they will, the response is to interrogate the process and improve it, not abandon it.

The standard resistance to this is the accusation that any attempt to formalize inter-team collaboration is imposing a Waterfall model — a particularly effective critique in organizations that have internalized agile development as an identity. It is not entirely wrong. A rigid, stage-gated process would be the wrong answer. But using that critique to reject any attempt at structured collaboration is precisely how the debt compounds. The real challenge is building processes that are genuinely agile, ones that create the necessary coordination without killing the speed that makes iteration valuable.

The teams that get through this are not the ones that found the right process on the first try. They are the ones whose leaders made it unambiguous that working out these processes was part of the job, not a tax on real work but a component of it, and held that line long enough for the organization to internalize it. That is how an organization eventually gets to a place where a car doesn't drive through a red light just because a bus blocked its view of the signal: because someone, at some point during development, asked what happens when we can't see the light.

Organizational debt will eventually come due. The question is whether you address it deliberately, on your own terms, or whether it forces your hand when you can least afford it.