When people say AI will flatten companies, they usually mean fewer managers.
That misses what managers actually do in most organizations.
Middle management is often the execution runtime: the place where unclear priorities get serialized, missing context gets reconstructed, blocked work gets rerouted, and failed decisions get retried without a formal incident review.
Some of that work is status. Some of it is judgment. A lot of it is permissioning, memory, escalation, and repair.
If AI removes the layer without replacing the runtime, the company does not become flatter.
It becomes lossy.
Jack Dorsey and Roelof Botha’s “From Hierarchy to Intelligence” is interesting for that reason. The media reading is obvious: AI is coming for middle management. The better reading is architectural: the org chart was not just a reporting structure. It was an information-routing system.
Block is the company behind Square, Cash App, Afterpay, TIDAL, Bitkey, and Proto. The essay argues that large organizations added layers because humans have a limited span of control. As companies grow, information has to move up, down, and sideways. Humans became the coordination mechanism.
The provocative claim is that AI can change that constraint.
Maybe. But the production question is not whether managers disappear. The production question is where the runtime goes.
The work did not disappear
The runtime that middle management provided was never clean.
It lived in one-on-ones, Slack backchannels, planning meetings, escalation paths, customer calls, and the memories of people who knew why the last migration failed.
It was slow, political, uneven, and often frustrating. But it carried real state. It knew which customer promise mattered more than the roadmap, who could approve an exception, which team was about to collide with another, and when a plan needed to be repaired instead of defended.
This is the part many “flatter company” arguments skip.
The real unit of analysis is not the manager. It is the coordination work the manager was absorbing. You can remove the managerial layer, but the work does not disappear. It either becomes explicit machinery, or it leaks into meetings, stalled decisions, duplicated work, unsafe autonomy, and incidents nobody saw coming.
The same thing happens in software. If you remove the database because it feels heavyweight, state does not disappear. It leaks into logs, caches, browser sessions, duplicated API calls, and tribal knowledge. The system becomes simpler on the diagram and worse in production.
A flatter AI company has the same risk. Removing middle management only works if the work middle management performed reappears as explicit, inspectable system machinery.
The replacement is runtime infrastructure
A useful replacement for middle management would not be “more AI in the workflow.”
AI can summarize, retrieve, route, draft, compare, monitor, and execute. Those capabilities matter. But they do not automatically become a company runtime. Without explicit state and ownership, they become a faster way to create ambiguity.
If a company removes managers, it still has to answer the same operational questions: where priorities are stored, who is allowed to change them, how exceptions are routed, how conflicts are resolved, how decisions are audited, and who owns the outcome when an automated workflow does the wrong thing.
The replacement looks more like runtime infrastructure:
- shared state with provenance, freshness, and scope
- permission boundaries for who or what can read, decide, and mutate
- ownership for decisions, exceptions, and agent actions
- observability into work, risk, drift, and failure
- recovery paths when plans, customers, dependencies, or models change
- evaluation loops that turn outcomes back into corrected behavior
That is the bar: not whether the company has fewer managers, but whether the hidden coordination layer has become explicit enough to operate, inspect, and repair.
If the answer to those operational questions is “the model,” you do not have a system. You have a demo of a chatbot reading Slack.
Block and YC are pointing at the same missing layer
The useful part of Block’s thesis is not the labor-market prediction. It is the decomposition.
Block proposes a company organized around financial capabilities, company and customer world models, an intelligence layer, product interfaces, and clearer human owners. In software terms, that is not just a new org chart. It is an execution architecture:
- tools and services
- operational memory
- customer state
- orchestration
- product surfaces
- accountable human owners
The YC version of the AI-native company makes a similar move from another angle: make the company queryable, automate internal functions, deploy agents across product, support, ops, sales, and marketing, and close the loop between action and feedback.
Those are useful instincts. They point at the same missing object: not a chatbot, not a dashboard, and not a pile of automations, but a company runtime.
Whether Block can execute this is a separate question. The essay itself says the company is early and that parts will likely break before they work. YC’s examples are also operator stories and strategy arguments, not neutral proof that every company can become a ten-person empire.
The useful move is not to score the prediction. It is to notice the architecture. “AI-native company” only becomes real when artifacts become state, state connects to tools, tools connect to owners, and outcomes feed back into decisions.
Queryable is not reliable
The phrase “company world model” is easy to overread.
In this context, the company world model is not a mystical set of model weights that understands the business. It is closer to governed operational memory: the artifacts, events, entities, permissions, decisions, and traces that let a system reconstruct what is happening.
The same is true for the “queryable company” framing. A company that can answer questions from Slack, Linear, GitHub, Notion, call recordings, dashboards, and customer notes is not automatically an AI-native organization. It may just be a better search appliance over organizational exhaust.
A company memory that can answer questions is only useful if the system knows what kind of answer it is giving:
- freshness: is this still true?
- provenance: where did it come from?
- scope: which team, customer, product, region, or time period does it apply to?
- permission: who is allowed to see it, act on it, or change it?
- mutability: what is the source of truth, and how is it updated?
- conflict: what happens when Slack, Linear, GitHub, and the CRM disagree?
- retention: when should this disappear?
- redaction: what should never become general company memory?
- correction: how does the system learn that an answer caused the wrong action?
Without those properties, a queryable company becomes a more confident version of Slack search. It can retrieve context, but it cannot tell you whether that context is current, authorized, complete, or safe to use.
That may still demo well. Ask it “what is blocking the launch?” and it will produce a plausible answer from docs, tickets, and chat. But production questions are sharper: did it use the current launch plan or last week’s draft? Did it treat a brainstorm as a decision? Did it rank the loudest Slack conversation above the committed roadmap? Did it expose sensitive context to an agent that only needed project status?
The model output is not the product boundary. The state boundary is.
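To make the state boundary concrete, here is a minimal sketch of a memory record that carries a few of the properties listed above (provenance, scope, freshness, permission) and a check that runs before the record is allowed to back an answer. Every name here (MemoryRecord, usable, the role and scope strings) is a hypothetical illustration, not something taken from Block's or YC's material.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class MemoryRecord:
    """One piece of company memory plus the metadata needed to trust it."""
    claim: str
    source: str              # provenance: where the claim came from
    scope: str               # which team, customer, or product it applies to
    recorded_at: datetime    # when the claim was last confirmed
    max_age: timedelta       # freshness budget before it counts as stale
    readers: frozenset       # roles allowed to see it (hypothetical role names)


def usable(record: MemoryRecord, role: str, scope: str, now: datetime):
    """Return (ok, reason): may this record back an answer for this caller?"""
    if role not in record.readers:
        return False, "not permitted"
    if record.scope != scope:
        return False, "out of scope"
    if now - record.recorded_at > record.max_age:
        return False, "stale"
    return True, "ok"


# Usage: the same record is usable for one caller and refused for another.
now = datetime(2025, 1, 10, tzinfo=timezone.utc)
record = MemoryRecord(
    claim="Launch is blocked on the billing migration",
    source="tracker ticket (hypothetical)",
    scope="launch",
    recorded_at=datetime(2025, 1, 9, tzinfo=timezone.utc),
    max_age=timedelta(days=3),
    readers=frozenset({"pm"}),
)
print(usable(record, "pm", "launch", now))
print(usable(record, "support-agent", "launch", now))
```

The check is deliberately boring. The point is that retrieval returns a reason alongside the content, so a caller can tell "no answer" apart from "an answer you are not allowed to use" or "an answer that has expired."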
Closed loops need execution state
A company runtime does not just answer, “What happened?” It has to carry work across time.
It needs to know what was intended, what was decided, who owns it, what action was taken, what approval was required, what changed in the world, and whether the result was acceptable.
A summary is not a closed loop.
A closed loop has state:
- intent: what are we trying to accomplish?
- decision: what did we choose, and why?
- owner: who is accountable for the outcome?
- action: what tool, agent, or person changed something?
- gate: what required approval before execution?
- result: what happened after the action?
- evaluation: did it meet the expected bar?
- correction: what changes in the system because of what we learned?
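One way to read that list is as a small state machine: a loop is only closed when it has moved through every stage and left a trace at each one. The sketch below assumes a strict linear order and places the approval gate before the action, which is an interpretation rather than anything the list prescribes. Loop, Stage, and advance are hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    INTENT = 1
    DECISION = 2
    GATE = 3        # approval required before execution
    ACTION = 4
    RESULT = 5
    EVALUATION = 6
    CORRECTION = 7


@dataclass
class Loop:
    intent: str
    owner: str                      # the accountable human, set up front
    stage: Stage = Stage.INTENT
    history: list = field(default_factory=list)

    def advance(self, to: Stage, note: str) -> None:
        """Move exactly one stage forward and record why."""
        if to.value != self.stage.value + 1:
            raise ValueError(f"cannot go from {self.stage.name} to {to.name}")
        self.stage = to
        self.history.append((to, note))

    @property
    def closed(self) -> bool:
        # A loop counts as closed only once a correction has been recorded.
        return self.stage is Stage.CORRECTION


# Usage: skipping straight from a decision to a result is rejected.
loop = Loop(intent="cut refund latency", owner="support-lead")
loop.advance(Stage.DECISION, "auto-approve small refunds")
try:
    loop.advance(Stage.RESULT, "looks fine")
except ValueError as err:
    print(err)
```

A summary would happily jump from decision to result. The state machine refuses, which is the difference between an artifact and a closed loop.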
This is where Block’s failed-composition idea is valuable. If the intelligence layer tries to solve a customer moment and cannot because a capability is missing, that failure can become backlog evidence. That is a better input than the usual roadmap ritual where teams turn anecdotes, executive opinions, sales pressure, and dashboard anxiety into quarterly bets.
But this only works if the failure is observable. Without a trace, “the intelligence layer could not compose a solution” is just a vibe.
The same is true for every AI execution layer. It needs stop conditions, permission checks, approval gates, idempotency keys, retries, rollback paths, durable state transitions, audit logs, failure taxonomies, and human escalation paths.
Without that machinery, AI produces more artifacts: summaries, drafts, recommendations, tickets, dashboards. Useful, but not enough. The runtime starts when outcomes change the next decision, the next workflow, the next eval, or the next escalation path.
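Three of the mechanisms named in this section (approval gates, idempotency keys, and audit logs) fit in a few lines. The following is a minimal in-memory sketch under stated assumptions: Executor, ApprovalRequired, and the action names are hypothetical, and a real system would need durable storage, retries, and rollback, which are omitted here.

```python
from typing import Callable, Optional


class ApprovalRequired(Exception):
    """Raised when a gated action is attempted without an approver."""


class Executor:
    def __init__(self, needs_approval: set):
        self.needs_approval = needs_approval   # actions behind a gate
        self.results: dict = {}                # idempotency key -> result
        self.audit: list = []                  # (key, action, outcome) entries

    def run(self, key: str, action: str, fn: Callable[[], object],
            approved_by: Optional[str] = None) -> object:
        # Idempotency: a key that already succeeded is never executed again.
        if key in self.results:
            self.audit.append((key, action, "replayed"))
            return self.results[key]
        # Approval gate: gated actions need a named approver.
        if action in self.needs_approval and approved_by is None:
            self.audit.append((key, action, "blocked"))
            raise ApprovalRequired(action)
        result = fn()
        self.results[key] = result
        self.audit.append((key, action, "executed"))
        return result


# Usage: a refund is blocked, then approved, then safely retried.
calls = []

def issue_refund():
    calls.append("refund")
    return "refunded"

executor = Executor(needs_approval={"refund"})
try:
    executor.run("order-1-refund", "refund", issue_refund)
except ApprovalRequired:
    pass
executor.run("order-1-refund", "refund", issue_refund, approved_by="support-dri")
executor.run("order-1-refund", "refund", issue_refund, approved_by="support-dri")
print(executor.audit)
```

The retry does not issue a second refund, and the audit log shows all three attempts. That trace is what turns "the agent did something" into evidence a human can review.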
Custom agents create local runtimes
The problem compounds as companies add custom agents.
A support agent is not just a better helpdesk search box. It needs customer state, entitlement rules, refund authority, escalation boundaries, tone constraints, audit logs, and a way to turn bad answers into eval cases.
A sales agent needs account context, pricing permissions, CRM mutation rules, legal boundaries, handoff points, and memory of promises made during the deal.
A QA agent needs product state, test history, release risk, ownership, reproduction steps, and a path for failed checks to become backlog or eval evidence instead of disappearing into a report.
Each agent creates a local runtime: its own state, tools, permissions, failure modes, and human accountability boundary. Treat it like “automation” and it will leak. Treat it like infrastructure and you can inspect it, constrain it, recover it, and improve it.
This is why enterprise AI projects often fail when they bolt a chatbot onto old workflows. The missing part is not usually model fluency. It is the legible workflow state around the model: inputs, permissions, tools, approvals, outputs, feedback, and ownership.
A startup can sometimes build that faster because it has fewer legacy systems, fewer handoffs, and less political debt. But it does not get to skip the runtime. It just has a better chance of designing it before the mess calcifies.
Agents need execution models, not vibes. Companies do too.
Human accountability is still a boundary
A common mistake in AI-native org design is to replace management with ambient autonomy.
Everyone gets more context. Everyone has agents. Everyone can ask the company brain. Therefore, everyone can move faster.
Maybe. But autonomy without ownership creates unowned failure.
This is where directly responsible individuals (DRIs) matter. A DRI is not just a name in a tracker. It is the accountability boundary for a decision, an exception, or a workflow that affects customers, money, security, quality, or people.
If an agent takes an action, the question is not “which model did it?” The question is who owns the policy, the approval path, the evaluation, and the correction when it fails.
The same is true for player-coaches. In a flatter AI-native company, the best technical and functional leaders should spend less time moving status around and more time owning craft: setting quality bars, designing evals, reviewing edge cases, shaping workflows, and deciding what good looks like.
That is not middle management by another name. It is human judgment attached to the parts of the system where judgment still matters.
When execution gets cheaper, taste, review, and constraints become the bottleneck. A company with agents but no player-coaches will generate more output. It may also generate more cleanup, more rework, more inconsistent product decisions, and more silent drift.
AI can reduce human routing. It cannot remove accountable decision rights.
If nobody owns the decision, the organization did not become AI-native. It became unauditable.
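The ownership rule can be enforced mechanically: before an agent acts, look up who owns the policy, the approval path, and the evaluation, and refuse the action if nobody does. This is a minimal sketch; ActionPolicy, UnownedActionError, resolve_owner, and the registry contents are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionPolicy:
    action: str
    dri: str          # accountable human for the outcome
    approver: str     # who signs off before execution
    eval_suite: str   # where failures become test cases


class UnownedActionError(Exception):
    """Raised when an agent attempts an action nobody is accountable for."""


def resolve_owner(action: str, policies: dict) -> ActionPolicy:
    """Return the policy for an action, or fail loudly if none is registered."""
    try:
        return policies[action]
    except KeyError:
        raise UnownedActionError(f"no DRI registered for {action!r}") from None


# Usage: refunds have an owner, price changes do not.
registry = {
    "issue_refund": ActionPolicy(
        action="issue_refund",
        dri="support-lead",
        approver="finance-ops",
        eval_suite="refund-evals",
    ),
}
print(resolve_owner("issue_refund", registry).dri)
try:
    resolve_owner("change_price", registry)
except UnownedActionError as err:
    print(err)
```

Failing closed is the design choice that matters here. An action with no registered owner is treated as an error, not as permission.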
The surveillance trap
There is a darker version of the company runtime.
Remote-first companies create artifacts. Meetings are recorded. Decisions live in docs. Code is in GitHub. Tickets are in Linear. Support conversations are in the CRM. Sales calls are transcribed. Slack never really forgets.
That makes the company easier for AI to model.
It also makes the company easier to surveil.
A company runtime needs state. That does not mean every human becomes a service emitting logs.
If the system captures work state without permission boundaries, purpose limits, retention rules, and redaction, it becomes surveillance infrastructure. People will route around it, poison it, or stop writing down the real context. The runtime gets worse because the organization made it unsafe to tell the truth.
Operational memory has to be designed differently from employee monitoring. It should make decisions, dependencies, risks, and recovery paths visible. It should not turn every draft, hesitation, disagreement, or private working note into permanent organizational evidence.
State without permissions is not intelligence. It is surveillance infrastructure.
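The distinction between operational memory and monitoring can be expressed as an admission rule: only certain kinds of work state enter shared memory at all, and each kind expires. The sketch below assumes an allowlist keyed by item kind; the kinds and retention periods are made-up values for illustration, not recommendations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical allowlist: kinds of state that may enter shared memory,
# each with its own retention period. Anything else is never admitted.
RETENTION = {
    "decision": timedelta(days=730),
    "dependency": timedelta(days=180),
    "risk": timedelta(days=180),
}


@dataclass(frozen=True)
class Item:
    kind: str
    text: str
    created_at: datetime


def admit(items, now: datetime):
    """Keep only allowlisted kinds that are still inside their retention window."""
    kept = []
    for item in items:
        ttl = RETENTION.get(item.kind)
        if ttl is None:
            continue  # drafts, private notes, and the like stay out entirely
        if now - item.created_at > ttl:
            continue  # expired: retention is enforced, not just declared
        kept.append(item)
    return kept


# Usage: the decision is kept, the draft and the expired risk are not.
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
items = [
    Item("decision", "Ship the billing migration in Q3", now - timedelta(days=30)),
    Item("draft", "Half-formed idea about pricing", now - timedelta(days=1)),
    Item("risk", "Vendor contract may lapse", now - timedelta(days=400)),
]
print([item.kind for item in admit(items, now)])
```

An allowlist is a deliberate choice over a blocklist. State that has no declared purpose and retention period never becomes organizational evidence in the first place.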
The practical test
The useful question is not whether AI will replace middle managers.
The useful question is what functions your managers are currently performing because your systems do not.
For each team, ask:
- Where do decisions live after the meeting ends?
- Where do blocked tasks go before they become escalations?
- Who can change priority, scope, customer commitments, or policy?
- Which meetings or Slack threads update state that never reaches a system?
- Which human is acting as the routing table?
- Which agents can mutate systems of record?
- What requires approval before execution?
- Who owns the result when an agent is wrong?
- How are failures turned into evals, backlog, or workflow changes?
- What state is visible to whom, and for how long?
- Where do conflicting versions of the truth get resolved?
- What recovery path exists when the automation makes the wrong thing faster?
If those questions do not have explicit answers, the company has not replaced middle management. It has only removed one place where the answers used to collect.
The point is not to defend middle management.
A lot of it was slow, political, redundant, and badly designed. Many companies should remove layers. Many managers should become builders, reviewers, coaches, operators, or owners of actual systems.
But the hidden runtime still has to exist somewhere.
The companies that benefit from AI will not be the ones that flatten the org chart and add agents. They will be the ones that make coordination explicit: state you can trust, permissions you can enforce, owners you can find, failures you can observe, and recovery paths you can execute.
The future is not managerless by default. It is runtime-explicit, or it is lossy.
Ferre Mekelenkamp
Editor, Signal & State. A hands-on technical engineer writing about production AI systems, failure modes, and the architecture between demos and durable product behavior.