Mark: Sarah, can I run something by you. I tried to write a short memo, “The Category Error in AI Disrupts Software,” on the market narrative that generative AI and LLMs will disrupt software. I think I captured the main points, but when I reread it, parts of it do not sit right. Can you review it.
Sarah: Sure. Read it exactly as you wrote it. Let us pinpoint where the discomfort starts.
Mark: All right. The market is not literally saying software disappears. It is saying value and monetization shift from screens and human seats to data, agent execution, and outcomes. So thin apps face price pressure. Meanwhile controls, audits, and permissions become more important. If agents can operate across multiple SaaS tools, users stop opening apps and just ask for work to be done. Seat based pricing weakens, and pricing moves toward usage or outcome. Large platform vendors bundle AI into suites, so standalone vendors run into feature overlap. Development becomes cheaper, barriers drop, and features commoditize. That is the picture.
Sarah: It is a clean memo, but the entry point is tilted. Once the entry point is tilted, everything built on top tilts with it.
Mark: What is tilted.
Sarah: You placed the agent as the execution entity. The moment you framed the shift as data plus execution plus outcomes, you moved execution onto the LLM side. That is the misalignment.
Mark: But when people say agents, they usually mean execution. Autonomous action. Tool calls. APIs. That kind of thing.
Sarah: That is exactly the trap. The word agent has been overloaded. Tool calling is real, and ReAct and Toolformer are good references for how models can interleave reasoning with tool use and learn when to call tools. But that still does not turn them into deterministic execution engines, and it does not mean the model suddenly owns execution integrity. That is why it helps to keep a risk lens in view, like the NIST AI Risk Management Framework and the NIST Generative AI Profile, because real workflows care about accountability, blast radius, and failure modes. The core issue is that LLMs are not decision engines designed for stable execution. They are branching engines designed for exploration. They propose, compare, revise, and improve. That is a different computation mode from deterministic software.
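To make the distinction concrete, here is a minimal sketch of a ReAct-style tool loop. Everything in it is hypothetical: a scripted function stands in for the model, and the tool registry and function names are invented for illustration. The point it shows is the division of labor Sarah describes: the model only proposes actions, while the host deterministically validates and executes them.

```python
# Minimal ReAct-style loop sketch (all names hypothetical).
# A scripted stand-in plays the model: it proposes tool calls,
# the host executes them, and observations are fed back.

def calculator(expression: str) -> str:
    """A deterministic tool owned by the host; the model only proposes calls."""
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def scripted_model(transcript: list[str]) -> dict:
    """Stand-in for an LLM: returns either a tool call or a final answer."""
    if not any(line.startswith("Observation:") for line in transcript):
        return {"action": "calculator", "input": "6 * 7"}
    return {"final": transcript[-1].split(": ", 1)[1]}

def react_loop(question: str, max_steps: int = 5) -> str:
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = scripted_model(transcript)
        if "final" in step:
            return step["final"]
        # The host, not the model, owns execution: it looks up and runs the tool.
        result = TOOLS[step["action"]](step["input"])
        transcript.append(f"Observation: {result}")
    raise RuntimeError("no answer within step budget")

print(react_loop("What is 6 * 7?"))  # -> 42
```

Note that nothing in the loop makes the model's proposals reliable; reliability lives entirely in the host-side tools and validation, which is Sarah's point about execution integrity.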
Mark: Different computation mode sounds abstract. Spell it out.
Sarah: Deterministic software is built to map input A to output B with reproducibility. That is why it can safely own state transitions, consistency, permissions, audit trails, and records. LLMs are not built for that. LLMs are built to take an input, branch into alternatives, test language and structure, and elevate the output through iteration. That is much closer to human thinking and design than to machine execution.
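The deterministic side Sarah describes can be sketched as an explicit state machine that rejects invalid transitions and keeps a durable audit trail. The states, actions, and roles below are illustrative, not taken from any real product.

```python
# Sketch of deterministic execution: an explicit state machine that owns
# transitions, rejects illegal ones, and records an audit trail.
# States and actions are illustrative.

ALLOWED = {
    ("draft", "submit"): "review",
    ("review", "approve"): "approved",
    ("review", "reject"): "draft",
}

class Workflow:
    def __init__(self):
        self.state = "draft"
        self.audit = []  # durable record of every transition

    def apply(self, action: str, actor: str) -> str:
        key = (self.state, action)
        if key not in ALLOWED:
            raise ValueError(f"illegal transition: {action} from {self.state}")
        self.state = ALLOWED[key]
        self.audit.append((actor, action, self.state))
        return self.state

wf = Workflow()
wf.apply("submit", "mark")
wf.apply("approve", "sarah")
print(wf.state)       # -> approved
print(len(wf.audit))  # -> 2
```

The same inputs always produce the same state and the same audit trail, which is exactly the reproducibility property that lets this kind of code safely own records and permissions.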
Mark: So you are saying LLMs are a substitute for human reasoning, not a substitute for software execution. I follow that. But the market is not claiming full replacement. It is claiming partial automation. If some execution moves to agents, do seats not still compress.
Sarah: Seat compression is not impossible. The mistake is jumping from some compression to software disappears. First we separate what is being automated. What is easier to automate is repetitive work whose substance is input, formatting, copying, and routine coordination. What is not safe to automate is the core state transition that carries accountability and integrity. If you mix those two, the conclusion becomes incoherent.
Mark: But most real work sits between those extremes. Take project work or go to market operations. It is not just copying data. And it is highly specific to each company. That is why I cannot believe a generic agent can just walk in and run it.
Sarah: Your intuition is correct. Company specific domains are exactly where LLMs are strongest on the design side, not the execution side. The LLM can help clarify how the process should run, where exceptions live, which data should be treated as authoritative, and how rules should be expressed. It accelerates specification, design, and iteration. Then deterministic software implements the execution.
Mark: So the real disruption is not LLMs replacing the SaaS runtime. It is LLMs accelerating the rebuild of the runtime through faster design and refactoring.
Sarah: Yes. That is the clean framing. LLMs sit where the architect sits, not where the runtime sits. The competitive effect is that change cycles compress. Configuration, workflow design, and system evolution speed up. That changes the time axis of competition. It does not automatically mean incumbents lose. In many cases, platforms with a real execution core can become more valuable because they can absorb faster design and turn it into reliable operation.
Mark: You said more valuable. In what sense. That still feels hand wavy.
Sarah: Let me make it concrete. Two products can look similar on the surface, both with screens and reports. One is mostly interface and convenience. The other owns the reality of the business. It owns integrity, permissions, auditability, and the state machine that the organization relies on. The second type can accumulate the company’s way of doing things inside the system. When design becomes faster, that second type becomes stickier and more expandable. The value density rises because the product is carrying more of the firm’s true operating structure.
Mark: Then what about the claim that UI value fades because users will not open apps. They just ask.
Sarah: The phenomenon exists, but the conclusion is sloppy. The operating UI can shrink. The design and review UI expands. When LLMs propose changes, humans still need to see diffs, check rationale, validate tests, and approve deployment. The interface does not vanish. It shifts from clicking to execute toward reviewing to govern change.
Mark: That also changes the seat argument. Fewer operator seats maybe, but more design, review, and governance seats.
Sarah: Exactly. And companies rarely stop at efficiency. They reinvest efficiency into higher cadence, more experiments, finer segmentation, broader coverage, and new work. So the question is not whether seats go down in a simplistic way. The question is whether the vendor captures the value created by higher capability.
Mark: That leads to pricing. People say the world moves from seats to usage or outcomes. I struggle to believe full migration. Buyers need predictability.
Sarah: Full migration is an overstatement. In practice it mixes. A stable base remains because organizations budget for fixed commitments. Then incremental capability is monetized as add ons, higher tiers, or constrained variable components. The essential point is not to guess the invoice unit. The essential point is whether the vendor can convert capability into price.
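The mixed model Sarah describes is easy to state as arithmetic: a fixed base commitment plus a variable component constrained by a cap, so the buyer keeps predictability while the vendor monetizes incremental capability. The numbers below are made up for illustration.

```python
# Illustrative blended invoice: fixed base plus a capped variable component.
# All figures are invented.

def monthly_invoice(base: float, usage_units: int, unit_price: float, cap: float) -> float:
    variable = min(usage_units * unit_price, cap)  # constrained variable component
    return base + variable

# 12,000 units at 0.10 would be 1,200, but the cap holds it to 1,000.
print(monthly_invoice(base=5000, usage_units=12000, unit_price=0.10, cap=1000))  # -> 6000.0
```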
Mark: And the bundling narrative. People claim big suites will swallow the rest. That feels too generic.
Sarah: It is too generic. The outcome depends on where the proprietary data sits, where the company specific process sits, and who owns the execution core that carries accountability. Large platforms can own entry points. But they cannot instantly generalize the state machine of every organization. Bundling wins in some zones and fails in others. A blanket statement is lazy.
Mark: And the development cost argument. Cheaper development means more competition, therefore commoditization.
Sarah: The more important shift is internal. Customers can build and modify faster. They can express their own process in their own language, turn it into specifications, generate configuration and tests, and iterate. That favors systems that provide strong building blocks for safe execution. It weakens products that only sell surface features.
Mark: Building blocks. What do you mean by that.
Sarah: I mean the primitives you can assemble into a safe state machine. Granular permissions, auditable workflows, reliable consistency rules, exception handling, and durable records. If the product offers shallow templates, it cannot absorb company specific structure. If it offers deep primitives, company specific structure becomes a set of reusable diffs. LLMs then accelerate assembling those diffs.
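The deep primitives plus reusable diffs idea can be sketched as a workflow declared as data, with a company-specific change expressed as a small declarative patch merged into the baseline rather than one-off custom code. All of the states, actions, and roles below are invented for illustration.

```python
# Sketch of "deep primitives + reusable diffs" (all names illustrative):
# the baseline workflow is data, and company-specific structure is a
# declarative patch applied primitive by primitive.

BASELINE = {
    "transitions": {("draft", "submit"): "review",
                    ("review", "approve"): "approved"},
    "permissions": {"submit": {"author"}, "approve": {"manager"}},
}

# Hypothetical company-specific diff: add a legal sign-off path.
LEGAL_REVIEW_DIFF = {
    "transitions": {("review", "escalate"): "legal",
                    ("legal", "approve"): "approved"},
    "permissions": {"escalate": {"manager"}},
}

def apply_diff(config: dict, diff: dict) -> dict:
    """Merge a declarative diff into a baseline, primitive by primitive."""
    merged = {
        "transitions": dict(config["transitions"]),
        "permissions": {k: set(v) for k, v in config["permissions"].items()},
    }
    merged["transitions"].update(diff.get("transitions", {}))
    for action, roles in diff.get("permissions", {}).items():
        merged["permissions"].setdefault(action, set()).update(roles)
    return merged

custom = apply_diff(BASELINE, LEGAL_REVIEW_DIFF)
print(custom["transitions"][("review", "escalate")])  # -> legal
```

Because the diff is data, it can be reviewed, versioned, and reused, which is what lets an LLM accelerate the assembly without owning the execution.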
Mark: So my memo went wrong because it treated the agent as the runtime and then concluded the runtime gets replaced. That forces the whole guardrail story.
Sarah: Yes. That storyline is easy to sell because it is intuitive. It is also convenient for anyone who monetizes governance programs. If you want a concrete map of what breaks in real systems, the OWASP Top 10 for Large Language Model Applications is a useful checklist. But the entry definition is off. LLMs are not execution entities. They are design accelerants. If you start from the wrong definition, you get a chain of conclusions that looks coherent but rests on a misfit premise.
Mark: At the end, where should I look. Can you summarize.
Sarah: All right. Let me lay it out. Focus on where responsibility lives inside the product. First, ask whether the system owns real state, meaning it is the place where integrity, permissions, auditability, and durable records are enforced. If it does, faster design makes it more central, not less. Second, ask whether the product can absorb company specific process as structured building blocks that accumulate over time, not as one off customization. If it can, LLMs increase the velocity of that accumulation. Third, ask whether the vendor can translate that increased capability into price, whether through tiers, add ons, expansion, or other mechanisms. If value rises but monetization does not, equity value will not follow.
Mark: That clarifies it. My memo felt off because my first definition pushed execution onto the LLM side. If I reset the definition, the rest of the picture reorganizes. I should stop starting from slogans and start from who owns execution integrity and who benefits from faster design cycles.
Sarah: Exactly. The more powerful the buzzword, the more it collapses distinctions. Fix the definitions first. Then cut the problem by structure. That is how you avoid getting pulled into the market’s narrative loop.
