Canonicalization Beats Workflow

The pattern

Twenty years inside US healthcare clearinghouses and B2B payments produced one observation, repeated on surface after surface. The data lands in incompatible formats. The cost of resolving it scales linearly with headcount. The vendor category that wins the spend is the one that treats the schema as the product, not the workflow.

The pattern is so consistent across categories that it took me a decade to stop seeing it as three separate problems. Healthcare claim adjudication. EDI clearinghouse modernization. B2B remittance reconciliation. Different verticals. Different buyer titles. Different invoice line items. Same insight underneath.

If you operate in any of those three lanes, you have probably bought the workflow version. You have probably been disappointed by it. The reason is not the vendor. The reason is the category itself.

What the workflow tax actually buys

Watch a mid-market AR team close a single payment. The wire lands Monday. The 835 follows an hour later. The paper stub shows up two weeks after that. Three formats, one dollar, zero clean handoffs to the ERP.

The standard fix is to buy software that helps humans read the three formats faster. Better OCR. Better dashboards. Better exception queues. Cost per line drops. Headcount does not. The category keeps charging $11 a line, every line, forever.

That is the workflow tax. Teams pay it because three incompatible inputs still have to land in one canonical ERP schema, and nobody has built the schema. Translation is the actual work. Workflow software is what you buy when no one has solved translation.

The pattern

Three categories. Same shape. Many inputs collapse onto a canonical schema. The schema is the product. The workflow is the consequence.

Adjudication: EHR notes, payer policies, CPT/HCPCS -> Structured claim

EDI clearinghouse: X12, EDIFACT, HL7 -> Canonical schema

B2B remittance: 835 / EOB, wire memo, paper check -> Cash-app entry

The workflow vendor sells software that helps a human read the inputs faster. The schema-first vendor builds the canonical representation and lets the downstream automation fall out of it. Different category bet entirely.

Three categories, one canonicalization pattern.

Apex Adjudication: schema is the product

Healthcare claim adjudication looks, on the surface, like an AI problem. Coders parse dense, unstructured EHR notes against hundreds of opaque payer policies. Payers deny claims for reasons that read as judgment calls. The resulting denial-rework spend crossed $19 billion in 2022 alone. The natural reaction is to reach for a foundation model. Make the AI smarter than the coder.

That is the wrong instinct. Every denial encodes a small disagreement between a clinical fact and a payer rule. The cost of resolving that disagreement is dominated by humans translating unstructured notes into the structured form that payer logic actually needs. Solve the translation deterministically and the rest of the workflow collapses. The agent does not need to be clever. It needs to be precise about the schema.

The first venture in the trilogy, Apex Adjudication, is the schema-first denial-resolution engine. It publishes a structured representation of an encounter, with provenance for every field, that maps cleanly to every major payer's policy logic. The schema becomes the API. The downstream automation is just the consequence.
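To make "a structured representation of an encounter, with provenance for every field" concrete, here is a minimal sketch in Python. It is not Apex's actual schema. The class names, field names, source labels, and the payer rule are all hypothetical, chosen only to show the shape: every value carries its origin, and a payer rule reduces to a check against named fields.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourcedValue:
    """One schema field: the value plus where it came from."""
    value: str
    source: str   # hypothetical source label, e.g. "ehr_note"
    locator: str  # hypothetical pointer into that source


@dataclass
class Encounter:
    """Illustrative canonical encounter; the field names are assumptions."""
    fields: dict[str, SourcedValue] = field(default_factory=dict)

    def record(self, name: str, value: str, source: str, locator: str) -> None:
        self.fields[name] = SourcedValue(value, source, locator)

    def missing(self, required: list[str]) -> list[str]:
        """Return required field names the encounter does not carry."""
        return [name for name in required if name not in self.fields]


# A made-up payer rule, reduced to the fields it needs to evaluate.
HYPOTHETICAL_RULE = ["procedure_code", "diagnosis_code", "prior_auth_number"]

encounter = Encounter()
encounter.record("procedure_code", "99213", "ehr_note", "note 1, line 14")
encounter.record("diagnosis_code", "E11.9", "ehr_note", "note 1, line 9")

gaps = encounter.missing(HYPOTHETICAL_RULE)
print(gaps)  # ['prior_auth_number']
```

Once the encounter is structured this way, the disagreement between clinical fact and payer rule shows up as a named gap rather than a judgment call, and the locator says exactly which line of which note to check.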

Clearing AI: translation as infrastructure

Most enterprise transactions still move on EDI. Healthcare alone runs nine billion claims a year through it. Every modernization project, every cloud-native ERP rollout, runs into a translation layer staffed by retirement-age specialists charging $300 an hour. The talent pool is retiring on a five-year clock and not being replaced.

The instinct, again, is workflow. Buy iPaaS connectors. Hire integration consultants. Treat each new partner as a bespoke project that takes a quarter to land. That instinct produces a $4.7 billion market in healthcare EDI software alone, which is exactly the size of the annual tax for not having canonicalized.

The reason EDI projects run long is not technical complexity. It is that every implementation is a bespoke negotiation between two enterprises about field semantics that should already be canonical. The same loop, every time. The second venture in the trilogy, Clearing AI, treats canonicalization as the product. A structured-data layer that ingests any EDI dialect, emits any modern API shape, and tracks lineage at the field level. The translation becomes infrastructure. The project becomes a configuration.
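A toy illustration of "the project becomes a configuration", assuming a canonical record is just named fields with lineage attached. This is a sketch, not a real X12 parser and not Clearing AI's implementation. It assumes `*` as the element separator and `~` as the segment terminator, ignores envelopes and loops, and the `MAPPING` table and canonical field names are hypothetical.

```python
# Hypothetical configuration: canonical field -> (segment id, element position).
# Onboarding a new partner would mean editing this table, not writing new code.
MAPPING = {
    "claim_id": ("CLM", 1),
    "total_charge": ("CLM", 2),
}


def parse_segments(raw: str, element_sep: str = "*", segment_term: str = "~"):
    """Split a raw interchange into segments, each a list of elements."""
    segments = []
    for chunk in raw.split(segment_term):
        chunk = chunk.strip()
        if chunk:
            segments.append(chunk.split(element_sep))
    return segments


def canonicalize(raw: str) -> dict:
    """Emit canonical fields, each tagged with the element it came from."""
    by_id = {}
    for seg in parse_segments(raw):
        by_id.setdefault(seg[0], seg)  # simplification: first occurrence only

    record = {}
    for name, (seg_id, pos) in MAPPING.items():
        seg = by_id.get(seg_id)
        if seg is None or pos >= len(seg) or seg[pos] == "":
            continue  # leave the field absent rather than guess
        record[name] = {"value": seg[pos], "lineage": f"{seg_id}{pos:02d}"}
    return record


record = canonicalize("CLM*26463774*100***11:B:1~")
print(record["total_charge"])  # {'value': '100', 'lineage': 'CLM02'}
```

The field-level lineage is what makes the output auditable: any downstream consumer can trace `total_charge` back to element `CLM02` without reopening the negotiation about what the field means.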

Concord Remit: matching is the consequence, not the product

B2B remittance is the third instance of the same pattern. The bottleneck in B2B is not the payment. It is matching the remittance to the invoice across 800-line check stubs, mixed-format ACH addenda, and PDF EOBs. AR teams spend more time matching these to invoices than the money itself took to move. Forty percent of US B2B remittance still arrives by paper.

Established vendors in this category sell rules-based matching with human exception queues. The rules cover the easy cases. The exception queues run at $11 a line. The tax compounds.

Concord Remit, the third venture, takes the same posture. Canonicalize any remittance source into a structured representation. Match against AR with confidence-scored output and provenance. The product is the canonicalization, not the matching engine. OCR is a commodity input, the same way payer policies are a commodity input for Apex and EDI dialects are a commodity input for Clearing.
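A minimal sketch of why matching is the consequence once the remittance is canonical. It is an illustration under stated assumptions, not Concord Remit's engine. The `RemitLine` and `Invoice` shapes, the source labels, and the 60/40 scoring weights are all hypothetical, and a real system would need far richer signals.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class RemitLine:
    """Canonical remittance line, whatever format it arrived in."""
    invoice_ref: str
    amount: Decimal
    source: str  # provenance label, e.g. "paper_check_ocr" (hypothetical)


@dataclass(frozen=True)
class Invoice:
    number: str
    open_amount: Decimal


def normalize_ref(ref: str) -> str:
    """Uppercase and drop punctuation so 'inv 1001' equals 'INV-1001'."""
    return "".join(ch for ch in ref.upper() if ch.isalnum())


def match(line: RemitLine, invoices: list[Invoice]) -> dict:
    """Return the best candidate invoice with a confidence and provenance."""
    best_number, best_points = None, 0
    for inv in invoices:
        points = 0
        if normalize_ref(line.invoice_ref) == normalize_ref(inv.number):
            points += 60  # arbitrary illustrative weight for a reference match
        if line.amount == inv.open_amount:
            points += 40  # arbitrary illustrative weight for an amount match
        if points > best_points:
            best_number, best_points = inv.number, points
    return {
        "invoice": best_number,
        "confidence": best_points / 100,
        "provenance": line.source,
    }


open_ar = [
    Invoice("INV-1001", Decimal("250.00")),
    Invoice("INV-1002", Decimal("75.50")),
]
clean = match(RemitLine("inv 1001", Decimal("250.00"), "paper_check_ocr"), open_ar)
short_pay = match(RemitLine("INV-1002", Decimal("70.00"), "wire_memo"), open_ar)
print(clean)
print(short_pay)
```

The matcher never sees a check stub or a wire memo. It sees one shape, so the same few lines serve every source, and a low-confidence result arrives with the provenance a reviewer needs.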

Two category bets

Workflow software

Routes the mess around

What you buy: Dashboards, exception queues, OCR that helps humans read faster
Where the mess goes: Routed around. Three formats stay three formats, forever
Cost curve: $11 a line, every line, scaling with volume
End state: The tax compounds and the headcount stays

Schema layer

Makes the mess impossible

What you buy: A canonical representation with provenance on every field
Where the mess goes: Collapsed at the boundary. One shape exists downstream
Cost curve: Translation solved once. New inputs become configuration
End state: The schema becomes the API. Automation falls out of it

Both vendors see the same three incompatible inputs. One sells a faster way to live with them. The other removes the reason they exist.

The matrix behind the essay: same mess, two opposite products.

Why most operators miss this

If the pattern is so consistent, why does every category in turn end up with a workflow vendor at the top of the spend pyramid?

Two reasons. The first is that the workflow vendor sells to the team that has the budget. AR teams buy cash-application software. RCM teams buy denial-management software. Integration teams buy iPaaS. The buyer wants a tool that helps humans go faster, not a primitive that makes their team smaller. The schema-first approach creates a near-term political problem the workflow vendor does not.

The second reason is harder. Building the schema is much more expensive than building the dashboard. It requires standing inside the data layer of a vertical for years. Most vendor founders have not done that. They saw the workflow pain from the consultant seat or the buyer seat, and built what they could see. The category waits for an operator who lived in the data layer and is willing to write down the schema as a product.

That is the constraint. It is also the opportunity.

What this implies

The trilogy of ventures I publish on this site, Apex Adjudication, Clearing AI, and Concord Remit, is the same insight applied three times. I am not arguing that those three companies are the only ones that should be built. I am arguing that the insight is generalizable and that whoever notices it has years of arbitrage in front of them.

If you are running an AR team, an RCM operation, or an EDI integration program, the question to ask is not which workflow vendor to buy. The question is whether the canonicalized schema for your category exists yet, and if not, who will build it.

If you are an operator with the data-layer experience to build it, the second question is whether you want the workflow vendor to keep collecting the tax for another decade. If the answer is no, the door on this site is open.
