The Agentic EDI Autonomy Scale: Defining The EDI Industry's Next Battlefield

There’s a battle taking place right now between EDI companies over the future of the industry in the context of artificial intelligence.

At one level, the conflict is over terminology. “Agentic EDI” is the leading term, popularized chiefly by OpenText; competing terms include “AI-native EDI” (Orderful, Crstl, Tediware) and “agentic RCM” (Stedi, in the context of healthcare EDI). Companies aren’t just fighting over AI-in-EDI terminology; they’re also clashing over who got there first, with two startups, Crstl and Tangentia, both claiming to have the “world’s first” AI/EDI agent.

In the broader context, however, the naming fight is just a skirmish. The real conflict is much more fundamental. Which companies have earned the right to advertise agentic EDI? (Note: I will use agentic EDI, OpenText’s term, for the remainder of this article.) Which companies have bolted on a chatbot and called it a day - what Gartner refers to as “agent-washing” - versus the ones who have built or rebuilt their stacks to take full advantage of AI?

Most importantly, which companies have delivered an experience that fundamentally changes the day-to-day work of EDI integration, maintenance and monitoring, in the same way that AI has fundamentally changed software engineering?

Summary

In this article, I:

Describe a framework - Context, Agency and Trust (CAT) - to evaluate the degree to which an AI system qualifies as agentic EDI.
Define the limited domain, the set of features that is common to all EDI platforms.
Propose the EDI Autonomy Scale, an EDI analogue of the SAE autonomy levels for vehicles, for grading autonomy within the limited domain.
Briefly discuss the impact this progression in autonomous EDI capabilities will have on EDI professionals.
Predict some of the changes this will cause in the EDI industry and how incumbents are likely to fare.

Context, Agency and Trust (CAT)

An AI system must clear the following bars to be justifiably dubbed “agentic EDI”. (Note: I will use AI system or just system to refer to AI-enabled EDI tech stacks, which may comprise multiple agents, agentic processes, UIs, TUIs, etc. I will use AI agent or agent to refer to solitary agents like chatbots.)

It has context: can the AI system see your configuration - its connections, partners, transactions, logs, mappings and implementation guides - or is it answering in the abstract?

Consider an AI agent that is used to diagnose errors with outbound EDI processing. You load up the page that shows the diagnostic error, open the agent, and ask it, “what happened?”

An agent with context is able to answer that question: not only can it see the error, but it can also reach into the EDI platform via tool calls and analyze what happened, from the moment a request was made to the platform to send EDI to the moment the error occurred.

An AI agent without context has no idea what you are talking about and will ask you to provide the context that the agent in the first example already has: the error message, logs, etc.
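The difference between the two agents can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LOGS` list stands in for a platform's log store, and `get_timeline` and `what_happened` are hypothetical names for a tool and an agent entry point, not any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    timestamp: str  # ISO 8601, so sorting the strings sorts by time
    request_id: str
    message: str


# Hypothetical platform logs; a real system would query its own store.
LOGS = [
    LogEvent("2024-01-15T10:00:02Z", "req-1", "X12 810 generated"),
    LogEvent("2024-01-15T10:00:00Z", "req-1", "send request received"),
    LogEvent("2024-01-15T10:00:01Z", "req-2", "send request received"),
    LogEvent("2024-01-15T10:00:05Z", "req-1", "SFTP upload failed: authentication rejected"),
]


def get_timeline(request_id: str) -> list[LogEvent]:
    """Tool call: every event for one request, oldest first."""
    events = [e for e in LOGS if e.request_id == request_id]
    return sorted(events, key=lambda e: e.timestamp)


def what_happened(request_id: str, has_context: bool) -> str:
    """Answer the user's question with or without access to platform state."""
    if not has_context:
        # Without tools, the agent can only ask the user for what it cannot see.
        return "Please paste the error message and the relevant logs."
    timeline = get_timeline(request_id)
    if not timeline:
        return f"No events found for {request_id}."
    return f"{request_id}: " + " -> ".join(e.message for e in timeline)
```

Calling `what_happened("req-1", has_context=True)` reconstructs the path from request to error, while the same call with `has_context=False` hands the work back to the user.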

It has agency: can the AI system do things, or only say things? A system with agency creates and modifies state - it writes the mapping, creates the partner, updates the connection. A system without it can only describe what should be done and leaves the work to you.

Many companies offer a “customer service agent”. This is frequently a misnomer. The definition of agent is “one that acts or exerts power”, “something that produces or is capable of producing an effect”. Most customer service “agents” are incapable of doing anything other than answering basic questions (they are capable of producing the effect of frustration, but I’ll chalk that up as unintended).

In the context of EDI, we can test the degree of agency with questions like:

Can the AI system fix (not just diagnose) the error?
Can the AI system create (not just advise on creating) the trading partner?
Can the AI system create the mapping, update the connection, etc.?

Agency is close to binary: a system either acts on state or it only advises. Autonomy asks the next question: if you remove the human, how far can it get? That’s the dimension the scale climbs.

It is trustworthy: is the AI system’s work verifiably correct, i.e. can we trust it?

The most challenging verifiability hurdle is the ability to consistently generate valid X12 EDI that conforms to a trading partner’s specifications, while making sense semantically and from a business perspective.

Suppose we opened a web browser, started a conversation with a chatbot, uploaded a trading partner’s PDF implementation guide for an 810 invoice (outbound to them), and pasted in the data we wanted to appear in that invoice.

If we asked it, “generate a valid X12 EDI document that contains this invoice data and conforms to the trading partner’s guide”, could it do it?

It would certainly go ahead and write data that is plausibly X12. The problem is that we have no way of verifying its validity, let alone that it conforms with the trading partner’s guide. We’d need to verify it via human inspection or route the output into something that can verify it automatically.

Only the latter approach - automatic verification - qualifies as verifiability in the agentic EDI context. Semantic correctness (i.e. whether the right values are in the right places) is essential for an AI system to be fully autonomous, but it’s a much harder problem.
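As a rough illustration of what automatic verification can look like, here is a minimal Python sketch of a structural check on a single X12 transaction set. It assumes `*` as the element separator and `~` as the segment terminator (real interchanges declare their delimiters in the ISA segment), and it checks only envelope consistency: the ST/SE pairing, the SE01 segment count and the ST02/SE02 control number match. The function name and the sample invoice are hypothetical, and conformance with a partner's guide, let alone semantic correctness, would require far more than this.

```python
def verify_transaction_set(x12: str, expected_type: str) -> list[str]:
    """Return structural problems found; an empty list means the checks passed."""
    # Assumption: '~' terminates segments and '*' separates elements.
    segments = [s.strip() for s in x12.split("~") if s.strip()]
    if len(segments) < 2:
        return ["expected at least an ST and an SE segment"]

    st = segments[0].split("*")
    se = segments[-1].split("*")
    errors = []

    st_ok = st[0] == "ST" and len(st) >= 3
    if not st_ok:
        errors.append("first segment is not a well-formed ST")
    elif st[1] != expected_type:
        errors.append(f"expected transaction set {expected_type}, found {st[1]}")

    if se[0] != "SE" or len(se) < 3:
        errors.append("last segment is not a well-formed SE")
    else:
        # SE01 counts every segment in the set, including ST and SE.
        if not se[1].isdigit() or int(se[1]) != len(segments):
            errors.append(f"SE01 says {se[1]} segments, counted {len(segments)}")
        # SE02 must repeat the control number given in ST02.
        if st_ok and se[2] != st[2]:
            errors.append("SE02 does not match ST02")
    return errors


# Hypothetical 810 invoice body: 10 units at 12.50 each, total 125.00.
GOOD = "ST*810*0001~BIG*20240115*INV1001~IT1*1*10*EA*12.5**VP*WIDGET~TDS*12500~CTT*1~SE*6*0001~"
BAD = "ST*810*0001~BIG*20240115*INV1001~IT1*1*10*EA*12.5**VP*WIDGET~TDS*12500~CTT*1~SE*5*0002~"
```

Here `verify_transaction_set(GOOD, "810")` returns an empty list, while `BAD` is flagged twice: once for the segment count and once for the control number. An AI system's output that fails a check like this can be rejected without a human ever reading it.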

The CAT Diagram

We can use a radar chart to visualize the degree of an AI system’s context, agency and trust. Here’s an example that shows a common type of AI system which has plenty of context and trust, but no ability to do anything: what you might call an intelligent layabout. The aforementioned customer service “agent” is a classic example.

CAT radar chart of an intelligent layabout: high context and trust, no agency

A Limited Domain For Judging EDI Platforms

Every EDI platform has a different set of capabilities. If a narrowly focused platform features agentic EDI across all of its capabilities and a much broader platform features agentic EDI that is limited to a subset of its capabilities, how do we place them fairly on the scale?

Every EDI platform shares a core: trading partners, the documents exchanged between them, the mappings that translate between each partner’s shape, the transport that moves the documents, plus all the quirks that are specific to X12 like envelopes, acknowledgments and control numbers.

That common feature set is the limited domain. It’s what makes something an EDI platform and it’s the common ground on which we can fairly judge any of them.

I’m adding one more constraint to this limited domain: the AI system is working for one trading partner, i.e. on just one side of a commercial relationship between companies. This is the typical scenario of a human EDI analyst or integrator as well. This is important to note because as we shall see, the dynamic of one company talking to another places obstacles in the way of fully autonomous integration.

The EDI Autonomy Scale

Level 0: Manual

There’s no AI at Level 0: this is how EDI has been done for decades. A human reads the trading partner guides, hand-builds mappings, configures transport, runs tests and deals with every exception. Labour and judgment are both human. Integrations take weeks to months and require expensive specialists.

Level 1: Context-unaware advisor with no agency

CAT radar chart for Level 1: a context-unaware advisor with no agency

At Level 1, the AI system can answer questions but lacks the context of your configuration. It’s most likely grounded in platform documentation and has some degree of X12 fluency (note that all the frontier models are well-versed in X12 EDI, so even without grounding, they are capable in the domain), but without any access to your trading partners, logs, errors, etc., it’s blind to your situation and useful mainly as a reference.

Without context, it cannot act, so it fails to meet the bar of agency. Depending on its design, however, its output may be verifiable - but its limited scope severely constrains the usefulness of its output.

Level 2: Context-aware lazy advisor

CAT radar chart for Level 2: a context-aware lazy advisor

At Level 2, the AI system has context - it can see the state of the system - but no agency. Companies at this level often use words like “advise” and “guide” to describe their products.

An example of a Level 2 system is TrueCommerce’s “Truedi”, which is branded as a “support assistant” that delivers “personalized guidance”. Although TrueCommerce says that Truedi is “powered by agentic AI”, this system cannot act: it can “talk the talk” (generating text) but cannot “walk the walk” (acting on the user’s behalf).

Level 3: Context-aware, trustworthy, acting under human approval

CAT radar chart for Level 3: context-aware, trustworthy, acting under human approval

The AI system has context: it can see the state of the system. The system has agency: using its context, it can generate partner configurations, context-specific answers, functional mappings and error diagnoses. Its output can be deterministically verified using schemas and tests.

But its autonomy is limited: a human operator still approves the work and may also be responsible for some amount of orchestration (for example, AI drafts the error diagnosis and the human copy-pastes the diagnosis into the AI mapping agent for refinement).

This level spans a wide capability range, from well-grounded answers to drafting complex mapping transformations, unified by a single principle: AI proposes, human disposes.

Tediware’s AI-generated mappings are an example of a Level 3 system in production today. Orderful has a similar Level 3 system for mappings, as well as for trading partner guide ingestion, with AI that “reads [a trading partner’s] PDF specification and produces a compliant draft”.

Note the use of the word “draft”, implying that there’s a human-in-the-loop that approves the final output. Once again, AI proposes, human disposes.
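The “AI proposes, human disposes” principle can be sketched as a simple gate in Python. Everything here is hypothetical: `Proposal`, `Platform` and `apply_with_approval` are illustrative names, and the `verify` and `approve` callables stand in for whatever deterministic checks and human review step a real platform would provide.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Proposal:
    description: str
    change: dict  # e.g. a drafted mapping from X12 element to internal field


@dataclass
class Platform:
    mappings: dict = field(default_factory=dict)


def apply_with_approval(
    platform: Platform,
    proposal: Proposal,
    verify: Callable[[dict], list],
    approve: Callable[[Proposal], bool],
) -> str:
    """Apply an AI-drafted change only if it verifies and a human approves it."""
    # Deterministic checks run first, so the human only reviews verified drafts.
    if verify(proposal.change):
        return "rejected: failed verification"
    # The human disposes: nothing touches platform state without approval.
    if not approve(proposal):
        return "rejected: human declined"
    platform.mappings.update(proposal.change)
    return "applied"
```

For example, a draft that maps `BEG03` (the purchase order number on an 850) to a hypothetical `order.po_number` field changes the platform only when `approve` returns `True`. Under this framing, moving from Level 3 to Level 4 amounts to replacing the `approve` step with policy the system can evaluate on its own.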

Level 4: Context-aware, trustworthy and acting autonomously within the limited domain

CAT radar chart for Level 4: context-aware, trustworthy, acting autonomously within the limited domain

At Level 4, the AI system has sufficient context and tooling to autonomously accomplish all the tasks within the limited domain. It only escalates to a human when a task is genuinely ambiguous or off-platform.

When you onboard a new partner, you typically receive a set of implementation guides (specs) from them, along with operational guidelines and access to a testing environment (e.g. a test SFTP mailbox). At Level 4, the system is capable of processing this initial input and building end-to-end data processing flows that receive inbound EDI data, store it in a system of record, and respond with outbound EDI data that verifiably complies with the guides.

In other words, the system can take a trading partner relationship from not-connected to connected-and-operating largely on its own. The human sets direction and approves outcomes rather than performing the steps. This is the bar for real agentic EDI, and it’s the highest level that is achievable today. To my knowledge, no company has attained this level.

Level 5: Fully autonomous

CAT radar chart for Level 5: fully autonomous

This is the asymptote of the scale, where EDI happens without any human involvement at all, including the parts that happen off-platform and feature human gatekeepers, like certifications and credential exchanges. This is “machines talking to machines”, and it’s a future that may be out of reach for a long time: not because this is technically impossible, but because the other side of the relationship is an adversarial or human-gated process.

Companies that currently have a large customer base (especially if that base is dissatisfied) may be especially unmotivated to adopt technologies that empower nimbler entrants to transact with their customers, because the more they can bog down their younger competitors, the less competition they face. Of course, this also has the effect of making their customers increasingly motivated to consider alternatives.

Bad Combinations

This hierarchy does not describe all possible combinations of context, agency and trust; rather, it describes the set of responsible combinations.

It is possible to develop an AI system with a high degree of agency and a low degree of trust and verifiability - what you might call a highly empowered rogue agent - but this would be bad. You could also develop an AI system with a high degree of agency and a low degree of context - what you might call a highly empowered ignoramus - which would also lead to unfortunate outcomes.

CAT radar charts of two bad combinations: the highly empowered rogue agent and the highly empowered ignoramus

Despite the obvious risks of both of these combinations, I do expect some companies will create AI systems with precisely these characteristics. Caveat emptor.
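To make the scale and its bad combinations concrete, here is one possible encoding in Python. It is a sketch of my reading of the levels above, not a formal definition: the `CatProfile` and `Autonomy` names are hypothetical, each CAT dimension is flattened to a boolean for simplicity, and a system that lacks both context and trust is reported under the first missing property checked.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Autonomy(Enum):
    APPROVAL = "a human approves each change"
    LIMITED_DOMAIN = "autonomous within the limited domain"
    FULL = "autonomous, including off-platform steps"


@dataclass(frozen=True)
class CatProfile:
    context: bool
    agency: bool
    trust: bool
    autonomy: Autonomy = Autonomy.APPROVAL  # only meaningful when agency is True


def classify(profile: Optional[CatProfile]) -> str:
    """Place a system on the scale, or flag an irresponsible combination."""
    if profile is None:  # no AI system at all
        return "Level 0"
    if not profile.agency:
        # Advisors: trust may or may not be present, context decides the level.
        return "Level 2" if profile.context else "Level 1"
    if not profile.context:
        return "bad combination: highly empowered ignoramus"
    if not profile.trust:
        return "bad combination: highly empowered rogue agent"
    return {
        Autonomy.APPROVAL: "Level 3",
        Autonomy.LIMITED_DOMAIN: "Level 4",
        Autonomy.FULL: "Level 5",
    }[profile.autonomy]
```

A system with context and agency but no trust, `CatProfile(True, True, False)`, is classified as a rogue agent regardless of how much autonomy it is given, which is the point of this section: agency without the other two properties does not earn a place on the scale.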

Impact Of The Scale On The EDI Professional

Every increase along the scale removes toil and elevates human judgment. The optimistic prospect is EDI professionals who are freed from the drudgery of hand-crafted mappings and able to focus on trading partner relationships, improved business processes and practices, and higher-level business impact, all leading to improved productivity and growth.

The reality is more nuanced. Some EDI professionals are undoubtedly exposed to the risk of displacement as companies move along this scale.

I expect the impact to be similar to what has happened to software engineers. Like software, EDI implementations are verifiable, which makes them suitable for automation in a way that, say, being a fourth-grade teacher is not. But also like software, EDI exists to serve the desired outcomes and relationships of human beings.

In my own case as a developer, I have experienced this period of technological change as a profound mixture of fear and excitement: fear about what the future means for the craft I spent two-and-a-half decades refining, and excitement about the creative possibilities that these new tools have unlocked.

Impact Of The Scale On The EDI Industry

As I stated earlier, I expect reaching Level 5 to take a long time. What is within reach with today’s technology is Level 4 (arguably, no company has achieved Level 4 yet - if you disagree, I’d like to hear from you). Tediware is working towards this level and is certainly not alone in that goal. What are the implications for the EDI industry as EDI platforms reach this level?

As Yogi Berra said, “it’s tough to make predictions, especially about the future”. Nonetheless, here are mine.

1. Large incumbents will struggle to deliver cohesive Level 4 capabilities.

I was recently asked, “What do you think will become the core differentiating factor as more EDI platforms bake AI into their offerings?” My answer to this question is simple: I don’t think the large incumbents have the DNA to deliver compelling, holistic AI experiences.

A useful case in point is their failure in another key area, namely UI/UX design. Good design has always been important, but it was Steve Jobs who did more than anyone else to popularize its importance, and it was the iPhone which made mobile compatibility an essential part of modern web application design.

Despite that, the UI/UX of legacy incumbents is terrible, with mobile-incompatible interfaces that would not look out of place in the mid-90s. It’s been almost 20 years since the launch of the iPhone. Will companies which still haven’t managed to deliver compelling UX deliver compelling AI?

Arguably, UX is the easy part: it’s simpler to rewrite the front-end of a web app than it is to rebuild fundamental parts of a legacy tech stack while processing millions of transactions a day.

It’s also genuinely difficult to build the context-and-agency plumbing that AI systems need to reach Level 4.

The challenge goes well beyond technology. The culture, personnel and established processes at incumbent companies make it extremely difficult to pivot. These companies need to shift away from the assumptions that made them successful in the first place. The past few decades are littered with the bones of tech companies that were once on the leading edge and are now footnotes in Harvard Business Review case studies.

Some companies have hostages, not customers. Nowhere is this more true than the world of EDI, where the complex nature of business process integrations creates intricately intertwined systems that are profoundly sticky. Modern platforms which use JSON as their canonical data layer, provide excellent APIs and yes, good UX, have been around for at least a decade, but the large incumbents are still the large incumbents.

Nonetheless, cracks are appearing in the foundation. SPS Commerce added just 550 net new customers in all of 2024 - a 1.2% growth rate, down from 13% only two years prior. The floor is holding, but the ceiling is gone: incumbents are keeping the customers they have largely because switching is painful, not because they’re winning. The moat is real, but it’s not getting any wider.

Meanwhile, there is a material possibility of sudden, market-shifting disruptive events, especially cyberattacks. It was the ransomware attack on Change Healthcare that changed everything for Stedi, powering their pivot to a healthcare clearinghouse. The threat landscape is growing more challenging by the day. Sprawling, legacy tech stacks are especially vulnerable.

3. The fight between agentic EDI companies will be messy, with the possibility of race-to-the-bottom dynamics that pressure margins.

Orderful has raised more than $40MM USD. I’m sure a significant amount of that has gone towards marketing and sales, and the value of the distribution that money has purchased should not be underestimated. But I’m also sure huge sums have been spent on product development - development that fast followers can now do much more quickly and cheaply than before.

The problem for those fast followers is that there may be still more fast followers. If feature sets converge and customers have many more options, the market becomes commoditized, competition on price emerges as a key factor, and margins shrink.

A world where large incumbents jealously protect their market share by gate-keeping integrations while being attacked by a horde of upstart agentic EDI competitors is going to be messy. It’s also not going to be boring!

This essay was reviewed by Monish Gandhi, Founder & CEO of Gradient Ascent. For feedback or corrections, email me at adrian@tediware.com