MITS Blog | Why MES Integration Fails After Go-Live (And How to Fix It)

Most MES integration issues do not appear during implementation; they show up weeks or months after go-live. As a new plant starts to operate, small changes are constantly made to refine processes and procedures. A new product is introduced. An ERP field changes. A lineside device is replaced. What was stable at implementation now has to adapt, and it can begin to drift. This is not a tooling problem; it is an architecture and ownership problem.

Why Integrations Look Fine at Launch

During implementation, integrations are built against a fixed snapshot of the plant. ERP structure, devices and PLCs, routings, part numbers, and process assumptions are all controlled. Interfaces are tested against expected scenarios, and all data flows appear clean. But the plant is about to change.

Production environments are defined by change. Engineers update routings, IT modifies fields and security policies, control teams replace or reconfigure devices, and quality teams introduce new checks. If the integration architecture does not account for this, failure is delayed, not avoided.

Failure Mode 1: Unclear Data Ownership

One of the most common post-go-live issues occurs when two systems think they own the same data. For example, the ERP updates an order's status to “complete” based on a transaction, but MES still has the unit in rework due to a failed inspection. Both systems are technically correct according to their own logic, but the plant now has two conflicting versions of the truth.

These conflicts often show up as mismatched production reports, WIP that cannot be reconciled, and manual overrides to “fix” data.

In many plants, teams respond by adding more logic or more interfaces. This makes the problem worse. The root issue is that ownership was never clearly defined. Who owns order statuses? Unit completions? Genealogy records and quality dispositions? If the answer is “it depends,” the system will drift over time.

This is why integration failures are often described as “data issues” when they are actually ownership issues.
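One way to make ownership explicit is to record a single system of record for each data element and reject writes from anyone else. The following is a minimal sketch, not a prescribed implementation: the `OWNERS` map, the field names, and the `apply_update` helper are all hypothetical, and the assignments are illustrative only (a real plant would decide them deliberately).

```python
# Hypothetical ownership map: each data element has exactly one system of record.
# The assignments below are illustrative assumptions, not a recommendation.
OWNERS = {
    "order_status": "MES",
    "unit_completion": "MES",
    "genealogy": "MES",
    "planned_quantity": "ERP",
}


class OwnershipError(Exception):
    """Raised when a system tries to write data it does not own."""


def apply_update(record: dict, field: str, value, source: str) -> dict:
    """Return a copy of record with field updated, if source owns that field."""
    owner = OWNERS.get(field)
    if owner is None:
        # "It depends" is not allowed: every field must have a declared owner.
        raise OwnershipError(f"no owner defined for {field!r}")
    if source != owner:
        raise OwnershipError(f"{source} cannot write {field!r}; owner is {owner}")
    updated = dict(record)  # copy so the caller's record is left untouched
    updated[field] = value
    return updated


if __name__ == "__main__":
    unit = {"order_status": "rework"}
    try:
        # ERP tries to mark the order complete while MES still has it in rework.
        apply_update(unit, "order_status", "complete", source="ERP")
    except OwnershipError as exc:
        print("rejected:", exc)
```

With a rule like this, the ERP/MES conflict described earlier surfaces as an explicit rejection at the interface instead of a silent mismatch in reports.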

Failure Mode 2: Point-to-Point Interfaces

Point-to-point integration works well at first. Direct connections are built: ERP to MES, MES to PLCs, MES to the quality system, PLCs back to MES, and so on. Each interface solves a specific need. Over time, the number of connections grows. New devices are added, additional data points are required, and exceptions are handled with custom logic. What starts as a clean setup becomes tightly coupled.

Now consider a real scenario: a controls engineer replaces a torque tool with a newer model. The new device uses a different data format, sends results at a different time, and includes additional parameters. The MES interface must change, but that same data is also used in a quality system, referenced in ERP reporting, and tied to traceability records. A small device change now impacts multiple systems.

This is where integrations begin to fail: data arrives out of sequence, fields no longer map correctly, and downstream systems reject or misinterpret data.

Point-to-point architectures amplify the cost of change. They do not fail immediately; they fail when the plant evolves.
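A common way to contain this kind of change is to translate each device's payload into one canonical record at the edge, so downstream consumers never see device-specific formats. The sketch below illustrates the idea for the torque tool scenario. Both payload layouts, the `TorqueResult` type, and the adapter names are invented for illustration; real tools will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TorqueResult:
    """Canonical record that every downstream consumer reads (hypothetical)."""
    unit_id: str
    torque_nm: float
    passed: bool


def from_old_tool(payload: dict) -> TorqueResult:
    # Assumed legacy format: flat keys, torque sent as a string, pass flag as 0/1.
    return TorqueResult(
        unit_id=payload["sn"],
        torque_nm=float(payload["tq"]),
        passed=payload["ok"] == 1,
    )


def from_new_tool(payload: dict) -> TorqueResult:
    # Assumed new format: nested result with extra parameters we do not need yet.
    result = payload["result"]
    return TorqueResult(
        unit_id=payload["unitId"],
        torque_nm=float(result["torqueNm"]),
        passed=result["status"] == "OK",
    )


# Replacing a device means adding one adapter here; consumers stay unchanged.
ADAPTERS = {"tool_v1": from_old_tool, "tool_v2": from_new_tool}


def normalize(device_type: str, payload: dict) -> TorqueResult:
    """Convert a device-specific payload into the canonical record."""
    return ADAPTERS[device_type](payload)


if __name__ == "__main__":
    old = normalize("tool_v1", {"sn": "U100", "tq": "12.5", "ok": 1})
    new = normalize(
        "tool_v2",
        {"unitId": "U100", "result": {"torqueNm": 12.5, "status": "OK", "angleDeg": 41.0}},
    )
    print(old == new)
```

The quality system, ERP reporting, and traceability records would all consume `TorqueResult`, so the device swap touches one adapter rather than every interface.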

Failure Mode 3: No Change Governance

Even with solid initial design, integrations degrade without governance. Take a common scenario: an ERP upgrade or configuration change. An IT team updates field names, transaction structures, or status logic. The change is valid from an ERP perspective, but MES is still expecting previous field formats, specific status transitions, and known event timing.

Now orders fail to release correctly, confirmations do not post, and production appears stuck or duplicated.

Another example is a new product introduction. Engineering adds new routing steps, different inspection requirements, or alternate material flows. If MES and its integrations are not updated in a controlled way, operators create workarounds, steps are skipped, and traceability becomes inconsistent.

The integration did not “break” in a visible way; it degraded quietly through exceptions and manual fixes. Over time, this erodes trust in the system.
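One piece of change governance that can be automated is a contract check: the fields and status values MES expects from ERP are written down, and incoming messages are compared against them so drift is reported instead of absorbed silently. This is a minimal sketch under assumed names. The contract contents, the status values, and `check_contract` are hypothetical and do not come from any particular ERP or MES product.

```python
# Hypothetical contract for an order message sent from ERP to MES.
EXPECTED_ORDER_CONTRACT = {
    "order_id": str,
    "part_number": str,
    "quantity": int,
    "status": str,
}

# Assumed set of status values that MES knows how to handle.
ALLOWED_STATUSES = {"released", "in_process", "complete"}


def check_contract(message: dict) -> list[str]:
    """Return a list of human-readable contract violations (empty if none)."""
    problems = []
    for field, expected_type in EXPECTED_ORDER_CONTRACT.items():
        if field not in message:
            problems.append(f"missing field: {field}")
        elif not isinstance(message[field], expected_type):
            actual = type(message[field]).__name__
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}, got {actual}"
            )
    # Fields the contract does not know about often signal a rename upstream.
    for field in sorted(message.keys() - EXPECTED_ORDER_CONTRACT.keys()):
        problems.append(f"unexpected field: {field}")
    status = message.get("status")
    if isinstance(status, str) and status not in ALLOWED_STATUSES:
        problems.append(f"unknown status: {status}")
    return problems


if __name__ == "__main__":
    # Simulates an ERP change that renamed "quantity" to "qty" and added a status.
    changed = {"order_id": "1001", "part_number": "P-7", "qty": 5, "status": "closed"}
    for problem in check_contract(changed):
        print(problem)
```

Running a check like this in a test environment before an ERP change is promoted turns a quiet degradation into a visible, reviewable failure.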

Using ISA-95 to Understand the Problem

A useful way to frame these failures is through ISA-95.

At a high level:

Level 4 (ERP) manages business planning and transactions
Level 3 (MES) manages execution and production records
Levels 0–2 (controls and devices) manage physical processes

Most integration issues happen when these boundaries are not respected.

Examples:

ERP trying to control real-time execution states
MES relying on PLC data as the only production record
devices sending data without context of the production unit

ISA-95 is not just a model. It is a way to define responsibility. If each level has clear ownership:

ERP defines what should be built
MES defines how it was built
Controls define what physically happened

Then integration becomes structured. Without this, systems overlap, and conflicts are inevitable.

Why Problems Surface After Go-Live

All three failure modes share a pattern: they are exposed by change.

At go-live:

ownership gaps are hidden
point-to-point connections are manageable
governance is not yet tested

After go-live:

changes accumulate
exceptions increase
manual workarounds appear

Eventually, the system no longer reflects how the plant actually runs.

This is what leads teams to say:

“The integration is fragile.”
“We cannot trust the data.”
“No one wants to touch the interfaces.”

What This Means for Integration Design

If integrations fail after go-live, the problem is not connectivity. It is structure.

In stable environments, integrations are built around clear system responsibilities.

In practice, that looks like:

ERP sends planned data such as orders, schedules, and BOMs
MES owns execution, including order status, routing progression, and genealogy
Machines and PLCs provide process data, but do not define production state
Interfaces are event-driven and monitored, not tightly coupled point-to-point logic

This changes how the system behaves when the plant changes.

When a new product is introduced, MES absorbs routing and workflow changes without breaking ERP interfaces.
When a device is replaced, only the MES interface layer changes, not every downstream system.
When ERP fields or transactions change, they do not redefine what happened on the shop floor.
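To illustrate what "event-driven and monitored" can mean in practice, here is a small in-process sketch: MES publishes an event once, each consumer subscribes independently, and anything undelivered is recorded for monitoring rather than lost. The `EventBus` class, the `unit.completed` event name, and the handlers are hypothetical. A production system would typically use a message broker, which this sketch does not attempt to model.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """Minimal publish/subscribe hub with a dead-letter list for monitoring."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)
        # Each entry is (event_type, payload, reason) for later inspection.
        self.dead_letters: list[tuple[str, dict, str]] = []

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        handlers = self._handlers.get(event_type, [])
        if not handlers:
            self.dead_letters.append((event_type, payload, "no subscriber"))
            return
        for handler in handlers:
            try:
                handler(payload)
            except Exception as exc:
                # One failing consumer must not block the others.
                self.dead_letters.append((event_type, payload, repr(exc)))


if __name__ == "__main__":
    bus = EventBus()
    erp_confirmations = []

    # Hypothetical consumers: ERP confirmation posting and a quality record.
    bus.subscribe("unit.completed", erp_confirmations.append)
    bus.subscribe("unit.completed", lambda p: print("quality record for", p["unit_id"]))

    # MES, as owner of execution state, publishes the event once.
    bus.publish("unit.completed", {"unit_id": "U100", "order_id": "1001"})
    print(len(erp_confirmations), "confirmation(s) queued;", len(bus.dead_letters), "dead letter(s)")
```

The design choice here is that the publisher does not know who consumes the event, so adding or changing a consumer does not require touching MES logic, and the dead-letter list gives operations a place to see failures.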

The goal is not to prevent change. It is to make sure change does not break the system.

See how MITS structures integrations and defines system boundaries here