AI in Accounts Payable: What Actually Works (And What Doesn't)

Accounts payable is the first thing most AI vendors demo to finance teams. The pitch is clean: invoices come in, the AI reads them, codes them, matches them to purchase orders, routes them for approval, and posts them. The demo is impressive. The ROI numbers on the slide are real in principle. The question is whether they are real in your environment.

I have seen enough AP automation projects to know where they deliver and where they disappoint. The gap between demo and production is not a technology problem. It is a data and process problem that the technology exposes rather than creates.

What AI does well in accounts payable

There are specific AP functions where AI-driven automation is reliable and delivers measurable throughput improvement. These are not marginal gains. When the underlying conditions are right, the performance is strong.

Three-way matching on structured invoices from known suppliers. When you have a purchase order, a goods receipt, and an invoice that all reference the same PO number with consistent formatting from a supplier your system already knows, automated matching works at very high accuracy. This is a deterministic task at its core: compare the three documents, confirm that quantities and prices reconcile within your defined tolerance, post the result. AI removes the manual handling. Matching accuracy in mature deployments typically runs above 95% on this subset of volume.
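To make the "deterministic at its core" point concrete, here is a minimal sketch of a three-way match in Python. The MatchLine shape, the reason codes, and the 2% price tolerance are illustrative assumptions, not any vendor's implementation; a real system would match at line level across many lines and handle partial receipts.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class MatchLine:
    """One line on a PO, goods receipt, or invoice (hypothetical shape)."""
    po_number: str
    quantity: Decimal
    # Goods receipts usually carry no price, so it defaults to zero and is
    # ignored for the receipt.
    unit_price: Decimal = Decimal("0")


def three_way_match(po, receipt, invoice, price_tolerance=Decimal("0.02")):
    """Return (matched, reasons). Tolerance is a fraction of the PO unit price."""
    reasons = []
    # All three documents must reference the same purchase order.
    if not (po.po_number == receipt.po_number == invoice.po_number):
        reasons.append("po_number_mismatch")
    # Do not pay for more than was actually received.
    if invoice.quantity > receipt.quantity:
        reasons.append("invoiced_quantity_exceeds_received")
    # The invoice unit price must sit within tolerance of the PO unit price.
    allowed = po.unit_price * price_tolerance
    if abs(invoice.unit_price - po.unit_price) > allowed:
        reasons.append("unit_price_outside_tolerance")
    return (not reasons, reasons)


po = MatchLine("PO-1001", Decimal("10"), Decimal("25.00"))
receipt = MatchLine("PO-1001", Decimal("10"))
invoice = MatchLine("PO-1001", Decimal("10"), Decimal("25.40"))
matched, reasons = three_way_match(po, receipt, invoice)
```

An invoice that fails returns its reasons rather than a bare False, so it can be handed to a person with the specific mismatch attached.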

Duplicate detection. AI identifies duplicate invoices across variations in formatting, date, and reference number. This is difficult to do manually at volume. A supplier re-submitting an invoice with a slightly different reference number, a copy sent by email after the original arrived by post, an invoice paid in one period and re-submitted in another: these patterns are catchable systematically in ways that manual processing cannot replicate. Duplicate payment recovery in businesses that implement this properly typically runs at 0.5 to 1.5% of AP spend in year one, before it becomes embedded discipline.
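The simplest version of this can be sketched as a normalisation step followed by grouping. The field names (id, supplier_id, amount, reference) and the normalisation rules are assumptions for illustration; production tools typically add fuzzy matching on dates and near-identical amounts, which this sketch does not attempt.

```python
import re
from collections import defaultdict
from decimal import Decimal


def normalise_reference(ref: str) -> str:
    """Uppercase, drop punctuation and spaces, strip leading zeros in digit runs."""
    cleaned = re.sub(r"[^A-Z0-9]", "", ref.upper())
    # Remove zeros that start a run of digits, e.g. INV00123 becomes INV123.
    return re.sub(r"(?<![0-9])0+(?=[0-9])", "", cleaned)


def find_duplicate_candidates(invoices):
    """Group invoice ids that share supplier, amount, and normalised reference."""
    groups = defaultdict(list)
    for inv in invoices:
        key = (
            inv["supplier_id"],
            Decimal(inv["amount"]),
            normalise_reference(inv["reference"]),
        )
        groups[key].append(inv["id"])
    # Only groups with more than one invoice are duplicate candidates.
    return [ids for ids in groups.values() if len(ids) > 1]


invoices = [
    {"id": "a", "supplier_id": "S1", "amount": "120.00", "reference": "INV-00123"},
    {"id": "b", "supplier_id": "S1", "amount": "120.0", "reference": "inv 123"},
    {"id": "c", "supplier_id": "S2", "amount": "120.00", "reference": "INV-00123"},
]
candidates = find_duplicate_candidates(invoices)
```

The output is a list of candidates for review, not a list of confirmed duplicates: a supplier can legitimately issue two invoices for the same amount.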

Routing based on defined approval matrices. AI-assisted routing works when approval rules are documented and consistent. Invoice from this supplier, this cost centre, this amount range: goes to this approver. When those rules are in the system rather than in someone’s head, automated routing eliminates the manual triaging that otherwise consumes significant processing time.
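"Rules in the system" can be as plain as a lookup table. The sketch below assumes a hypothetical ApprovalRule shape with a cost centre, an amount band, and an optional supplier; the names and thresholds are invented for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ApprovalRule:
    cost_centre: str
    min_amount: Decimal  # inclusive lower bound
    max_amount: Decimal  # exclusive upper bound
    approver: str
    supplier_id: Optional[str] = None  # None means the rule applies to any supplier


def route_invoice(supplier_id, cost_centre, amount, rules) -> Optional[str]:
    """Return the approver from the first matching rule, or None if no rule applies."""
    for rule in rules:
        supplier_ok = rule.supplier_id is None or rule.supplier_id == supplier_id
        in_band = rule.min_amount <= amount < rule.max_amount
        if supplier_ok and rule.cost_centre == cost_centre and in_band:
            return rule.approver
    return None  # no documented rule: send to manual triage


# First match wins, so supplier-specific rules are listed before general ones.
rules = [
    ApprovalRule("CC-100", Decimal("0"), Decimal("1000"), "team_lead", supplier_id="S-9"),
    ApprovalRule("CC-100", Decimal("0"), Decimal("5000"), "finance_manager"),
    ApprovalRule("CC-100", Decimal("5000"), Decimal("50000"), "cfo"),
]
approver = route_invoice("S-1", "CC-100", Decimal("500"), rules)
```

The None return is the useful part: every invoice that falls through is evidence of a rule that still lives in someone's head.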

Exception flagging. AI is effective at flagging invoices that fall outside defined parameters: unusual amounts, first-time suppliers, invoices that arrive without a corresponding PO, invoices where the line item coding does not match the supplier’s usual category. Systematic flagging means those invoices reach the right person faster, not slower.
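The four parameters named above translate into rule checks fairly directly. This is a sketch under assumed inputs: known_suppliers, usual_category, and typical_max are hypothetical lookups that would come from your supplier master and invoice history, and a real tool would likely use statistical thresholds rather than a fixed ceiling per supplier.

```python
from decimal import Decimal


def flag_exceptions(invoice, known_suppliers, usual_category, typical_max):
    """Return a list of reason codes; an empty list means nothing was flagged."""
    flags = []
    supplier = invoice["supplier_id"]
    if supplier not in known_suppliers:
        flags.append("first_time_supplier")
    if not invoice.get("po_number"):
        flags.append("missing_po")
    # Amount check against an assumed per-supplier ceiling from history.
    limit = typical_max.get(supplier)
    if limit is not None and invoice["amount"] > limit:
        flags.append("unusual_amount")
    # Coding check against the category this supplier is usually coded to.
    expected = usual_category.get(supplier)
    if expected is not None and invoice["category"] != expected:
        flags.append("category_mismatch")
    return flags


flags = flag_exceptions(
    {"supplier_id": "S1", "po_number": None, "amount": Decimal("9000"), "category": "travel"},
    known_suppliers={"S1"},
    usual_category={"S1": "software"},
    typical_max={"S1": Decimal("2000")},
)
```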

What the demos do not show you

The demo shows the 80% that works cleanly. It does not show the 20% that accounts for 80% of your processing time. That asymmetry is the honest reality of AP automation.

New supplier onboarding. The AI has nothing to match against when a new supplier sends their first invoice. The supplier master needs to be populated, payment terms confirmed, bank details verified, cost centre allocation agreed. None of this is automated. It is a manual process, and it needs to be: supplier onboarding is where payment fraud risk concentrates. AI can flag new suppliers for heightened review. It cannot do the onboarding.

Non-PO invoices. If your organisation runs significant non-PO spend, three-way matching does not apply. Professional services, subscriptions, utilities, facilities management costs where the invoice arrives without a corresponding PO: these require human judgment about coding, approval, and legitimacy. The degree to which your AP volume is PO-backed is one of the strongest predictors of your achievable automation rate.

Invoices with variations. A supplier invoice where the amount differs from the PO by 3% because of a late delivery charge. An invoice in a currency that has shifted since the PO was raised. A line item that spans two cost centres. An invoice that partially matches one PO and partially matches another. These are common in practice. The AI can flag them. It cannot resolve them.

The 20% that drives 80% of your workload. The automation rate numbers vendors quote are typically measured against total invoice volume. A 90% automation rate sounds impressive until you understand that the 90% being automated is the easy 90%. The 10% being escalated is the complex 10%, and it consumes a share of human hours far larger than its share of volume, because each of those invoices is individually difficult. Your headcount reduction is real, but it is not proportional to the automation rate headline.

I have worked with finance teams where a 95% automation rate on invoice volume translated to a 35% reduction in AP processing time. That is still worthwhile. But it is not what “95% automation” implies.
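The arithmetic behind that gap is easy to check for yourself. The sketch below uses invented handling times (2 minutes for an easy invoice, 45 for a complex one, 30 seconds of residual spot-checking on automated invoices); these are illustrative assumptions, not figures from any engagement, so substitute your own.

```python
def processing_time_reduction(volume, automated_share, easy_minutes,
                              hard_minutes, residual_minutes):
    """Fraction of manual processing time removed by automating the easy share."""
    easy = volume * automated_share
    hard = volume - easy
    # Before: every invoice is handled manually.
    before = easy * easy_minutes + hard * hard_minutes
    # After: automated invoices keep a small residual; hard ones are unchanged.
    after = easy * residual_minutes + hard * hard_minutes
    return 1 - after / before


reduction = processing_time_reduction(
    volume=1000,
    automated_share=0.95,
    easy_minutes=2,
    hard_minutes=45,
    residual_minutes=0.5,
)
print(f"{reduction:.0%}")  # roughly a third of processing time, despite 95% automation
```

Under these assumptions, the 50 complex invoices account for more than half of the original workload, which is why automating the other 950 removes only about a third of it.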

Prerequisites that determine whether automation works

The difference between AP automation that delivers and AP automation that disappoints almost always traces back to what was in place before the tool went in. Four conditions matter most.

A maintained supplier master. Duplicate suppliers, inconsistent naming, missing payment terms, bank details that have not been verified since the supplier was set up: all of these degrade matching accuracy and create exception volume. A supplier master audit before go-live is not optional. It takes time and it is not glamorous work, but it is the difference between an automation rate that matches the vendor’s claims and one that is 30 percentage points lower.

PO discipline. Three-way matching requires a purchase order. If your organisation routinely receives invoices for services where no PO was raised, or where POs are raised after invoices arrive to satisfy audit rather than to control spending, your three-way matching automation rate will be low. PO discipline is a behaviour problem as much as a process problem. The change management dimension of finance transformation is always harder than the technical dimension.

Approval matrices documented in the system. If approval routing is based on informal knowledge of who approves what for whom, automation cannot replicate it. The rules need to exist somewhere other than a spreadsheet or a conversation. If your organisation does not have documented, version-controlled approval policies, that work needs to happen before the automation goes in, not after.

Consistent invoice formats where possible. AI invoice processing works better when you have worked with key suppliers to standardise their invoice format. For high-volume suppliers, a conversation about consistent field placement, consistent PO number referencing, and consistent line item descriptions can meaningfully improve extraction accuracy. This is a procurement relationship task as much as a finance task, but it pays off.

A separate piece on the data quality requirements for AI in finance goes deeper on these foundations. The AP context is a specific application of a general principle: your automation rate is determined more by the quality of your data and process foundations than by the quality of the AI tool.

What a realistic return looks like

The ROI in AP automation is real. I do not want an honest account of the hard parts to obscure that. The question is about realistic expectations.

A 40 to 60% reduction in manual AP processing time is achievable with average foundations: a reasonably maintained supplier master, moderate PO discipline, and defined approval policies. Many mid-market finance functions fall in this category. The investment typically pays back within 12 to 18 months at scale.

An 80% or higher reduction is achievable in organisations with high PO discipline, a clean supplier master, consistent supplier invoice formats, and documented approval rules embedded in systems. This is not the average case. It is the case that vendor demos are built to represent.

“Touchless AP” is a vendor claim that requires specific conditions: mature supplier relationships, near-universal PO coverage, highly consistent invoice formats, and a minimal exception rate. It exists. It is not the starting point for most finance teams. It is where a well-run AP automation programme ends up, three or four years into the discipline required to get there.

The measurement framework matters. Before go-live, measure your current cost per invoice processed, your cycle time from invoice receipt to payment, and your error and duplicate rate. After go-live, measure the same things. Do not accept vendor-supplied automation rate metrics as a proxy for business value. Process time and cost per invoice are the metrics that translate to the P&L.
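The three baseline measures can be computed from a simple invoice extract. The sketch assumes each record carries a received date, a paid date, and an is_duplicate flag, and that you know the total AP processing cost for the period; all of these names are hypothetical.

```python
from datetime import date
from statistics import mean


def ap_baseline(invoices, total_processing_cost):
    """Cost per invoice, average cycle time in days, and duplicate rate."""
    count = len(invoices)
    cycle_days = [(inv["paid"] - inv["received"]).days for inv in invoices]
    duplicates = sum(1 for inv in invoices if inv["is_duplicate"])
    return {
        "cost_per_invoice": total_processing_cost / count,
        "avg_cycle_days": mean(cycle_days),
        "duplicate_rate": duplicates / count,
    }


sample = [
    {"received": date(2024, 1, 1), "paid": date(2024, 1, 11), "is_duplicate": False},
    {"received": date(2024, 1, 5), "paid": date(2024, 1, 25), "is_duplicate": True},
]
baseline = ap_baseline(sample, total_processing_cost=24.0)
```

Run the same function on the same kind of extract before and after go-live, so the comparison is like for like.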

The governance benefit

One AP automation benefit that does not get enough attention in the ROI conversation is the audit trail.

Manual AP processing creates paper records, email trails, and institutional memory. The auditor asking why an invoice was coded to a particular cost centre gets a verbal explanation, if the person involved still works there and can remember. The auditor asking why an unusual supplier was approved gets a conversation about who used to know that person.

AI-assisted AP processing creates a systematic, timestamped record of every decision: the matching logic applied, the confidence score at the time, the approver credentials, the approval timestamp, the escalation path for exceptions. This audit trail is created automatically. It does not depend on documentation discipline from the team.

For businesses in regulated environments, or businesses preparing for investment, acquisition, or audit, this has genuine commercial value. The AP vendor evaluation framework includes specific questions on audit trail quality that are worth asking before you commit.

The governance benefit does not replace the efficiency ROI in the business case. It belongs in the evaluation alongside the processing time numbers.

Where to start

If you are considering AP automation, the evaluation sequence matters.

Start with a process diagnostic before you look at vendors. Document your current invoice volume by type: three-way matchable, non-PO, new supplier, exception-heavy. The distribution tells you what your realistic automation rate looks like before any vendor shows you a demo.

Then assess your supplier master. Pull the top 50 suppliers by invoice volume. How many have multiple entries? How many have missing or unverified bank details? How many are coded inconsistently? The answer tells you more about your automation ceiling than any vendor conversation.

Then assess PO discipline. What percentage of your invoice volume arrives with a valid PO reference? If the honest answer is below 60%, that is the first problem to solve.
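The PO coverage figure is a one-line calculation once you have an invoice extract and a list of valid PO numbers. The field name po_number and the sample data are assumptions; the 60% threshold is the one used above.

```python
def po_coverage(invoices, valid_po_numbers):
    """Share of invoices that carry a PO reference found in the PO list."""
    if not invoices:
        return 0.0
    backed = sum(1 for inv in invoices if inv.get("po_number") in valid_po_numbers)
    return backed / len(invoices)


invoices = [
    {"id": 1, "po_number": "PO-1001"},
    {"id": 2, "po_number": None},       # non-PO spend
    {"id": 3, "po_number": "PO-9999"},  # reference not found in the PO list
    {"id": 4, "po_number": "PO-1002"},
]
coverage = po_coverage(invoices, valid_po_numbers={"PO-1001", "PO-1002"})
needs_po_work_first = coverage < 0.60
```

An invoice quoting a PO number that does not exist in your system counts as uncovered here, which is the stricter and more honest reading of "valid".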

Only after you understand these foundations does it make sense to evaluate specific tools. The AI readiness assessment for finance functions provides a broader diagnostic framework for this kind of preparatory work.

AP automation is one of the higher-value, lower-risk AI applications in finance. The technology works. The question is always whether the foundations are in place for it to work in your specific environment. The foundations are fixable. Start there.

Maebh Collins is a Fellow Chartered Accountant (FCA, ICAEW) with Big 4 training and twenty years of operational experience as a founder and senior finance leader. She writes about AI in finance transformation from the inside out.
