AI Won't Make Your Tax Determinations Perfect (And That's Fine)

AI won’t get every tax call right — but it can make your compliance more accurate overall. Learn why system-level accuracy matters more than perfection in indirect tax.

Rob van der Woude
Chief Tax Officer
Published: Oct 29, 2025
Last update: Oct 29, 2025

Ask any tax leader about adopting AI and the first question is always some version of: "But will it be accurate?"

The question makes sense. Tax compliance demands precision. A single misclassification can cascade into penalties, audits, and sleepless nights explaining to the board why you're restating filings. When someone pitches probabilistic AI that's "90% accurate," the natural response is panic. Your current process doesn't advertise a 10% error rate.

Except it probably has one. You just don't measure it the same way.

This is the second piece in our series on preparing indirect tax leaders for AI. In the first article, we explored why tax systems can't be both perfectly certain and infinitely adaptable. Now we tackle the accuracy question head-on: what does "accurate" even mean when you're comparing human judgment, rule-based systems, and probabilistic AI? And why might a system that makes occasional mistakes actually make your tax function more compliant overall?

The answer requires rethinking what accuracy looks like at scale.

AI gets things wrong in unfamiliar ways

Will embracing probabilistic AI make tax determinations and filings less accurate or more accurate overall? Answering that requires reframing what we consider "accurate," and over what scope.

AI tends to be "wrong" in different ways than humans or traditional tech.

A human tax analyst might make a mistake due to fatigue or oversight, misreading a tax law or keying in the wrong code on an invoice. A rule-based system, as we discussed in the previous article, errs by omission. It gives no answer or a default answer when confronted with a scenario outside its rule set (a new type of digital service that wasn't in the logic, for example).

An AI, on the other hand, might confidently give an answer for every scenario. It has high "recall," meaning it won't just throw up its hands at edge cases. But a certain percentage of those answers will be somewhat off. Maybe it interprets context incorrectly. Or in the case of a generative model, it produces a superficially plausible but incorrect statement (an AI hallucination).

These aren't better or worse errors. They're just different. And understanding the difference is crucial for managing them.

Breadth of analysis can compensate for margin of error

Here's where it gets interesting. AI's breadth of analysis can compensate for its margin of error.

A well-trained AI system can rapidly analyze all transactions or all tax law changes, whereas a human or fixed system samples or focuses only on known hotspots. An AI reviewing expense claims might flag 5% of claims as anomalous, some of which turn out to be fine (false positives). But in doing so it might catch all of the truly problematic cases that a manual review would miss.

In contrast, a human auditor might examine only 10% of claims due to time constraints and miss some fraud or errors entirely, though they might be 100% accurate on the ones they did check.

The acceptable error rate in AI becomes a business decision. Is 93% accuracy at scale better than 100% accuracy on a much smaller subset of issues? Often, yes.
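The arithmetic behind that business decision can be made concrete. The numbers below (population size, error rate, coverage, per-item accuracy) are purely illustrative assumptions, not figures from any real audit:

```python
# Back-of-the-envelope: breadth vs. per-item accuracy. All numbers are
# illustrative assumptions, not data from a real review.
TOTAL = 10_000
ERROR_RATE = 0.02                    # assume 2% of transactions have an issue
true_errors = TOTAL * ERROR_RATE     # ~200 problematic transactions

# Manual review: perfectly accurate, but only samples 10% of the population.
manual_caught = true_errors * 0.10 * 1.00

# AI review: catches each issue with 93% probability, but examines everything.
ai_caught = true_errors * 1.00 * 0.93

print(f"Manual review catches ~{manual_caught:.0f} of {true_errors:.0f} errors")
print(f"AI review catches    ~{ai_caught:.0f} of {true_errors:.0f} errors")
```

Under these assumptions, the 7-point per-item accuracy gap is dwarfed by the 10x coverage gap: the broad-but-imperfect reviewer surfaces roughly nine times as many real issues.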

As one tech commentator put it, "We don't need AI to be perfect. We need it to be scalable, reviewable, and governed." A slight dip in per-item accuracy can be outweighed by the huge increase in coverage and speed. AI augments compliance by catching issues across volumes of data that a purely manual or deterministic approach would never scrutinize.

Confidence thresholds let humans handle the tricky bits

Consider compliance document generation or tax research. A generative AI might draft a tax advisory memo in seconds. Will it be perfect? Probably not. It might cite one case incorrectly or miss a nuance. But a tax professional with AI assistance could review and correct those errors far faster than writing from scratch, achieving a compliant result in a fraction of the time.

Similarly, an AI could initially classify 1,000 invoices with 90% accuracy. If tax staff then review the 100 flagged or uncertain cases, the end result could be 100% accurate on all 1,000. That's a dramatic productivity boost.

In this sense, AI's mistakes can be managed and become just another part of the workflow, much like we double-check junior staff work.

A leading practice is emerging: assign confidence scores to AI outputs and route anything below a defined threshold to human review. A machine learning tax classification model might tag each transaction with "95% confidence Standard-rated" or "60% confidence, unsure." The high-confidence items can auto-post, while low-confidence items go to a person.

This way, the overall process achieves both scale and assurance. The AI does the heavy lifting, humans handle the tricky bits.
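As a minimal sketch of that routing pattern — the function, field names, and 0.90 cutoff below are all hypothetical, not a real product API:

```python
# Sketch of confidence-threshold routing. The field names, threshold value,
# and classification labels are hypothetical, not a real product schema.
THRESHOLD = 0.90  # business-defined confidence cutoff

def route(classifications):
    """Split AI tax classifications into auto-post and human-review queues."""
    auto_post, needs_review = [], []
    for item in classifications:
        if item["confidence"] >= THRESHOLD:
            auto_post.append(item)
        else:
            needs_review.append(item)
    return auto_post, needs_review

invoices = [
    {"id": "INV-001", "tax_code": "Standard-rated", "confidence": 0.95},
    {"id": "INV-002", "tax_code": "Zero-rated", "confidence": 0.60},
]
auto, review = route(invoices)
```

The threshold itself is a policy knob, not a technical constant: lowering it increases automation but raises the share of machine-only decisions, so it belongs with the business's risk appetite, not the data science team.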

Your current process isn't as accurate as you think

Empirically, there's reason for optimism that AI can make a tax function more compliant overall.

Studies in analogous fields like auditing show that AI tools can significantly reduce errors and uncover issues humans miss. AI systems can examine 100% of transactions or data instead of a sample, and they don't get tired or inconsistent. One study noted that AI-driven audit processes allow "whole population testing," checking the entirety of data, whereas manual audits sample and might overlook irregularities. The result was a reduction in undetected misstatements and anomalies when AI was applied.

Companies that invested in AI saw fewer errors and restatements in financial audits, indicating higher quality outcomes. In the tax compliance context, this could mean fewer filing errors or missed obligations.

Of course, AI can also introduce new kinds of errors, so the net benefit depends on robust controls. But it's increasingly clear that human processes are far from 100% accurate.

Manual data entry and spreadsheet-based tax computations have notoriously high error rates. Field studies have found material errors in anywhere from 20% to 90% of complex spreadsheets used in finance and tax reporting. Humans also suffer from cognitive biases and inconsistency. Two tax experts might reach two different conclusions on an ambiguous scenario.

AI's advantage is consistency and tirelessness. It will apply the same logic uniformly and can run every day without losing focus.

Measure compliance at the system level, not per transaction

The key is to measure compliance at the system level, not per transaction.

AI might incorrectly classify a few transactions' tax treatment, but if it also flags dozens of potential underpayments that would have slipped by, the overall compliance (total correct tax outcomes) could improve. Tax leaders should start thinking in terms of "better decisions at scale" rather than "perfect answers in isolation."

A practical approach is to run A/B tests or pilots comparing traditional versus AI-augmented workflows. Does the AI help the team catch more issues or file more accurate returns over a quarter? Early case studies suggest yes.
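One way to frame such a pilot is a single system-level metric: total correct tax outcomes across all transactions, rather than per-item accuracy. The sketch below uses illustrative assumptions only (transaction count, baseline accuracy, coverage and review-accuracy figures are hypothetical):

```python
# Hypothetical A/B framing: measure "total correct tax outcomes" at the
# system level. All numbers below are illustrative assumptions, not data.
TOTAL = 10_000       # transactions in the pilot period
BASELINE = 0.97      # assumed accuracy of transactions nobody reviews

def correct_outcomes(coverage, review_accuracy):
    """Expected number of transactions ending with the correct tax treatment."""
    reviewed = TOTAL * coverage
    unreviewed = TOTAL - reviewed
    return reviewed * review_accuracy + unreviewed * BASELINE

# Manual workflow: perfect on the 10% sample it can reach.
manual = correct_outcomes(coverage=0.10, review_accuracy=1.00)
# AI-augmented workflow: covers everything; human review of low-confidence
# items lifts assumed per-item accuracy to 99%.
ai = correct_outcomes(coverage=1.00, review_accuracy=0.99)

print(f"Manual workflow: {manual:.0f} / {TOTAL} correct")
print(f"AI-augmented:    {ai:.0f} / {TOTAL} correct")
```

Under these assumptions the AI-augmented workflow wins even though no single item is guaranteed correct, which is exactly the system-level lens a pilot should test against real data.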

For example, an indirect tax review aided by AI anomaly detection spotted patterns of underreported VAT in certain branches that the rule-based monitoring failed to notice. This resulted in a fix that saved penalties down the line, even though the AI also raised some false alarms that took time to clear.

The bottom line is that probabilistic AI will occasionally err, but so do humans. The focus should shift to error management and continuous improvement rather than zero errors.

As one thought leader aptly said, "AI might be probabilistic, but so is human memory." Embracing that reality is part of the paradigm shift.

Key takeaway: System-level compliance matters more than per-transaction perfection

Perfect accuracy on every transaction is a false goal. It's not achievable with humans, rules, or AI.

What matters is whether your overall compliance posture improves. Whether you catch more errors, cover more ground, and free your team to focus on judgment calls instead of data entry. Whether you can analyze 100% of transactions instead of sampling 10%.

The shift to AI-powered tax workflows isn't about accepting lower standards. It's about raising your expectations for what's actually possible at scale.


Rob van der Woude

Chief Tax Officer

Fonoa's Chief Tax Officer, ex-Head of EMEA Tax at Uber, with extensive tech consulting experience in the US and EMEA. Based in Bucharest & Amsterdam.
