Fuzzy Matching - mathematical algorithms supporting Tax ID validation

August 4, 2022

Fuzzy Matching is a brilliant feature in the Fonoa Lookup product that helps you charge taxes more accurately and combat tax fraud in over 80 countries.

Background

With every cross-border sale of services, companies face the questions ‘Should I be charging tax?’ and ‘If yes, where?’ Knowing whether your customer has a valid Indirect Tax ID is a critical part of the answer.

So you ask your customers for a tax ID number, along with all their other information, and calculate their tax rate accordingly. Most businesses use tax software to verify tax ID numbers. Unfortunately, most software on the market only tells you whether the number meets a certain format or passes a checksum. More sophisticated tools will also confirm that the number exists in a government database. Sadly, such tools are frequently limited to just a handful of jurisdictions.

Many tools, however, do not check if the tax ID actually belongs to the person who supplied it. 

By using Fuzzy Matching, Fonoa adds this extra level of certainty. We tell you the likelihood of a tax number belonging to the person who supplied it, in the form of a percentage match. So you can be confident that you’re charging (or not charging) taxes correctly. 

What is Fuzzy Matching?

Fuzzy Matching (also known as Approximate String Matching, and sometimes loosely called Fuzzy Logic) is a technique for identifying two pieces of text, strings, or entries that are similar but not exactly the same.

In the context of tax ID validation, it’s a way to compare two text sources (e.g. a supplier name on your records and the name in the government database) and indicate how similar they are by providing a score or a percentage.
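
To make the idea of a similarity score concrete, here is a minimal sketch using Python's standard-library `difflib.SequenceMatcher`. It is only an illustration of approximate string matching in general, not Fonoa's algorithm; the function name `similarity_percent` and the company names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity_percent(a: str, b: str) -> float:
    """Return a 0-100 similarity score, ignoring case and outer whitespace."""
    a_norm = a.strip().lower()
    b_norm = b.strip().lower()
    return round(SequenceMatcher(None, a_norm, b_norm).ratio() * 100, 1)


# Hypothetical records: the name on file versus the name a registry returns.
on_file = "Acme Trading Ltd"
print(similarity_percent(on_file, "ACME Trading Ltd"))        # identical after normalising
print(similarity_percent(on_file, "Acme Trading Limited"))    # close, but not exact
print(similarity_percent(on_file, "Borealis Shipping GmbH"))  # unrelated name
```

Identical names score 100, a near miss scores high, and an unrelated name scores low, which is the behaviour a percentage match is meant to capture.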

How does it work? 

Fuzzy Matching is one element of Fonoa’s Lookup product. This advanced feature compares all the details your customers provide with the information available on government databases. It is the fourth step in Fonoa’s Tax ID validation process:

  1. Format Check: Does the Tax ID have the right format?
  2. Checksum: Does the Tax ID conform to the known checksum algorithm?
  3. Database Validation: Does the Tax ID exist in a government database?
  4. Fuzzy Matching: Do the name and address returned by the government database match the name and address you have?
  5. Silent Alarms: Does the Tax ID raise any other concerns?
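
The steps above can be read as a pipeline that stops at the first failed check. The sketch below covers steps 1 to 4 under loud assumptions: the 11-digit format rule, the Luhn checksum, the `FAKE_REGISTRY` dictionary, and all function names are hypothetical stand-ins, not any country's real rules or Fonoa's API. Step 5 is omitted because the article does not describe what it checks.

```python
import re
from difflib import SequenceMatcher

# Stand-in for a government database (assumption); a real lookup would query a registry.
FAKE_REGISTRY = {"79927398713": "Acme Trading Ltd"}


def format_ok(tax_id: str) -> bool:
    # Step 1 (illustrative rule): exactly 11 digits.
    return re.fullmatch(r"\d{11}", tax_id) is not None


def luhn_ok(tax_id: str) -> bool:
    # Step 2 (illustrative): Luhn checksum, doubling every second digit from the right.
    total = 0
    for i, ch in enumerate(reversed(tax_id)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def validate(tax_id: str, supplied_name: str) -> dict:
    if not format_ok(tax_id):
        return {"status": "invalid_format"}
    if not luhn_ok(tax_id):
        return {"status": "checksum_failed"}
    registered = FAKE_REGISTRY.get(tax_id)  # Step 3: does the ID exist?
    if registered is None:
        return {"status": "not_found"}
    # Step 4: compare the supplied name with the registered one.
    ratio = SequenceMatcher(None, supplied_name.lower(), registered.lower()).ratio()
    return {"status": "found", "name_match_percent": round(ratio * 100, 1)}


print(validate("79927398713", "Acme Trading Limited"))
```

Ordering the checks from cheapest to most expensive means malformed IDs are rejected before any database lookup is attempted.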

Fuzzy Matching essentially helps to answer two questions: ‘Is this tax ID real?’ and ‘Does it match the identity of the person who provided it?’ This differs from Database Validation, which normally answers only the first question.

‘Does the information match?’ needs a sliding-scale answer to be of practical help, because binary responses tend to generate false positives or false negatives.

Fonoa’s fuzzy matching algorithm analyses the ‘fuzzy’ data and presents the result to you as a percentage match. If the information your client provides is exactly the same as the details on a government database, it will come back as a 100% match. You’d expect this with large, listed multinationals.

However, if a customer has given their own name and address but someone else’s tax ID, the result will be a very low percentage match.
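
One way to act on a percentage match, rather than on a yes/no answer, is to sort results into bands. The sketch below is only an illustration: the function name `triage` and the thresholds of 90 and 60 are assumptions chosen for the example, not values recommended by Fonoa, and a real deployment would tune them to its own risk appetite.

```python
def triage(match_percent: float, accept_at: float = 90.0, review_at: float = 60.0) -> str:
    """Map a 0-100 match score to an action using hypothetical thresholds."""
    if not 0.0 <= match_percent <= 100.0:
        raise ValueError("match_percent must be between 0 and 100")
    if match_percent >= accept_at:
        return "accept"   # near-exact match, e.g. a large listed company
    if match_percent >= review_at:
        return "review"   # plausible match; a typo or missing middle name is likely
    return "reject"       # details probably belong to someone else


for score in (100.0, 78.5, 12.0):
    print(score, triage(score))
```

The middle band is what a binary check cannot express: it routes borderline cases to a person instead of silently accepting or rejecting them.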

Matching errors

In our experience, several variables can prevent a 100% match, and not all of them are signs of tax avoidance. Common examples include:

  • Middle names: Included in the government database, but not entered into your online form
  • Name abbreviations: Like Susan and Sue, or Oluwafemi and Femi. One is on the official government database, while the other is on your onboarding form – a mismatch, but it’s still the same person
  • Typos: A simple spelling mistake, like Marie instead of Maria
  • Language script: Countries with more than one language may experience some issues when comparing characters (e.g. use of Cyrillic or Latin script)
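
Several of the mismatches listed above can be softened by normalising names before comparing them. The following is a minimal sketch under stated assumptions: the `NICKNAMES` table is a tiny hypothetical lookup, the function names are invented for this example, and mixed scripts such as Cyrillic versus Latin would need a transliteration step that is not shown here.

```python
import unicodedata
from difflib import SequenceMatcher

# Hypothetical nickname table; a real one would be far larger and locale-aware.
NICKNAMES = {"sue": "susan", "femi": "oluwafemi"}


def normalise(name: str) -> list[str]:
    """Strip accents, fold case, split into tokens and expand known nicknames."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return [NICKNAMES.get(tok, tok) for tok in stripped.casefold().split()]


def name_match_percent(supplied: str, registered: str) -> float:
    a, b = normalise(supplied), normalise(registered)
    if not a or not b:
        return 0.0
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    # Score each token of the shorter name against its best counterpart, so an
    # extra middle name on one side does not lower the score.
    best = [max(SequenceMatcher(None, t, u).ratio() for u in longer) for t in shorter]
    return round(100 * sum(best) / len(best), 1)


print(name_match_percent("Sue Jones", "Susan Mary Jones"))  # nickname + middle name
print(name_match_percent("Marie Dupont", "Maria Dupont"))   # one-letter typo
```

This is deliberately simple: two tokens of the shorter name may be matched against the same token of the longer one, which a production matcher would want to prevent.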

Knowing how to reduce matching errors by making use of the most appropriate algorithms (Levenshtein, Guth, etc.) is the real challenge. And it’s why we’re really proud of our Lookup product.
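
As an illustration of one of the algorithms named above, here is a minimal sketch of Levenshtein edit distance, converted to a percentage by dividing by the longer string's length. The conversion and the function names are assumptions made for this example; the article does not say how Fonoa turns distances into scores, and the Guth algorithm is not shown.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def levenshtein_percent(a: str, b: str) -> float:
    """Scale the distance to 0-100, where 100 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return round(100 * (1 - levenshtein(a, b) / longest), 1)


print(levenshtein("Marie", "Maria"))          # 1 (a single substitution)
print(levenshtein_percent("Marie", "Maria"))  # 80.0
```

Edit distance handles typos well but treats ‘Sue’ and ‘Susan’ as quite different, which is one reason a single algorithm is rarely enough on its own.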

Three use cases for Lookup and Fuzzy Matching 

  • Correct tax calculations: Ensuring the Tax ID belongs to the person who supplied it means you assign B2B and B2C status correctly.
  • Data sharing: Marketplaces and platforms validate Tax IDs before submitting data to governments.
  • Peace of mind: Increased certainty that your counterparties really are who they say they are and that your business is not exposed to tax fraud.

Using Fuzzy Matching as a component of Tax Number Validation is increasingly necessary for companies that are committed to minimising tax fraud and reporting errors. It’s particularly helpful for marketplaces and digital platforms operating globally.

To better understand how Fuzzy Matching works and how Fonoa Lookup will help your business, please get in touch! 


Alexander Kobakhidze
Head of TaxTech @ Fonoa

Alexander is a Tax Technology specialist at Fonoa with in-depth knowledge of indirect taxes impacting the platform and online economy. Prior to joining Fonoa, Alexander was the Global Head of Tax Technology at Uber. At Uber, he and his team worked with internal and external stakeholders to design and build tools that automated tax compliance processes, for both internal use and external platform users. Alexander’s extensive expertise in the effect of indirect taxes on platforms has led him to work with the OECD Working Parties 9 (Consumption Taxes) and 10 (Exchange of Information and Tax Compliance).