Making the case for error tolerance in Customer Data Quality

Salespeople are the ones who complain most about poor data quality, and at the same time probably the ones who create most of the dirty data: 76% of dirty data is created in the data entry phase. Why not make it easier by introducing some error tolerance in their CRM/ERP search, data quality firewall, online registration and data cleansing procedures?

Why is dirty data created?

There can be multiple correct spellings of a name
Let’s say your customer Christopher Quist calls you. I have gone through the name statistics in the Nordics: Christopher is spelled in 10 different ways and Quist in 7. This means there are 70 possible correct ways to write his name!

One of those 70 combinations: Christoffer Kvist.

How likely is it that the customer care or sales representative hits the correct form? Asking Christopher to spell his name over and over is unprofessional, time consuming and irritating. With an error-tolerant search, the representative would find him immediately.
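An error-tolerant search can be sketched with Python’s standard library alone. The customer names and the similarity cutoff below are illustrative assumptions, not a real CRM index; production tools add phonetic and keyboard-aware matching on top of this idea.

```python
from difflib import get_close_matches

# A toy CRM name index (illustrative spellings only).
customers = [
    "Christoffer Qvist",
    "Kristoffer Kvist",
    "Christian Andersen",
    "Maria Jensen",
]

def tolerant_search(query, names, cutoff=0.7):
    """Return the stored names most similar to the query, best match first."""
    return get_close_matches(query, names, n=5, cutoff=cutoff)

# "Christopher Quist" finds the stored "Christoffer Qvist"
# even though no single spelling matches exactly.
print(tolerant_search("Christopher Quist", customers))
```

The representative types whatever spelling they heard, and the near matches come back ranked instead of an empty result.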

People hear differently.
I used to work at Dell’s Nordic call center in Denmark, and I would hear and spell a name differently than the Danes did. The most common way to write Christopher Quist in Norway is Kristoffer Kvist; in Denmark it is Christoffer Qvist. In Nordic call centers it is not uncommon to answer calls from another country, so the chances of “listening” mistakes grow.

People make typos.
When entering data it is easy to skip a letter, double a letter, reverse two letters, skip a space, miss a key and hit the one beside it, or insert the key beside the one you hit. Applying all these plausible typos to the most common Danish form, Christoffer Qvist, generates 314 ways of entering the name! The Norwegian form, Kristoffer Kvist, generates 293 plausible typos!
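Counts like these can be reproduced in spirit with a small generator of edit-distance-1 variants. This is a sketch only: it substitutes any letter rather than just neighbouring keys, so its totals will differ from the 314/293 figures, which were presumably produced with a keyboard-adjacency model.

```python
import string

def plausible_typos(name):
    """Generate edit-distance-1 variants of a name: a skipped letter,
    a doubled letter, two reversed letters, or a substituted letter."""
    letters = string.ascii_lowercase
    s = name.lower()
    variants = set()
    for i in range(len(s)):
        variants.add(s[:i] + s[i+1:])                      # skip a letter
        variants.add(s[:i] + s[i] + s[i:])                 # double a letter
        if i < len(s) - 1:
            variants.add(s[:i] + s[i+1] + s[i] + s[i+2:])  # reverse letters
        for c in letters:
            variants.add(s[:i] + c + s[i+1:])              # hit another key
    variants.discard(s)  # the correct spelling is not a typo
    return variants

print(len(plausible_typos("christoffer qvist")))
```

Every one of these strings is something a hurried representative could plausibly type, and an exact-match search would miss all of them.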

Sometimes people believe it is easier or safer to just enter the data again.

Other mistakes error tolerance covers

  • Information written in the wrong field (contact name in the company field)
  • Information left out (Miller Furniture vs Millers House of Furniture)
  • Abbreviations (Chr. Andersen vs Christian Andersen)
  • Switched word order (Energiselskabet Buskerud vs Buskerud Energiselskab)
  • Word mutations (Miller Direct Marketing vs Müller Direct & Dialogue Marketing)
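Switched word order and left-out words can be caught by comparing names as sets of words rather than character strings. A minimal sketch (real matching engines combine this with fuzzy comparison of the individual words, so abbreviations and mutations are caught too):

```python
def token_overlap(a, b):
    """Order-insensitive similarity: shared words over the smaller word set."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / min(len(ta), len(tb))

# Word order no longer matters:
print(token_overlap("Energiselskabet Buskerud", "Buskerud Energiselskabet"))  # 1.0
# Left-out words still score partial overlap:
print(token_overlap("Miller Furniture", "Millers House of Furniture"))        # 0.5
```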

What will the result be for you if you have error tolerance?

  • Cost reduction – in a call center of 100 people, saving 20 seconds per call means representatives can start serving the customer immediately instead of making the customer spell out their name.
  • Happy customers – it is annoying to always have to spell out your information to a sales representative when you want to buy something.
  • Happy workers – it is frustrating to search for a customer you know is in the system but cannot find, and it wastes valuable selling time.
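The cost-reduction point can be turned into a back-of-the-envelope calculation. The call volume, working days and hourly agent cost below are my own assumptions for illustration; only the 100 agents and the 20 seconds per call come from the text.

```python
agents = 100
seconds_saved_per_call = 20
calls_per_agent_per_day = 40   # assumption, varies by call center
working_days_per_year = 220    # assumption
hourly_cost_dkk = 250          # assumption: fully loaded agent cost

seconds_saved = (agents * calls_per_agent_per_day
                 * seconds_saved_per_call * working_days_per_year)
hours_saved = seconds_saved / 3600
annual_saving_dkk = hours_saved * hourly_cost_dkk
print(f"{hours_saved:,.0f} agent hours, roughly {annual_saving_dkk:,.0f} DKK per year")
```

Even with modest assumptions, 20 seconds per call compounds into thousands of agent hours a year.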

Introduce true error tolerance today!


Survey shows you can realize 70% more revenue based on Data Quality

Just before Christmas, an interesting survey was released by SiriusDecisions, a source for business-to-business sales and marketing best-practice research and data.

It really confirms some of my views on the importance of good data quality in the sales process.

1st confirmation:
The cost of poor data is not taken seriously by senior management

“Most b-to-b marketing executives lament the status of their databases, but have difficulty convincing senior management of the gravity of the problem,” notes Jonathan Block, SiriusDecisions senior director of research. Mr. Block continues, “The longer incorrect records remain in a database, the greater the financial impact. This point is illustrated by the 1-10-100 rule: It takes $1 to verify a record as it’s entered, $10 to cleanse and de-dupe it and $100 if nothing is done, as the ramifications of the mistakes are felt over and over again.”

I have stressed this in several of my earlier posts, where I have also described both the 1-in-10 rule and the 1-10-100 rule.

2nd confirmation:
The problem is tremendous.

“Focusing on b-to-b sales and marketing best practices, the firm has found that from 10 to 25 percent of b-to-b marketing database contacts contain critical errors — ranging from incorrect demographic data to lack of information concerning current status in the buying cycle.”

By the way, the tests we have done in the Nordics show between 5% and 35% errors in the databases, with an average of 16.7%.

3rd Confirmation:
Ongoing cleansing is more important than a one-time approach.

“Organizations must shift their focus from one-time data cleansing to ongoing data maintenance to turn the tide,” says Mr. Block. “The good news is that we’re seeing a strategic shift in approach in strong organizations, from one of data cleansing (a project with a set completion date) to data maintenance (ongoing policies and procedures to maintain data quality). The fundamental trouble with one-time data cleansing is that the day the project ends, the data is the cleanest it will be until the next round of contacts is added to the database.”

I commented on this in an earlier post.

4th Confirmation:
The upside is huge

SiriusDecisions also estimates that organizations with an early-phase data strategy can expect a roughly 25 percent uplift in conversion rates between the inquiry and marketing qualified lead stages.

Using an example of a prospect database of 100,000 names at the outset and a constant campaign response rate of two percent, a strong organization will realize nearly 70 percent more revenue than an average organization, purely based on data quality. For marketing executives who have trouble convincing senior management that a permanent process upgrade, rather than a quick fix, will pay big dividends in the long run, this is the kind of eye-opening statistic that should prove invaluable.

It is interesting to see these kinds of estimates, since making an easy ROI calculation for data quality projects is difficult.

Another Calculation Method – The 1-10-100 Method

I have previously described the 1-in-10 rule. The article “The Real Cost of Bad Data” describes how industry analysts arrived at the 1-10-100 method.

The average cost of correctly entering contact information into the master database is $1 per record. This includes data validation solutions, wages for the employee and the cost of computer equipment.

If you do the address validation and de-duplication after the submission of data, in a batch cleansing, the average cost per entry is $10.

With the commonly used laissez-faire solution (doing nothing), the cost is greater than $100 per record.

It is also stated in the article that up to 20% of the contact data is wrong when it is entered.

A quick calculation: you have 100,000 records with 5% incorrect data. Then the bad data costs you 100,000 × 0.05 × $100 = $500,000, or 2,500,000 DKK.
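The same quick calculation, as a few lines of Python. The 5-to-1 USD/DKK rate is the one implied by the post’s own figures, not a live exchange rate:

```python
records = 100_000
error_rate = 0.05
cost_per_bad_record_usd = 100  # the "do nothing" tier of the 1-10-100 rule
usd_to_dkk = 5                 # rough rate implied by $500,000 = 2,500,000 DKK

cost_usd = records * error_rate * cost_per_bad_record_usd
print(f"${cost_usd:,.0f} / {cost_usd * usd_to_dkk:,.0f} DKK")
```

Swap in your own record count and error rate to get a first estimate for your database.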

Time to end Laissez-Faire!

ROI Tip 1 – Save operating cost from month 1.

Through our tests in the Nordic region we have found that organizations and businesses have between 5 and 30 percent duplicates in their databases. What is the price of those duplicates?

In an article in DM Review, Thomas C. Redman offers this assessment of the cost of bad data:

“Consider first the cost of efforts to find and fix errors. While organizations do, from time to time, conduct massive clean-up exercises, most efforts to find and fix errors are embedded in day-in and day-out work. Over the years, we developed the Rule of Ten: If it costs $1.00 to complete a simple operation when all the data is perfect, then it costs $10.00 when it is not (i.e., late, hard to interpret, incorrect, etc.).”

In my example I will use a price of 1 DKK per record for correct data and 10 DKK per record for incorrect data, and the conservative 5% duplicate rate.
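With those figures, the extra operating cost caused by duplicates can be sketched as follows. The database size of 100,000 records is an assumption carried over from the earlier example; it is not stated in this section.

```python
records = 100_000        # assumed database size (from the earlier example)
duplicate_rate = 0.05    # the conservative figure above
cost_clean_dkk = 1       # per record when the data is correct
cost_dirty_dkk = 10      # per record under the Rule of Ten

actual_cost = (records * (1 - duplicate_rate) * cost_clean_dkk
               + records * duplicate_rate * cost_dirty_dkk)
baseline_cost = records * cost_clean_dkk   # what a duplicate-free database costs
extra_cost = actual_cost - baseline_cost
print(f"Duplicates add {extra_cost:,.0f} DKK per processing run")
```

That 45,000 DKK recurs every time the database is worked through, which is why de-duplication pays back from the first month of operation.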

Cost of poor data

The solution:
In this example I have used the rental price of Omikron AddressCenter. With rental you can deduct the whole cost as operating costs, whereas if you buy the solution it counts as an investment cost.

In our tests, Omikron AddressCenter (Data Quality Server) has proved to be the most intelligent, efficient and easy-to-use tool for finding and matching duplicates.