Setting up a Front-end Data Quality Firewall


In a project with an international vendor some years ago, I introduced the concept of splitting the Data Quality Firewall (DQF) into a Frontend and a Backend Data Quality Firewall. These terms are spreading, and I get questions about how to set up the Frontend DQF; the last one came just this week via Twitter. My focus here is not the technical side, but the usability and the reward for operatives and companies.

Why is the Frontend DQF important?

I participated in the Information Quality Conference in London, where it was stated that 76% of poor data is created in the data entry phase. Being proactive in the data entry phase, instead of reactive (sometimes, if ever) later, will take you a long way toward good, clean data.

Elements of the Frontend DQF

First identify in which systems data are created. It may be in a variety of systems like CRM, ERP, Logistics, Booking, Customer Care just to mention a few.

Error-tolerant, intelligent search in Data Entry systems

Operatives have been taught by Google and other search engines to go straight to the search box to find information. When you search in customer entry systems, you very often do not find the customer. To fix this you need error tolerance and intelligence in your search functionality, as well as a suggestion feature. This will help you find the entry despite typos, different spellings, hearing differences and sloppiness, and it will be the biggest contributor to cleaner data. A spinoff is higher employee and customer satisfaction due to more efficient work.
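To illustrate the idea, here is a minimal sketch of error-tolerant matching over an in-memory customer list. A real search product would use indexed n-grams and phonetic keys; the names and the 0.7 cutoff below are invented for the example.

```python
import difflib

def fuzzy_find(query, customers, cutoff=0.7):
    """Return customer names that closely match the query despite typos."""
    return difflib.get_close_matches(query.lower(),
                                     [c.lower() for c in customers],
                                     n=5, cutoff=cutoff)

customers = ["Jonathan Meyer", "Johanna Meier", "John Maier"]
# Query misspells both the first and the last name
matches = fuzzy_find("Jonathon Mayer", customers)
```

An exact-match lookup would return nothing here, which is exactly how duplicates get created; the fuzzy search still surfaces "jonathan meyer" as a candidate.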

If you want to learn more about error tolerance and intelligent search, read these posts:
Making the case of error tolerance in Customer Data Quality
Is Google paving the way for actual CRM/ERP Search?
Checklist for search in CRM/ERP systems.

Data Correction Registration Module

If you did not find the customer and the operatives have to enter the data, you have to make sure the data entered is accurate. You can install a module or workflows that check and correct the information.

Most CRM systems will only find the customer if you do exact searches

Check against address vendors
If you have a subscription with an address vendor, you can send the query to them, and they can supply you with the most recently updated data. You can set this up so it is easy for the operative, and the data will be correctly formatted for your systems.

This is quite easy for one country. If you are an international company, the laws and regulations differ from country to country, and the price can add up if you want local address vendors in several countries. It is important that your registration module can communicate with the local vendors, then format the data and make the entry correctly into your database(s).
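A hedged sketch of what such a routing module could look like. The vendor function and response fields are invented for illustration; every real address vendor has its own API and contract.

```python
def verify_address(country, raw_address, vendors):
    """Route the query to the local vendor for the country and
    normalize the reply into one internal format."""
    vendor = vendors.get(country)
    if vendor is None:
        # No local subscription: keep the raw entry, but mark it unverified
        return {"verified": False, "address": raw_address}
    reply = vendor(raw_address)  # vendor-specific call (assumed shape)
    return {"verified": True, "address": reply["formatted"]}

# Stand-in for one local vendor subscription (hypothetical)
def norwegian_vendor(addr):
    return {"formatted": addr.title()}

result = verify_address("NO", "storgata 1, oslo", {"NO": norwegian_vendor})
```

The point of the design is that the operative never sees the per-country differences: every reply comes back in the same internal format, or is flagged as unverified.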

Correct the Data Formats

You might choose not to subscribe to online verification by an address vendor. There are still many checks you can do in the data entry phase. You can check:

– is the domain of the e-mail valid?
– is the format of the telephone number correct?
– is the mobile number really a mobile number?
– is the salutation correct?
– is the format of the address correct?
– is the gender correct?
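The checks above can be sketched with plain pattern matching. This is a minimal example; the rules, and especially the mobile-prefix convention, are assumptions that must be adapted per country.

```python
import re

def check_entry(entry):
    """Run simple format checks; return a list of fields with problems."""
    problems = []
    # e-mail: syntactic check that a domain with a TLD is present
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[A-Za-z]{2,}", entry["email"]):
        problems.append("email")
    # telephone: digits, spaces and an optional leading +
    if not re.fullmatch(r"\+?[0-9 ]{8,15}", entry["phone"]):
        problems.append("phone")
    # mobile: assumed convention that mobile numbers start with 4 or 9
    if not re.fullmatch(r"[49][0-9]{7}", entry["mobile"]):
        problems.append("mobile")
    return problems

entry = {"email": "kari@example.com", "phone": "+47 22334455", "mobile": "91234567"}
```

Note that the e-mail rule only checks the format; whether the domain actually exists would need a DNS lookup or a verification service.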

Example of registration module

Check for scams and fraud

You can check against:

– internal black lists
– sanction lists
– “Non Real Life Subjects”
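A minimal sketch of screening against such lists held in memory. The "Non Real Life Subject" heuristic is an assumption for the example; real sanction screening uses official lists and fuzzy matching.

```python
def screen_name(name, internal_blacklist, sanction_list):
    """Return which checks the entered name triggers, if any."""
    hits = []
    key = name.strip().lower()
    if key in internal_blacklist:
        hits.append("internal blacklist")
    if key in sanction_list:
        hits.append("sanction list")
    # Crude "Non Real Life Subject" detector (dummy and test names)
    if any(word in key for word in ("test", "donald duck", "asdf")):
        hits.append("non-real-life subject")
    return hits
```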

Duplicate check

Even though duplicates should have been found in the search, you should do an additional duplicate check when the entry is done.
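One simple way to do that final check is a similarity pass over existing records. This is illustrative only; the 0.85 threshold is an assumption to tune against your own data.

```python
from difflib import SequenceMatcher

def possible_duplicates(new_name, existing_names, threshold=0.85):
    """Flag existing records whose names are suspiciously close to the new entry."""
    key = new_name.lower()
    return [name for name in existing_names
            if SequenceMatcher(None, key, name.lower()).ratio() >= threshold]
```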

If you incorporate these solutions, you should be able to ensure that the data you enter is clean and correct. It should be possible to get it all from one vendor. Then you can use the Backend DQF to handle the cleansing of deteriorating existing data.


Introducing new Thoughts and Concepts of Data Quality Firewall

Data Quality Firewall (DQF)

I have lately worked on a project including the Data Quality Firewall of an international corporation. In this process we have tried to find the best set-up of the DQF and where it should be placed.

Since there may be several definitions of a Data Quality Firewall, I use the definition in Wikipedia:

“A Data Quality Firewall is the use of software to protect a computer system from the entry of erroneous, duplicated or poor quality data. Gartner estimates that poor quality data causes failure in up to 50% of Customer relationship management systems.

Older technology required the tight integration of data quality software, whereas this can now be accomplished by loosely coupling technology in a service-oriented architecture. (SOA)”

The New Concept of the Data Quality Firewall:
The firewall will be set up as a workflow process that performs all necessary checks and interpretation, allowing only correct and accurate data to be entered into the database. The workflow will be set up as an integrated process across different systems and databases, based on SOA.

The Data Quality Firewall can be set up at different places to serve different needs. We will also introduce new concepts in the Data Quality Firewall thinking:

A. “Backend Data Quality Firewall”, the most commonly used today
B. “Frontend Data Quality Firewall”, set up in the data entry phase
C. “Double Data Quality Firewall”, which will ensure the best data quality.

The Data Quality Firewall could include processes like:

  • Creating a detailed status per record; the detailed statuses can be analyzed and summarized into a status overview.
  • Compiling all corrections into an interactive report for the supervisor.

Backend Data Quality Firewall

This is the most commonly used Data Quality Firewall in the market today. The checking is not done in the data entry phase, but when the data is transferred from temporary databases to the Master Database.

Even though faulty data is entered in the data entry phase, the Backend Data Quality Firewall will be set up to prevent the irregular data from entering the Master Database or other relevant databases. The workflows will be set up individually for each customer, to optimize the firewall according to the nature of the data from each individual system, operator, web channel and customer service.

The reason for setting up the Backend Data Quality Firewall first is to put the protection as close to the Master Data as possible.

Frontend Data Quality Firewall

As mentioned, the reason for setting up the Backend Data Quality Firewall first is to put the protection as close to the essential data as possible. The challenge with this is that you place the firewall far from where the dirty data is created: the dirty and faulty data is often created in the data entry phase. The reasons for this can be many:

  • Operatives cannot find the right customer and re-enter the data
  • In a highly commission-based business, operatives fight for their customers. Customers can be entered with a twist in the name, accidentally or on purpose. Either way, the operatives will fight for the ownership and commission.
  • Inaccurate or incomplete data can be entered in required fields, just to move on to the next customer

If the Firewall is put in the data entry phase, the amount of dirty data will be drastically reduced. The Firewall will consist of the workflows set up individually for each center/country plus FACT-Finder Address Search. The results will be:

  • The error-tolerant search in FACT-Finder ensures that operatives find the right customer instantly. This saves search time and removes the need to register the customer anew. Operatives will get higher job satisfaction and be able to handle more calls.
  • If an operative tries to enter a customer with a little twist in the name, they will get a message saying “possible duplicate found, do you wish to continue?”. From that window, the operative can jump directly to the identified duplicate and continue working with it. If the operative overrides the message, it is difficult to argue afterwards that it was by accident. This will lead to less infighting between operatives, higher job satisfaction and less double commissioning.
  • Workflows will be set up to check for incomplete and inaccurate data.
  • A monitoring service can also be set up to see if the operatives use the tools available to them. If an operative overrides a duplicate with a “secure match”, this action can be logged or sent to a data quality steward to check the quality of the matching or the quality of the operative's work.
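A minimal sketch of such an override log. The event fields and the “secure match” routing are assumptions about how one might feed a data quality steward's review queue.

```python
from datetime import datetime, timezone

audit_log = []

def log_override(operator, new_entry, duplicate, match_grade):
    """Record that an operative overrode a duplicate warning;
    overridden "secure" matches are flagged for steward review."""
    event = {
        "time": datetime.now(timezone.utc).isoformat(),
        "operator": operator,
        "entry": new_entry,
        "duplicate": duplicate,
        "grade": match_grade,
        "needs_review": match_grade == "secure",
    }
    audit_log.append(event)
    return event
```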

FACT-Finder Address Search will be implemented as an integrated part of the CRM/ERP system, whether it is self-developed or comes from outside vendors like SuperOffice, Microsoft CRM, Siebel or others.

With the Firewall set up in the data entry phase, the data sent on from the CRM/ERP system will be considerably cleaner. Setting up individual workflows will also be easier and more secure when they are optimized at the data entry centers.

A Double or Multiple Firewall(s)

In the old days you built more than one wall around the castle to protect yourself. Our idea is likewise to put up a Data Quality Firewall wherever different needs have to be addressed in different ways.

One could set up the 1st Firewall in the frontend and optimize its workflows to deal with the challenges that arise in the data entry phase. The 1st Firewall can interact with the user and is therefore a powerful solution for data quality (the human factor). Challenges addressed will be:

Correct basic errors such as:

  • Finding the right customer
  • Incomplete data
  • Data in the wrong field. Examples: first name in the last-name field, or a mobile number in the fixed-line field.
  • Right salutation and gender.
  • If a duplicate is entered in spite of the search function, the duplicate will be matched to the original record, with a notification to a supervisor that a duplicate was entered.
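Wrong-field mix-ups can be caught with simple heuristics. A sketch follows; the mobile-prefix rule and the first-name reference list are assumptions for the example.

```python
import re

KNOWN_FIRST_NAMES = {"anna", "per", "kari"}  # stand-in for a reference list

def field_warnings(record):
    """Flag values that look like they landed in the wrong field."""
    warnings = []
    # Mobile number entered in the fixed-line field? (prefix rule assumed)
    if re.fullmatch(r"[49][0-9]{7}", record.get("fixed_phone", "")):
        warnings.append("mobile number in fixed-line field")
    # Common first name sitting in the last-name field?
    if record.get("last_name", "").lower() in KNOWN_FIRST_NAMES:
        warnings.append("first name in last-name field")
    return warnings
```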

The 2nd Firewall will be set up in the backend, and its workflows will be set to deal with the challenges that come with transferring large amounts of data from the front end to the Master Database or to optimized databases for CRM, ERP or other specialized systems. It will work without interaction from users.

Focus of the 2nd Firewall:

  • Settling the “Echo Problem”
  • Building the Customer Hierarchies
  • Worldbase Matching
  • Advanced PAF cleansing and De-duping

The Double Firewall would be highly efficient and would provide the best ROI and the best results of the set-ups described here.

If you have thoughts and ideas about these concepts, please feel free to contact me!