I have recently worked on a project involving the Data Quality Firewall (DQF) of an international corporation. In the process we tried to work out the best set-up of the DQF and where it should be placed.
Since there may be several definitions of a Data Quality Firewall, I use the definition from Wikipedia:
“A Data Quality Firewall is the use of software to protect a computer system from the entry of erroneous, duplicated or poor quality data. Gartner estimates that poor quality data causes failure in up to 50% of Customer relationship management systems.
Older technology required the tight integration of data quality software, whereas this can now be accomplished by loosely coupling technology in a service-oriented architecture. (SOA)”
The New Concept of the Data Quality Firewall:
The firewall will be set as a workflow process that will do all necessary checks and interpretation to allow only the correct and accurate data to be entered into the database. The workflow will be set up as an integrated process across different systems and databases, based on SOA.
The Data Quality Firewall can be set up at different places to serve different needs. We will also introduce new concepts in the Data Quality Firewall thinking:
A. “Backend Data Quality Firewall”, the most commonly used today
B. “Frontend Data Quality Firewall”, set up in the data entry phase
C. “Double Data Quality Firewall”, which will ensure the best data quality
The Data Quality Firewall could include processes like:
A detailed status may be created per record; the detailed status may be analyzed and summarized into a status overview.
All corrections could be compiled into an interactive report for the supervisor.
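As a minimal sketch of the per-record status and the summarized overview, the following assumes records are plain dictionaries and uses two made-up rules. The field names, the rule functions and the postcode format are assumptions for illustration, not part of any specific product.

```python
from collections import Counter

# Hypothetical rules: each returns an issue label, or None if the record passes.
def check_required(record):
    required = ("first_name", "last_name", "city")
    missing = [f for f in required if not (record.get(f) or "").strip()]
    return "incomplete" if missing else None

def check_postcode(record):
    # Assumed format for the example: 4 or 5 digits.
    postcode = record.get("postcode") or ""
    return None if postcode.isdigit() and len(postcode) in (4, 5) else "bad_postcode"

RULES = [check_required, check_postcode]

def record_status(record):
    """Detailed status for one record: the list of issues found (empty = clean)."""
    issues = []
    for rule in RULES:
        label = rule(record)
        if label is not None:
            issues.append(label)
    return issues

def summarize(records):
    """Status overview across all records, e.g. as input to a supervisor report."""
    overview = Counter()
    for record in records:
        issues = record_status(record)
        overview["flagged" if issues else "clean"] += 1
        overview.update(issues)
    return dict(overview)

records = [
    {"first_name": "Anna", "last_name": "Berg", "city": "Oslo", "postcode": "0150"},
    {"first_name": "", "last_name": "Berg", "city": "Oslo", "postcode": "ABC"},
]
overview = summarize(records)
```

The detailed status stays attached to each record, while the overview only counts how often each issue occurs, which is the part a supervisor would typically look at first.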
Backend Data Quality Firewall
This is the most commonly used Data Quality Firewall in the market today. The checking is not done in the data entry phase, but when the data is transferred from temporary databases to the Master Database.
Even though faulty data is entered in the data entry phase, the Backend Data Quality Firewall will be set up to prevent irregular data from entering the Master Database or other relevant databases. The workflows will be set up individually for each customer, to optimize the firewall according to the nature of the data from each individual system, operator, web channel and customer service.
The reason for setting up the Backend Data Quality Firewall first is to put the protection as close to the Master Data as possible.
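The backend set-up described above can be sketched roughly as a gate between a staging area and the master data, with one workflow per data source. The source names, the validators and the list-based "databases" are hypothetical stand-ins chosen to keep the example self-contained.

```python
# Hypothetical per-source workflows: validator name -> function returning True if valid.
WORKFLOWS = {
    "web": {
        "has_email": lambda r: "@" in (r.get("email") or ""),
        "has_name": lambda r: bool((r.get("name") or "").strip()),
    },
    "call_center": {
        "has_name": lambda r: bool((r.get("name") or "").strip()),
        "has_phone": lambda r: bool((r.get("phone") or "").strip()),
    },
}

def backend_firewall(staged, master):
    """Move valid staged records into master; return rejected records with reasons."""
    quarantined = []
    for record in staged:
        workflow = WORKFLOWS.get(record.get("source"))
        if workflow is None:
            # Unknown source: no workflow is defined, so nothing may pass.
            quarantined.append((record, ["unknown_source"]))
            continue
        failed = [name for name, is_valid in workflow.items() if not is_valid(record)]
        if failed:
            quarantined.append((record, failed))
        else:
            master.append(record)
    return quarantined

staged = [
    {"source": "web", "name": "Anna Berg", "email": "anna@example.com"},
    {"source": "web", "name": "Per Lie", "email": ""},
    {"source": "call_center", "name": "Kari Moe", "phone": "22334455"},
]
master = []
quarantined = backend_firewall(staged, master)
```

Only records that pass their source's workflow reach the master list; everything else is held back together with the reasons, so it can be corrected rather than silently dropped.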
Frontend Data Quality Firewall
As mentioned, the reason for setting up the Backend Data Quality Firewall first is to put the protection as close to the essential data as possible. The challenge is that this puts the firewall far away from where the dirty data is created. Dirty and faulty data is often created in the data entry phase, and there can be many reasons for this:
- Operatives cannot find the right customer and re-enter it
- In a high-commission business, operatives fight for their customers. Customers can be entered with a twist in the name, accidentally or on purpose. Either way, the operatives will fight for ownership and commission.
- Inaccurate or incomplete data can be entered in required fields, just to move on to the next customer
If the Firewall is put in the data entry phase, the amount of dirty data will be drastically reduced. The Firewall will consist of workflows set up individually for each center/country, plus FACT-Finder Address Search. The results will be:
- The error-tolerant search in FACT-Finder ensures that operatives find the right customer instantly. This saves time in the search, and no time is spent registering the customer anew. Operatives will get higher job satisfaction and be able to handle more calls.
- If an operative tries to enter a customer with a little twist in the name, they will get a message saying “possible duplicate found, do you wish to continue”. From that window, the operative may jump directly to the duplicate that was found and continue working with it. If the operative overrides this message, it can be difficult to argue that it was by accident. This will lead to less infighting between operatives, higher job satisfaction and less double commissioning.
- Workflows will be set up to check the incomplete and inaccurate data.
- A monitoring service can also be set up to see whether the operatives use the tools available to them. If an operative overrides a duplicate with a “secure match”, this action could be logged or sent to a data quality steward to check the quality of the matching or the quality of the operative's work.
FACT-Finder Address Search will be implemented as an integrated part of the CRM/ERP system, whether that system is developed in-house or comes from an outside vendor such as Salesforce.com, SuperOffice, Microsoft CRM, Siebel or others.
With the Firewall set up in the data entry phase, the data sent from the CRM/ERP system will be considerably cleaner. Setting up individual workflows will be easier and more secure if they are optimized in the data entry centers.
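The duplicate prompt and the override logging could look roughly like the sketch below. It does not use FACT-Finder; Python's standard `difflib.SequenceMatcher` stands in for the error-tolerant search, and the function names, the 0.85 threshold and the log format are assumptions made for the example.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Rough string similarity between 0 and 1 (a stand-in for a real fuzzy search)."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def find_possible_duplicates(name, existing_names, threshold=0.85):
    """Return existing names that resemble the new one, best match first."""
    scored = [(similarity(name, other), other) for other in existing_names]
    return [other for score, other in sorted(scored, reverse=True) if score >= threshold]

def enter_customer(name, existing_names, operator, override=False, audit_log=None):
    """Create a customer unless a likely duplicate exists and was not overridden."""
    matches = find_possible_duplicates(name, existing_names)
    if matches and not override:
        # Corresponds to the "possible duplicate found" message shown to the operative.
        return {"status": "possible_duplicate", "matches": matches}
    if matches and audit_log is not None:
        # The operative overrode the warning: record it for the data quality steward.
        audit_log.append({"operator": operator, "entered": name, "overrode": matches[0]})
    existing_names.append(name)
    return {"status": "created"}

customers = ["Jon Smith", "Maria Garcia"]
log = []
first_try = enter_customer("John Smith", customers, operator="op-17")
forced = enter_customer("John Smith", customers, operator="op-17", override=True, audit_log=log)
```

Because the warning has to be overridden explicitly, the log entry shows who chose to create the duplicate, which is what makes an "accident" hard to argue afterwards.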
A Double or Multiple Firewall(s)
In the old days you set up more than one wall around the castle to protect yourself. Our idea is to put down Data Quality Firewalls wherever different needs have to be addressed in different ways.
One could set up the 1st Firewall in the frontend and optimize its workflows to deal with the challenges that come in the data entry phase. The 1st Firewall can interact with the user and is therefore a powerful solution for data quality (the human factor). Challenges addressed will be:
Correcting basic errors such as:
- Finding the right customer
- Incomplete data
- Data in wrong field – Examples: first name in the last name field and/or mobile number in the fixed-line field.
- Correct salutation and gender.
- If a duplicate is entered in spite of the search function, the duplicate will be matched to the original record, and a supervisor will be notified that a duplicate was entered.
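One of the basic errors listed above, a mobile number typed into the fixed-line field, can be caught with a simple rule. The sketch below is illustrative only: the field names are made up, and the prefix list is an assumption that would have to be replaced by the numbering rules of each country.

```python
# Assumption for illustration only: mobile numbers start with one of these digits.
MOBILE_PREFIXES = ("4", "9")

def looks_mobile(number):
    """True if the digits of the number start with an assumed mobile prefix."""
    digits = "".join(ch for ch in number if ch.isdigit())
    return digits.startswith(MOBILE_PREFIXES)

def fix_phone_fields(record):
    """Return a corrected copy of the record and a note describing the change, if any."""
    fixed = dict(record)
    fixed_line = fixed.get("fixed_line") or ""
    mobile = fixed.get("mobile") or ""
    # Only move the number when the mobile field is empty, so nothing is overwritten.
    if fixed_line and not mobile and looks_mobile(fixed_line):
        fixed["mobile"], fixed["fixed_line"] = fixed_line, ""
        return fixed, "moved mobile number from fixed_line to mobile"
    return fixed, None

corrected, note = fix_phone_fields({"name": "Anna Berg", "fixed_line": "912 34 567", "mobile": ""})
untouched, no_note = fix_phone_fields({"name": "Per Lie", "fixed_line": "22 33 44 55", "mobile": ""})
```

In a frontend firewall the note could be shown to the operative for confirmation instead of being applied silently, since this firewall is the one that can interact with the user.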
The 2nd Firewall will be set up in the backend, and its workflow will be set to deal with the challenges that come with transferring large amounts of data from the frontend to the Master Database, or to databases optimized for CRM, ERP or other specialized systems. It will work without user interaction.
Focus of the 2nd Firewall:
- Settling the “Echo Problem”
- Building the Customer Hierarchies
- Worldbase Matching
- Advanced PAF cleansing and De-duping
Of the three set-ups, the Double Firewall would be highly efficient and would provide the best ROI and results.
If you have thoughts and ideas about these concepts, please feel free to contact me!