Using RPA for Data Cleansing in Application Modernisation


Data cleansing may sound like a simple task, but in reality it can be one of the most arduous activities in a system migration or ERP transformation. This is especially true when you’re migrating from a legacy system that has relied on a high degree of manual data input over many years, where the overall data quality is likely to be low.

Automation technology can go a long way towards streamlining this process. In particular, Robotic Process Automation (RPA) is a great tool for fast-tracking data cleansing during system modernisation. In this article I will highlight the key benefits and considerations of using RPA for data cleansing and provide an example Process Design Document (PDD) from a previous project.

Data Cleansing for Migration

One of the benefits of a system migration is that you are compelled to look at the quality of your business’ data before transferring it between systems. One of the challenges is how to address poor-quality historical data. If you transfer this data as is into the new system, you risk undermining its success and reliability. If you perform a data cleansing exercise that you haven’t planned for adequately, you may impact the implementation timeline.

Data cleansing tasks such as removing duplicates, correcting misspelt fields or replacing missing entries can be resource intensive and time consuming if carried out manually. Automating these tasks with tools such as RPA can improve the accuracy and efficiency of your data cleansing activity.
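To make these three tasks concrete, here is a minimal Python sketch of the kind of logic a cleansing bot might apply. It is an illustration only, not taken from the example PDD: the field names (`name`, `country`), the correction table and the `Unknown` placeholder are all assumptions.

```python
from typing import Optional

# Hypothetical correction table for a "country" field; a real project would
# derive this from the agreed data standards.
COUNTRY_CORRECTIONS = {"austrlia": "Australia", "new zeland": "New Zealand"}
DEFAULT_COUNTRY = "Unknown"  # assumed placeholder for missing entries


def clean_country(value: Optional[str]) -> str:
    """Trim the value, fix known misspellings, and fill missing entries."""
    if value is None or not value.strip():
        return DEFAULT_COUNTRY
    trimmed = value.strip()
    return COUNTRY_CORRECTIONS.get(trimmed.lower(), trimmed)


def cleanse(records: list[dict]) -> list[dict]:
    """Return cleaned records with duplicates removed, keeping the first occurrence."""
    seen = set()
    cleaned = []
    for record in records:
        row = {
            "name": (record.get("name") or "").strip(),
            "country": clean_country(record.get("country")),
        }
        # Compare case-insensitively so "Acme Ltd" and "acme ltd" count as one.
        key = (row["name"].lower(), row["country"].lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned
```

Keeping the corrections in a lookup table rather than in the logic means the business can review and extend them without touching the bot itself.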

Benefits of using RPA for Data Cleansing

RPA technology has a number of functions and features that make it a good tool for data cleansing. Firstly, multiple bots can be deployed to cleanse and validate continuously, accessing various systems, interfaces and data formats at once with high granularity. 

Secondly, RPA’s highly organised, rules-based actions can highlight poor-quality data that needs attention and rectify it. For the small percentage of errors the robot can’t resolve, you can use a human-in-the-loop pattern, in which the robot sends any irregular values to a person, who fixes them manually and sends them back.
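The human-in-the-loop pattern can be sketched roughly as follows. This is a simplified Python illustration, not how any particular RPA product implements it: the two rules, the field names and the shape of the review queue are assumptions made for the example.

```python
import re
from typing import Callable

# Hypothetical rules: each maps a field name to a check returning True when valid.
RULES: dict[str, Callable[[str], bool]] = {
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "postcode": lambda v: re.fullmatch(r"\d{4}", v) is not None,
}


def triage(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split records into those passing every rule and those needing human review."""
    passed, review_queue = [], []
    for record in records:
        failed = [
            field
            for field, is_valid in RULES.items()
            if not is_valid(str(record.get(field, "")))
        ]
        if failed:
            # Record which fields failed so the reviewer knows what to fix.
            review_queue.append({"record": record, "failed_fields": failed})
        else:
            passed.append(record)
    return passed, review_queue
```

Records in the review queue would be corrected by a person and then passed through `triage` again, so the same rules confirm the fix.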

Lastly, a rules matrix, which is common in RPA implementations to keep the solution easy to maintain, is a good way to highlight the areas or activities that are causing data quality problems.
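As a rough illustration of how a rules matrix can point to problem areas, the Python sketch below stores the rules as data and counts failures per source. The rule IDs, the check names and the `source` field on each record are hypothetical; a real matrix would come from the PDD.

```python
from collections import Counter

# Hypothetical rules matrix: one row per rule, kept as data so it can be
# maintained without changing the bot's logic.
RULES_MATRIX = [
    {"rule_id": "R1", "field": "phone", "check": "not_empty"},
    {"rule_id": "R2", "field": "email", "check": "contains_at"},
]

# Named checks referenced by the matrix; each returns True when the value is valid.
CHECKS = {
    "not_empty": lambda v: bool(v and v.strip()),
    "contains_at": lambda v: bool(v) and "@" in v,
}


def failures_by_source(records: list[dict]) -> Counter:
    """Count rule failures per (source, rule_id) to show where bad data originates."""
    counts = Counter()
    for record in records:
        for rule in RULES_MATRIX:
            value = record.get(rule["field"])
            if not CHECKS[rule["check"]](value):
                counts[(record.get("source", "unknown"), rule["rule_id"])] += 1
    return counts
```

Calling `failures_by_source(records).most_common()` lists the source and rule combinations with the most failures first, which is one way to see where data entry practices need attention.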

Considerations when using RPA for Data Cleansing

Prior to an RPA-led data cleanse, your team must create a highly detailed Process Design Document (PDD), which is the foundation of all RPA builds. It’s imperative to know the rules and standards first; you can then build your rules matrix and design your RPA processes around it. The PDD can also be handed over to whoever configures and maintains the RPA model, who will need to make sure the robot has been granted all the necessary permissions.

When using RPA for migration, the robot needs to be able to read a value in one system and find the identical corresponding field in another. As part of cleansing, therefore, all data formats must be brought to a single standard. For example, if two systems store time in 12-hour and 24-hour formats respectively, the robot would not be able to match 3:00pm with 15:00.
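A minimal Python sketch of this kind of normalisation is shown below. It assumes 24-hour `HH:MM` is the agreed target standard and that the source systems only use the listed formats; both are assumptions for illustration. Parsing of "AM" and "PM" with `%p` also depends on the machine's locale, so an English locale is assumed.

```python
from datetime import datetime

# Assumed input formats; extend this tuple to match the source systems.
KNOWN_FORMATS = ("%I:%M%p", "%I:%M %p", "%H:%M")


def to_24_hour(value: str) -> str:
    """Convert a time string in any known format to 24-hour HH:MM."""
    text = value.strip().upper()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%H:%M")
        except ValueError:
            continue  # try the next known format
    # Unknown formats are raised rather than guessed, so they can be reviewed.
    raise ValueError(f"Unrecognised time format: {value!r}")
```

Once both systems' values pass through `to_24_hour`, 3:00pm and 15:00 normalise to the same string and can be matched directly.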

Data availability also needs to be considered. When targeting a database with RPA, both you and the robot will need access to certain data. Sometimes access has to be requested through IT, which can mean delays while permissions are granted. Knowing upfront which areas you will need access to will minimise any lag time spent waiting.

Example PDD for Data Cleansing

Download an example Process Design Document (PDD) from a previous project here.

Author Details

Aaron Karlsen
Aaron is a software developer with over 10 years’ experience in the IT industry. Aaron has a strong technical background, with particular expertise around Automation, Robotic Process Automation (RPA), Intelligent Document Recognition (IDR, OCR, ICR), and Business Process Automation (BPA).
