February 17, 2021

Fraudulent financial transactions are a major source of revenue leakage for organizations. Using artificial intelligence (AI), anomalous and fraudulent transactions made with credit cards or through online portals can be identified in real time. The backbone of this solution is data science.

For example, most banking giants rely on AI to catch anomalous transactions and prevent fraudulent activity. With the help of big data and machine learning, building AI models to handle real-life problems has become far more viable.

Anomaly (or outlier) detection deals with identifying rare items or transactions that look suspicious because they differ significantly from most other transactions. Rare events in daily activity can be treated as anomalies, and they are often interesting from a business perspective.

Outlier detection for employee claims in a human resource management system (HRMS) is another classic use case for AI. In human capital management (HCM) applications, employees raise claims for valid business expenses such as business travel, mobile bills, food bills and so on. Sometimes, however, users intentionally or unintentionally raise a claim that does not follow their own previous claim pattern or the claim patterns of their peers. In this blog, we will discuss how to identify these unknown patterns.

Anomaly detection approaches

Anomalous transactions can be detected using the following two approaches:

In machine learning (ML) terms, the first approach is called a supervised technique, where the data is annotated to distinguish fraudulent from normal transactions. A simple classification-based model can be trained, and new transactions are then passed through the trained model to classify them as fraudulent or non-fraudulent.

The second is the unsupervised approach, which is more challenging because we do not know which transactions are normal and which are abnormal. In this approach, the model mines the data and separates it into two clusters, one for non-fraudulent and another for fraudulent transactions.

Here, we will focus on how to identify anomalous claims raised by employees with unsupervised deep learning, using the autoencoder technique. Deep learning strives to learn from huge, diverse data and to solve complex problems intuitively, the way the human brain does.

Linear algebra-based Principal Component Analysis (PCA) is a powerful unsupervised technique for reducing high-dimensional data to lower dimensions and detecting outliers. But why do we need a deep learning-based autoencoder? An autoencoder can perform non-linear transformations of the data using multiple layers of weights and non-linear activation functions. This makes it better suited to extracting complex abnormal patterns from the data, while remaining simple and robust.
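To make the PCA baseline concrete, here is a minimal sketch of PCA-based outlier scoring with NumPy. It is not taken from the system described in this blog: the function name `pca_reconstruction_error`, the two synthetic features and the sample records are all assumptions made for illustration. The idea is to project each record onto the top principal component learned from normal data and measure how far the record sits from that projection.

```python
import numpy as np

def pca_reconstruction_error(X_train, X_new, n_components=1):
    """Squared distance between each new row and its projection onto the
    top principal components fitted on X_train (hypothetical helper)."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    components = Vt[:n_components]
    centered = X_new - mean
    reconstructed = centered @ components.T @ components
    return ((centered - reconstructed) ** 2).sum(axis=1)

# Synthetic "normal" records (assumed data): feature 2 is roughly twice feature 1.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, size=200)
X_train = np.column_stack([t, 2.0 * t]) + rng.normal(0.0, 0.01, size=(200, 2))

X_new = np.array([[0.5, 1.0],   # follows the usual linear pattern
                  [0.5, 0.0]])  # breaks the pattern
pca_errors = pca_reconstruction_error(X_train, X_new)
print(pca_errors)  # the second record should score far higher than the first
```

Because the projection is linear, this kind of scoring only captures straight-line relationships between features, which is the limitation the autoencoder is meant to address.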

Autoencoder architecture

An autoencoder is composed of an encoder and a decoder network. The encoder network encodes the data into a code with fewer dimensions than the input, and the decoder reconstructs the data from that code. If the reconstructed output closely matches the input, the record is treated as normal; if there is a noticeable mismatch (a high reconstruction error), the record may be an anomaly.
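The encoder, code and decoder can be sketched in a few lines of NumPy. This is only an illustrative toy, assuming a single tanh encoder layer, a linear decoder, plain gradient descent and four synthetic, already-scaled features; the function names, layer sizes and hyperparameters are hypothetical and are not the network described in this blog. A production model would more likely be built with a deep learning framework and several layers.

```python
import numpy as np

def train_autoencoder(X, code_dim=1, epochs=2000, lr=0.05, seed=0):
    """Fit a tiny autoencoder (tanh encoder, linear decoder) by gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.1, size=(d, code_dim))
    b_enc = np.zeros(code_dim)
    W_dec = rng.normal(0.0, 0.1, size=(code_dim, d))
    b_dec = np.zeros(d)
    for _ in range(epochs):
        code = np.tanh(X @ W_enc + b_enc)   # encoder: input -> low-dimensional code
        X_hat = code @ W_dec + b_dec        # decoder: code -> reconstruction
        # Gradient of the squared reconstruction error, averaged over rows.
        grad_out = 2.0 * (X_hat - X) / n
        grad_code = (grad_out @ W_dec.T) * (1.0 - code ** 2)
        W_dec -= lr * (code.T @ grad_out)
        b_dec -= lr * grad_out.sum(axis=0)
        W_enc -= lr * (X.T @ grad_code)
        b_enc -= lr * grad_code.sum(axis=0)
    return W_enc, b_enc, W_dec, b_dec

def reconstruction_error(X, params):
    """Mean squared difference between each row and its reconstruction."""
    W_enc, b_enc, W_dec, b_dec = params
    X_hat = np.tanh(X @ W_enc + b_enc) @ W_dec + b_dec
    return ((X_hat - X) ** 2).mean(axis=1)

# Synthetic "normal" records (assumed data): four features that move together.
rng = np.random.default_rng(1)
t = rng.uniform(-1.0, 1.0, size=(300, 1))
pattern = np.array([[1.0, 0.8, -0.6, 0.5]])
X_normal = t * pattern + rng.normal(0.0, 0.05, size=(300, 4))

params = train_autoencoder(X_normal)
normal_errors = reconstruction_error(X_normal, params)

# A record whose features contradict the learned relationship.
anomaly = np.array([[1.0, -0.8, 0.6, 0.5]])
anomaly_error = reconstruction_error(anomaly, params)[0]
print(normal_errors.max(), anomaly_error)
```

The network only ever sees normal-looking records, so it learns to compress and rebuild those well. A record that breaks the usual relationship between features cannot be rebuilt from the one-dimensional code, and its reconstruction error stands out.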

Autoencoder in claim outlier (anomaly) detection

Now, coming back to our use case, claim outlier detection. It is very important to process employee claims on a timely basis so that claim payments are not delayed. On the other hand, organizations must keep track of the claim approval process and validate claims before payout to prevent possible fraud.

The autoencoder technique is extremely useful for identifying employee claims that do not follow the historical claims pattern. The autoencoder network is designed and trained on historical claim records. More training data lets the network learn a greater number of patterns. Next, the trained model is tested against employee claim data that has already been processed as paid out, returned or rejected. The accuracy of the model is determined by checking what percentage of the claims marked as anomalous by the model were actually rejected or returned.
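The accuracy check described above corresponds to what ML practitioners usually call precision. A minimal sketch of the calculation follows, assuming the model output is a list of booleans and the historical outcome of each claim is one of the strings "paid", "returned" or "rejected". The function name and the sample data are hypothetical.

```python
def flagged_precision(flagged, outcomes):
    """Share of model-flagged claims whose historical outcome was
    'rejected' or 'returned' (hypothetical helper)."""
    flagged_outcomes = [o for f, o in zip(flagged, outcomes) if f]
    if not flagged_outcomes:
        return 0.0  # nothing was flagged, so there is nothing to score
    hits = sum(o in {"rejected", "returned"} for o in flagged_outcomes)
    return hits / len(flagged_outcomes)

# Made-up example: model flags versus how each claim was actually processed.
flagged = [True, False, True, True, False, False]
outcomes = ["rejected", "paid", "paid", "returned", "paid", "rejected"]

precision = flagged_precision(flagged, outcomes)
print(f"{precision:.2f}")  # 2 of the 3 flagged claims were rejected or returned
```

Note that this measure says nothing about rejected or returned claims the model failed to flag (the last claim in the example), so it may be worth tracking that share as well.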

The network is then fine-tuned over multiple iterations by changing the input features or the number of network layers. Once a satisfactory accuracy level is achieved, the model is deployed in the system, wrapped in an API function and integrated with the HRMS through a web service.


Autoencoder-based anomaly detection can be used to raise a flag in the HRMS in real time. Once the AI-based claim anomaly detection system is mature enough in terms of reliability and performance, it could replace the legacy manual claim processing system. The flagged transactions can be parked for detailed scrutiny while the rest are processed straight through. This will help fast-track claims and help organizations formulate new claim policies. This deep learning-based approach can minimize human intervention and bias and improve payout timing, enabling a next-generation claim approval process.
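The park-or-process decision can be sketched as a simple threshold on the reconstruction error. The blog does not say how the cut-off is chosen, so the percentile rule below is an assumption, and the function names, claim IDs and error values are all made up for illustration.

```python
import math

def percentile_threshold(history, pct=90):
    """Error value at the given percentile of historical reconstruction
    errors (nearest-rank method; an assumed way to pick the cut-off)."""
    ordered = sorted(history)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def route_claims(errors, threshold):
    """Split claims into those parked for scrutiny and those processed straight through."""
    flagged, straight_through = [], []
    for claim_id, err in errors.items():
        (flagged if err > threshold else straight_through).append(claim_id)
    return flagged, straight_through

# Made-up reconstruction errors from previously processed claims.
history = [0.02, 0.03, 0.01, 0.04, 0.02, 0.05, 0.03, 0.02, 0.06, 0.30]
threshold = percentile_threshold(history, pct=90)

# Made-up errors for newly raised claims, keyed by a hypothetical claim ID.
new_claims = {"C-101": 0.02, "C-102": 0.41, "C-103": 0.05, "C-104": 0.09}
flagged, straight_through = route_claims(new_claims, threshold)
print(flagged, straight_through)
```

A lower percentile parks more claims for manual review, while a higher one lets more through automatically, so the cut-off is a business decision as much as a technical one.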

Satadru Kundu is a Senior Developer with nine years of experience in the Innovation and Product Engineering (IPE) - Analytics group with the Platform Solutions unit at TCS. His areas of expertise include solution design, development and implementation of various AI and machine learning use cases for the home-grown products CHROMA™ and TAP™. He holds a bachelor’s degree in Electronics and Communication Engineering from Netaji Subhash Engineering College, Kolkata, India.