AI is considered a game changer in today’s digital world, and many enterprises are rushing to harness it and reap the benefits. But most of them fail to realize that AI has its own shortcomings. The most common one is algorithmic bias, where the outcome or decision of an AI model favors or discriminates against a specific class based on a few select attributes that vary by industry.
Causes of bias
An AI model has no inherent bias of its own; the real culprit is the data used to train it. Bias can originate when the training data is incomplete or does not evenly represent the population. This usually happens when a model trained on data from one region, class, or timeframe is used to predict outcomes or derive insights for another. Bias is also perpetuated when the training data is contaminated with historical bias.
Typical examples in the insurance industry include an insurer using an underwriting model developed in one state to generate insights for another, using historical crime data to detect current fraud, or using past weather data to predict future climate risk. Each of these can lead to unfair premiums or inaccurate risk prediction.
Bias must be detected before it can be addressed. Normally, one compares model outputs across different groups to check for anomalies that indicate bias. Explainable artificial intelligence (XAI) is another means of understanding whether any protected attributes are influencing the outcomes. Refer to my article on XAI for more details. When there appears to be a trade-off between the two, one should weigh the business need and the social cost before prioritizing bias mitigation over accuracy.
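To make the group comparison concrete, here is a minimal Python sketch. The function names, the group labels, and the approval data are all hypothetical, chosen only for illustration; it simply computes the share of favorable outcomes per group and the ratio between the lowest and highest share.

```python
from collections import defaultdict


def selection_rates(groups, outcomes):
    """Return the share of favorable outcomes (1) for each group."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, outcome in zip(groups, outcomes):
        totals[group] += 1
        favorable[group] += outcome
    return {g: favorable[g] / totals[g] for g in totals}


def disparate_impact_ratio(rates):
    """Lowest group rate divided by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical model decisions (1 = approved) for two made-up groups.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
approved = [1, 1, 1, 0, 1, 0, 0, 0]

rates = selection_rates(groups, approved)   # A: 0.75, B: 0.25
ratio = disparate_impact_ratio(rates)       # about 0.33
print(rates, round(ratio, 2))
```

A ratio well below 1.0, as in this made-up data, does not prove bias on its own, but it is the kind of anomaly that warrants a closer look.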
AI is used to estimate mortality, or the likelihood of an insured getting sick, to determine life insurance premiums. The model can exhibit bias against some individuals based on attributes like age, gender, location, name, and country. Similarly, AI is used to estimate the damage in an auto claim from photographs. A model trained on very few makes and models of vehicles cannot fairly estimate the claim amount for all vehicles.
In the above scenarios, the model needs to be trained with the right data to eliminate bias. One way is to build a balanced training dataset, or at least to reduce the imbalance in the available dataset. Researchers at Yale have found a means to audit a given dataset and determine the extent of its imbalance or skewness with respect to such attributes. This indicates whether an individual could gain an undue advantage because of their age, gender, or location. They also recommended a novel data pre-processing step to reduce the imbalance in the training data and thereby mitigate bias.
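The Yale method itself is not detailed here, so the following Python sketch swaps in a much simpler, widely used stand-in: measuring each group's share of the dataset and assigning inverse-frequency weights so that every group carries equal total weight during training. The function names and the region data are hypothetical.

```python
from collections import Counter


def group_shares(values):
    """Audit step: fraction of records that belong to each group."""
    counts = Counter(values)
    total = len(values)
    return {group: count / total for group, count in counts.items()}


def balancing_weights(values):
    """Pre-processing step: one weight per record, chosen so that
    every group contributes the same total weight."""
    counts = Counter(values)
    n_groups = len(counts)
    total = len(values)
    return [total / (n_groups * counts[v]) for v in values]


# Hypothetical training records, heavily skewed toward one region.
regions = ["north"] * 6 + ["south"] * 2

shares = group_shares(regions)        # north: 0.75, south: 0.25
weights = balancing_weights(regions)  # south records weigh more than north

for region in shares:
    total_weight = sum(w for w, r in zip(weights, regions) if r == region)
    print(region, shares[region], round(total_weight, 2))
```

Many training libraries accept per-record weights of this kind. Reweighting reduces the effect of imbalance but cannot add information about groups that are barely present in the data.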
Personalization or recommendation models
Organizations use these models to recommend products and services to users based on their profiles. Agents, however, tend to recommend products that offer them higher incentives. A model trained on those past recommendations would learn the same preference, so using that data would be unfair to customers.
Organizations must ensure that recommendation engines don’t exhibit bias in the choice of products for a given individual. Researchers at Yale have come up with a framework that regulates what the model can recommend based on constraints imposed by the user. Instead of giving the model complete freedom, the user states their needs, forcing the model to make a recommendation bound by those constraints. This mitigates potential bias and increases personalization, leading to better revenues.
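The general idea of constraint-bound recommendation can be sketched in a few lines of Python. This is not the Yale framework, only an illustration under stated assumptions: the Product fields, the relevance scores, and the catalog are invented, and the constraints are limited to a premium ceiling and a coverage floor supplied by the user.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    annual_premium: float
    coverage: float
    relevance: float  # hypothetical score produced by a recommendation model


def recommend(products, max_premium, min_coverage, top_k=2):
    """Keep only products that satisfy the user's constraints,
    then rank the remainder by the model's relevance score."""
    eligible = [
        p for p in products
        if p.annual_premium <= max_premium and p.coverage >= min_coverage
    ]
    return sorted(eligible, key=lambda p: p.relevance, reverse=True)[:top_k]


# Hypothetical catalog; the highest-scoring product is also the costliest.
catalog = [
    Product("Basic Term", 300, 100_000, 0.60),
    Product("Premium Whole Life", 1200, 250_000, 0.95),
    Product("Standard Term", 500, 200_000, 0.80),
    Product("Budget Term", 250, 50_000, 0.70),
]

picks = recommend(catalog, max_premium=600, min_coverage=100_000)
print([p.name for p in picks])
```

Because the filter runs before the ranking, a product the model happens to favor can never be recommended if it violates what the user asked for.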
Classification models are commonly used in insurance. The auto business uses AI and telematics to determine whether an insured is a safe driver. AI also classifies financial transactions and claims as fraudulent or legitimate. The model can be biased against certain individuals and can unfairly judge them as criminals or ineligible based on a few sensitive attributes, like their race, color, location, gender, and age.
Fairness constraints are a means to mitigate this bias. They ensure that individuals who differ in the sensitive attributes stated above have equal probabilities of receiving a given classification based on the remaining attributes. This ensures that the model is not influenced by sensitive attributes while identifying criminals or approving financial transactions.
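One simple way to picture an equal-probability constraint is as a post-processing step that flags the same share of every group. The Python sketch below assumes a model that outputs a risk score per individual; the function names, scores, and group labels are hypothetical, and constraints applied during training would look different.

```python
def group_thresholds(scores, groups, target_rate):
    """Pick a score cutoff per group so that roughly target_rate of
    each group is flagged (tied scores may flag slightly more)."""
    by_group = {}
    for score, group in zip(scores, groups):
        by_group.setdefault(group, []).append(score)
    thresholds = {}
    for group, values in by_group.items():
        ranked = sorted(values, reverse=True)
        k = max(1, round(target_rate * len(ranked)))
        thresholds[group] = ranked[k - 1]
    return thresholds


def classify(scores, groups, thresholds):
    """Flag a record (1) when its score reaches its group's cutoff."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


# Hypothetical risk scores; group A tends to receive higher scores.
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.5, 0.2, 0.1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

# A single cutoff of 0.55 flags two people in A but only one in B.
single = [int(s >= 0.55) for s in scores]

# Group-specific cutoffs flag half of each group.
cutoffs = group_thresholds(scores, groups, target_rate=0.5)
flags = classify(scores, groups, cutoffs)
print(single, cutoffs, flags)
```

Equalizing flag rates this way usually costs some accuracy, which is the trade-off between bias mitigation and accuracy that has to be weighed against the business need and social cost.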
Bias is a major drawback of AI, and it originates from training data. Business models and their underlying data constantly evolve, making it impossible to build a perfect training dataset. Balanced training datasets and fairness constraints could potentially mitigate bias. The finance industry should be aware of bias while deploying AI in its business operations, and should employ means to detect and correct bias and to prevent harm to its customers.