HIGHLIGHTS

  • The human mind is not yet comfortable trusting systems that decide without letting us into the logical reasoning behind those decisions.
  • Trustworthy AI must be explainable, unbiased, transparent, reproducible, and sustainable.
  • An AI model must be able to explain the right thing to the right person in the right way, and at the right time.
  • Here’s an example of explainable AI implemented in an insurance premium calculating model.


 

Are AI decision-making models trustworthy?

Artificial Intelligence (AI) systems are increasingly being entrusted with making critical decisions. Many of these decisions have a considerable impact not only on businesses, but on individuals as well. The evolution of deep learning has resulted in a significant increase in the accuracy of these decisions.

However, the human mind is not yet comfortable trusting systems that decide without letting us into the logical reasoning behind such decisions. Similarly, there are concerns regarding fairness and bias in AI systems, stemming from the data on which the models are trained or from the modeling process itself. This lack of insight into the decision-making process, along with doubts about whether decisions are fair and unbiased, is among the main hindrances to accepting AI at scale.

 

Pillars of trustworthy AI

The trustworthiness of an AI model rests on the attributes that strengthen trust and ethical conduct in these systems.


Humans can be reluctant to trust AI-based decisions unless the models can explain the decision-making process; make unbiased decisions; reproduce the decision; be transparent about their performance under different conditions; and be optimized to reduce the carbon footprint.

 

Explainability:

Explainability for AI systems has taken center stage in policy debates across research, business forums, and regulatory bodies. In all these discussions, explanations are expected to deliver explainability, interpretability, transparency, and contestability.

The most prevalent forms of explanation in the industry today are feature importance and saliency maps. Various techniques are available to generate feature importance or relevance, either for a specific decision or for the global behavior of the model.

Likewise, the ability to provide counterfactual explanations to end users or individuals, showing how an unfavorable outcome can be converted to a favorable one, is crucial for trustworthy AI systems.

Bias and fairness:

Biases can creep into modeling processes at various stages and in various forms. The training data used might have inherent bias due to historical reasons; or bias can get into modeling if data sampling is not uniform across different classes or does not represent different groups fairly.

Reproducibility:

The ability to reproduce the entire AI model development cycle instills trust in the overall AI solution and encourages AI adoption. Documenting steps such as data processing, model training and tuning, and model testing and validation allows us to replicate the model development. It also helps lower risks and trace and reproduce bugs.

Transparency:

Publishing details of the models in the decision system, information about how they work together to make the final prediction, and insights into how the model performs under different conditions is vital. Information about the expected behavior of the model, such as the variability of outcomes under different conditions, is also critical for end users. This aspect of transparency adds credibility to AI deployment.

Sustainability:

Developing AI solutions could lead to a large carbon footprint due to the processing of large volumes of data, use of large compute instances, and the energy needed to cool such data centers. There is a need to optimally use resources, monitor resource consumption, and optimize AI solutions to reduce the carbon footprint and be sustainable.

 

Lack of understanding and ways to bridge the gap 

Most explanations for AI model predictions take the form of numeric values, force plots and graphs, or saliency or heat maps. These are understood only by data scientists and mostly remain opaque to end users. The result is a lack of understanding and an inability to act on AI decisions, which increases reluctance to consume AI outcomes.

Communicating the explanations:

It is vital to communicate the explainer outputs to end users, business stakeholders, and regulators in simple form and language. An explanation in human perceivable form can go a long way in making AI decisions far more consumable.

Here’s an example: Consider an AI system that predicts medical insurance premiums. A chat interface to this system lets users interact in natural language and also delivers explanations in human perceivable form. A user can ask about their premium charges, and the AI model predicts the outcome and communicates it in the chat. The user can then contest a high medical premium and demand an explanation. The force plot explanation produced by a known explainer is then translated and relayed back to the user in natural language.

For instance, a message such as this would be displayed: “The premium charges are higher because your age (51 years) and weight (91 kg) are on the higher side, and you had one major surgery in the recent past.”
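
The translation step can be sketched roughly as follows. This is a minimal illustration, not the system described in the article: the function name `explain_premium`, the contribution values, and the phrase templates are all hypothetical, and the contributions are assumed to come from an explainer that reports how much each feature pushed the premium up or down.

```python
def explain_premium(contributions, descriptions, top_k=3):
    """Turn per-feature contributions into a one-sentence explanation.

    contributions: feature name -> signed contribution to the premium
                   (hypothetical explainer output, positive = raises premium)
    descriptions:  feature name -> ready-made phrase for that feature
    """
    # Keep only the features that pushed the premium up, largest first.
    raising = [(name, value) for name, value in contributions.items() if value > 0]
    raising.sort(key=lambda item: item[1], reverse=True)
    reasons = [descriptions[name] for name, _ in raising[:top_k]]

    if not reasons:
        return "No factor raised your premium above the baseline."
    if len(reasons) == 1:
        joined = reasons[0]
    else:
        joined = ", ".join(reasons[:-1]) + " and " + reasons[-1]
    return f"The premium charges are higher because {joined}."


# Hypothetical explainer output for one user (illustrative numbers only).
contributions = {"age": 4200.0, "weight": 2600.0, "major_surgeries": 1800.0, "height": -150.0}
descriptions = {
    "age": "your age (51 years) is on the higher side",
    "weight": "your weight (91 kg) is on the higher side",
    "major_surgeries": "you had one major surgery in the recent past",
    "height": "your height lowers the estimate slightly",
}

message = explain_premium(contributions, descriptions)
print(message)
```

Features that lower the premium are left out of the message here, which is a design choice for this sketch; a production system might mention them as well.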

Generating and communicating counterfactuals

The ability of the system to recommend a minimal change that is actionable and achievable, and that turns an unfavorable decision into a favorable one, is equally critical in establishing trust in AI systems. In the medical insurance premium example, telling the user that reducing their weight by five to seven kilograms or giving up smoking can significantly reduce the premium charges highlights the transparency of the decision system.
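
A counterfactual search of this kind can be sketched as a small brute-force loop. Everything in the block below is an assumption made for illustration: `toy_premium` is a hypothetical scoring rule standing in for a trained model, the effort costs are invented, and only weight and smoking status are treated as actionable (age is not).

```python
from itertools import product


def toy_premium(profile):
    """Hypothetical premium rule standing in for a trained model."""
    premium = 5000 + 120 * (profile["age"] - 18)
    premium += 80 * max(profile["weight_kg"] - 70, 0)
    if profile["smoker"]:
        premium += 3000
    return premium


def find_counterfactual(profile, target, model=toy_premium, quit_cost=8):
    """Return the cheapest actionable change that brings the premium to target or below.

    Cost is measured in hypothetical effort units: 1 per kilogram lost,
    quit_cost for giving up smoking. Returns None if no change suffices.
    """
    best = None
    smoking_options = [True, False] if profile["smoker"] else [False]
    for loss_kg, smoker in product(range(0, 11), smoking_options):
        candidate = dict(profile, weight_kg=profile["weight_kg"] - loss_kg, smoker=smoker)
        if model(candidate) > target:
            continue  # still an unfavorable outcome
        cost = loss_kg + (quit_cost if smoker != profile["smoker"] else 0)
        if best is None or cost < best[0]:
            best = (cost, candidate)
    return None if best is None else best[1]


user = {"age": 51, "weight_kg": 91, "smoker": True}
suggestion = find_counterfactual(user, target=13200)
print(toy_premium(user), suggestion, toy_premium(suggestion))
```

Under these made-up numbers the cheapest change is a six-kilogram weight reduction. Real counterfactual generators search a far larger space and need domain constraints to keep the suggestions plausible.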

Uncertainty quantification with explanations

Uncertainty quantification is critical in AI modeling, especially in domains like financial services, life sciences, and healthcare, where the variability of a prediction or the distribution of outcomes is far more important than a point prediction. Uncertainty quantification not only describes the model behavior but also points out gaps in the data that would lead to higher variability in model predictions.

Computing the model uncertainty with respect to the important features identified by the explainer provides meaningful insights into the overall model behavior, including model prediction variability. It also identifies gaps in the training data.

This is illustrated with the example of a medical insurance premium estimator model, with the help of a SHAP (SHapley Additive exPlanations) explainer.


The SHAP explainer identifies age, weight, and previous transplant surgeries as the most influential factors in predicting premium charges. It also takes into account the patient’s height, known issues such as allergies, diabetes, and chronic diseases, and family history of cancer and other problems, while calculating the premium amount.
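
The idea behind Shapley-based attribution can be illustrated with an exact computation on a tiny model. The sketch below does not use the SHAP library and is not the article's model: `toy_premium` is a hypothetical linear rule, the baseline profile is invented, and "removing" a feature is assumed to mean replacing it with its baseline value. Exact enumeration over all feature subsets is only practical for a handful of features; explainer libraries rely on approximations.

```python
from itertools import combinations
from math import factorial


def toy_premium(p):
    """Hypothetical linear premium rule standing in for a trained model."""
    return 5000 + 300 * p["age"] + 60 * p["weight_kg"] + 2500 * p["transplants"]


def shapley_values(model, instance, baseline):
    """Exact Shapley values, with absent features set to their baseline value."""
    features = list(instance)
    n = len(features)
    values = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {f: instance[f] if f in subset else baseline[f] for f in features}
                with_feature = dict(without)
                with_feature[feature] = instance[feature]
                # Marginal contribution of the feature to this coalition.
                total += weight * (model(with_feature) - model(without))
        values[feature] = total
    return values


user = {"age": 51, "weight_kg": 91, "transplants": 1}
baseline = {"age": 40, "weight_kg": 75, "transplants": 0}  # hypothetical reference profile

attributions = shapley_values(toy_premium, user, baseline)
for name, value in sorted(attributions.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:+.0f}")
```

The attributions sum to the difference between the user's predicted premium and the baseline prediction, which is the property that makes them suitable for the additive force plots mentioned earlier.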

 

Such mechanisms would enable us to provide an analytic conclusion such as, “Age is the most influential factor in predicting medical insurance premium. However, model uncertainty is high for the age group (29-39 years) and high variability is expected in model predictions for this age group. Augmenting the training data for the age group (29-39 years) would help lower uncertainty.”
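
One simple way to surface such a gap is a bootstrap over the training records in each age band. The sketch below is only an assumption about how this could be done, not the article's method: the premium figures are invented, and the "model" is just the mean premium of a band, standing in for a real estimator. A band with few records produces a wider spread across resamples.

```python
import random
import statistics


def bootstrap_spread(premiums_by_band, n_resamples=300, seed=0):
    """Standard deviation of the resampled mean premium, per age band.

    A larger value means the estimate for that band is less stable,
    typically because the band has few training records.
    """
    rng = random.Random(seed)
    spread = {}
    for band, premiums in premiums_by_band.items():
        means = []
        for _ in range(n_resamples):
            # Resample the band's records with replacement.
            resample = rng.choices(premiums, k=len(premiums))
            means.append(statistics.mean(resample))
        spread[band] = statistics.stdev(means)
    return spread


# Hypothetical training data: the 29-39 band is sparsely populated.
premiums_by_band = {
    "29-39": [21000, 24000, 26500, 29000, 33000],
    "40-50": [30000 + 400 * ((i * 5) % 21 - 10) for i in range(200)],
}

spread = bootstrap_spread(premiums_by_band)
for band, value in spread.items():
    print(f"age {band}: spread of estimate = {value:.0f}")
```

With these made-up records, the sparse 29-39 band shows a much larger spread than the well-populated 40-50 band, which is the kind of signal that would prompt augmenting the training data for that group.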

Bias mitigation:

Bias mitigation techniques such as reweighing and sampling allow us to remove bias in the preprocessing stage by calibrating the data. Adversarial debiasing, an in-processing (in-training) method, allows us to make predictions that carry no information an adversary could exploit for group discrimination.

Several effective metrics are available to measure bias. One is the disparate impact ratio, which compares the rate of favorable decisions between two groups, the underprivileged and the privileged. A disparate impact ratio close to one indicates an unbiased system and goes a long way in building faith in the AI system.
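
As a rough illustration, the disparate impact ratio is the favorable-outcome rate of the underprivileged group divided by that of the privileged group. The function name and the loan data below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def disparate_impact(outcomes, groups, unprivileged, privileged, favorable=1):
    """Favorable-outcome rate of the unprivileged group divided by that of the privileged group."""

    def favorable_rate(group):
        members = [o for o, g in zip(outcomes, groups) if g == group]
        if not members:
            raise ValueError(f"no records for group {group!r}")
        return sum(1 for o in members if o == favorable) / len(members)

    privileged_rate = favorable_rate(privileged)
    if privileged_rate == 0:
        raise ValueError("privileged group has no favorable outcomes")
    return favorable_rate(unprivileged) / privileged_rate


# Hypothetical loan decisions: 8 of 10 men approved, 4 of 10 women approved.
groups = ["male"] * 10 + ["female"] * 10
approved = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6

ratio = disparate_impact(approved, groups, unprivileged="female", privileged="male")
print(round(ratio, 3))  # 0.5
```

A value well below one, as here, means the underprivileged group receives favorable decisions at a much lower rate.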


In the above figure, the disparate impact ratio of 0.448 indicates a significant gender bias favoring male applicants for loan approval. Applying the reweighing technique mitigates this bias, as indicated by a disparate impact ratio equal to 1.
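
Why reweighing drives the ratio to 1 can be seen in a small sketch. It assumes the common formulation in which each record gets the weight P(group) × P(label) / P(group, label); the article does not say which variant was used, and the loan data and function names below are hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Per-record weights that make group membership and label independent.

    weight(g, y) = P(g) * P(y) / P(g, y), estimated from counts.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_disparate_impact(groups, labels, weights, unprivileged, privileged, favorable=1):
    """Disparate impact ratio computed on weighted records."""

    def favorable_rate(group):
        total = sum(w for g, w in zip(groups, weights) if g == group)
        fav = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == favorable)
        return fav / total

    return favorable_rate(unprivileged) / favorable_rate(privileged)


# Hypothetical loan history: 8 of 10 men approved, 4 of 10 women approved.
groups = ["male"] * 10 + ["female"] * 10
labels = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6

uniform = [1.0] * len(labels)
before = weighted_disparate_impact(groups, labels, uniform, "female", "male")
weights = reweighing_weights(groups, labels)
after = weighted_disparate_impact(groups, labels, weights, "female", "male")
print(round(before, 3), round(after, 3))
```

The weights are meant to be passed to model training as sample weights, so under-represented combinations such as approved female applicants count for more.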

 

It is recommended that bias be managed at all stages of modeling. In the data preprocessing and profiling stage, protected attributes can be detected automatically, and reweighing and resampling techniques can be applied to overcome historic or representation bias in the data. Likewise, bias in the modeling needs to be identified in the testing and assurance stages, before deployment.

Gaining trust in AI decisions

As the use of AI systems picks up pace, businesses implementing the technology must ensure that their decision-making models keep end users informed at every step and provide explanations when needed.

Context and timing are key in AI-based decision-making models. The best way to help users gain trust in AI decisions is to maintain transparency and to relay explanations, uncertainty, and bias to them in a humanly perceivable way, with domain context.