LIFESCIENCES PULSE

Supervised & Unsupervised Learning for ICSR Processing

 
February 4, 2019

In the previous three blogs (AI in Pharmacovigilance - Re-imagining Patient Safety, AI in Pharmacovigilance - Re-imagining the ICSR processing, and Assessing Product Complaints and Ensuring Vigilance in the Pharma Industry Using Semantic Search), we discussed how artificial intelligence (AI) can aid the pharmacovigilance (PV) process. Drawing on ontologies, business rules, and its training corpus, an AI-based PV solution can process structured and unstructured PV documents, such as medical literature, CIOMS safety reports, and patient reports, to build the Individual Case Safety Report (ICSR). Such processing involves not only extracting the correct entities from the text, but also understanding the information in the context of the case, for entities such as suspect product and indications.

For instance, the example below demonstrates how AI solutions can identify the patient-related sections of a narrative.

One major problem with any kind of AI-based supervised learning is that it requires a large amount of annotated data, which is generally in limited supply. Annotation by medically trained experts is also very expensive. As a result, the industry is moving quickly toward a semi-supervised approach: a portion of the data is annotated for medical events and used to train a model, the trained model is then used to annotate unlabeled data, and the cycle is repeated until there is enough annotated data to train a complete model.
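The semi-supervised cycle described above is often called self-training or pseudo-labeling. The following is a minimal sketch of that loop, not the solution discussed in this series: the token-count classifier, the `adverse_event` and `demographics` labels, the sample sentences, and the `threshold` and `max_rounds` parameters are all assumptions made for illustration.

```python
from collections import Counter

def train(labeled):
    """Toy model (an assumption): count how often each token appears under each label."""
    counts = {}
    for text, label in labeled:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def predict(model, text):
    """Return (label, confidence); confidence is the winning label's share of token votes."""
    tokens = text.lower().split()
    scores = {label: sum(c[t] for t in tokens) for label, c in model.items()}
    total = sum(scores.values())
    if total == 0:
        return None, 0.0  # no known token: the model cannot decide
    best = max(scores, key=scores.get)
    return best, scores[best] / total

def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Repeatedly train, pseudo-label confident items, and fold them into the training set."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, remaining = [], []
        for text in pool:
            label, conf = predict(model, text)
            if label is not None and conf >= threshold:
                confident.append((text, label))
            else:
                remaining.append(text)
        if not confident:
            break  # nothing new was learned in this round
        labeled.extend(confident)
        pool = remaining
    return train(labeled), labeled, pool

# Hypothetical seed annotations and unlabeled narrative fragments.
seed = [
    ("patient developed severe rash", "adverse_event"),
    ("patient is a 54 year old male", "demographics"),
]
unlabeled = [
    "rash worsened after second dose",
    "female aged 62",
    "severe rash and itching developed",
    "year old male smoker",
]
model, labeled, pool = self_train(seed, unlabeled)
```

In this sketch, whatever remains in `pool` is the material the model could not label with confidence; in practice those items would be the natural candidates for review by a medically trained annotator.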

Toward Automated Reasoning with Knowledge Representation

One of the most critical aspects that differentiates AI-based systems from conventional options is knowledge representation. Here, unsupervised information extraction has recently gained comparatively more traction. Semantic and syntactic text representations and the contextualization of information are key contributors to successful deep learning algorithms and their state-of-the-art extraction results. Deep learning, however, remains a probabilistic approach, which poses regulatory challenges in PV. Hence, to ensure that a machine’s learning process and its application remain detectable, justifiable, and sustainable, attempts are being made to extract ‘rules’ from the process and re-apply them to information extraction. These rules serve as the record of the machine’s learning for all future use and reference. Once the rules are extracted, ranking algorithms can help prioritize and select the rules most useful for a given test scenario when they are applied in the data-extraction process. This approach transforms the probabilistic process into a deterministic one, and is therefore more compliant throughout the entire PV process.
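To illustrate the rule-ranking idea, here is a minimal sketch that scores candidate extraction rules against a small annotated set and orders them by precision, then by how often they fire. The `Rule` class, the regular-expression patterns, the product names, and the scoring scheme are hypothetical choices for this example; the text above does not specify which ranking algorithm is used.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    rule_id: str   # stable identifier, so a rule can be cited later
    pattern: str   # regex whose first group captures the extracted value
    field: str     # ICSR field the rule fills, e.g. "suspect_product"

def evaluate(rule, examples):
    """Return (precision, support) for a rule on (text, gold_annotations) pairs."""
    fired = correct = 0
    for text, gold in examples:
        match = re.search(rule.pattern, text, flags=re.IGNORECASE)
        if match:
            fired += 1
            if match.group(1).lower() == gold.get(rule.field, "").lower():
                correct += 1
    precision = correct / fired if fired else 0.0
    return precision, fired

def rank_rules(rules, examples):
    """Sort rules by precision first, then by support, highest first."""
    scored = [(rule, *evaluate(rule, examples)) for rule in rules]
    return sorted(scored, key=lambda s: (s[1], s[2]), reverse=True)

# Hypothetical annotated narratives and candidate rules.
examples = [
    ("Patient was treated with DrugA for hypertension.", {"suspect_product": "DrugA"}),
    ("She received DrugB and later took DrugC.", {"suspect_product": "DrugB"}),
    ("Treated with DrugD; took aspirin at home.", {"suspect_product": "DrugD"}),
]
rules = [
    Rule("R1", r"treated with (\w+)", "suspect_product"),
    Rule("R2", r"took (\w+)", "suspect_product"),
    Rule("R3", r"received (\w+)", "suspect_product"),
]
ranked = rank_rules(rules, examples)
for rule, precision, support in ranked:
    print(rule.rule_id, f"precision={precision:.2f}", f"support={support}")
```

Because the ranking is computed once and the resulting order is fixed, applying the top-ranked rules to new text gives the same answer every time, which is the deterministic behavior the paragraph above describes.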

Another important aspect of deep learning is the textual corpus. The AI solution used in the PV process is trained to automatically mine a variety of clinical corpora, drawn from millions of published articles and customer-provided documents, to learn and build extraction rules. This is why rule-based extraction is of special interest to the industry. What makes this approach effective is the complete transparency of the decision-making process and the traceability, or justification, of each decision, so that regulatory compliance is not a major challenge.
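One way to picture the traceability described here is an extraction step that returns, along with each value, the rule that produced it and the exact text it matched. The sketch below assumes a hypothetical rule list, field name, and record layout; it is an illustration of the principle rather than the actual system.

```python
import re

# Hypothetical approved rules in priority order: (rule_id, field, pattern).
RULES = [
    ("R1", "suspect_product", r"treated with (\w+)"),
    ("R3", "suspect_product", r"received (\w+)"),
]

def extract_with_trace(text, rules=RULES):
    """Apply rules in order; return the first hit together with its justification."""
    for rule_id, field, pattern in rules:
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            return {
                "field": field,
                "value": match.group(1),
                "rule_id": rule_id,          # which rule made the decision
                "evidence": match.group(0),  # the phrase that triggered it
                "span": match.span(1),       # character offsets of the value
            }
    return None  # no rule fired; nothing is extracted silently

narrative = "She received DrugB yesterday."
record = extract_with_trace(narrative)
```

Each extracted value can then be traced back to a specific rule and a specific passage in the source document, which is the kind of justification a reviewer or regulator could inspect.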

Stepping into an Intelligent Future

As this blog series comes to a close, we reiterate the conclusion of the entire series here.

The continuously increasing volume of data and the regulatory requirements are disproportionate to the availability of trained human resources in the pharmacovigilance domain. The imbalance may be even greater in other vigilance segments such as medical devices and cosmetics. Hence, automation using artificial intelligence for use cases like case extraction is the need of the hour. Moreover, systems using AI need to be robust and traceable to be accepted by regulators.

The intention of AI in the entire process is not to replace human intelligence but to support human decision-making. AI should make such processes smoother and faster by combining machine learning with human learning to simplify decision-making. Traceable, repeatable outcomes will also make it easier for regulators to accept such systems.

While data is often described as the goldmine of the future, it is equally important to have adequate access to correct data for benchmarking the rules or ontologies mined from it. Better accuracy and sustained or improved outcomes require large volumes of production data. In view of this, we have collaborated with our customers to obtain data in such volumes for training and testing purposes. In addition, ontologies contributed by experts have improved accuracy. This helped us demonstrate that such an approach could overcome the regulatory hurdle, with mined rules and ontologies being approved before production deployment. Beyond providing large volumes of data for training and for ground-level implementation of AI solutions in PV, the next long-term goal could be to standardize the process so that the AI solution becomes a plug-and-play system that can be integrated with any safety database.

 

Dr. Ashish Indani is a qualified physician with an MBA in Clinical Trial Management and MIRCS (Research Methodology), and works as a Domain Consultant for Advanced Drug Development (ADD) Platforms, Life Sciences unit, at Tata Consultancy Services (TCS). With more than 18 years of experience in clinical research, Dr. Ashish is an expert on medical devices. He has authored many publications and books on diverse subjects, primarily medical devices, clinical research methodology, different therapeutic areas, and ancient Indian Vedic literature. In his current role, Dr. Ashish works with the TCS Life Sciences ADD Platforms on the development of innovative technology solutions that use artificial intelligence and other modern technologies across various life sciences operations.