Generative artificial intelligence (GenAI) has the potential to redefine genetics.
It can generate novel, complex genetic data, enhancing the understanding of genetic disorders and driving advances in targeted gene therapies and personalized medicine.
The machine learning (ML) capabilities of GenAI are increasingly being employed to design novel gene-editing strategies, generate new protein sequences and de novo pathways, and develop personalized gene therapies. These techniques could transform the treatment of genetic disorders such as sickle cell anemia, thalassemia, cystic fibrosis, and mutation-driven cancers. Nevertheless, transitioning GenAI approaches from the laboratory to industry and integrating AI models into existing workflows remain challenging. Making the most of the advances in GenAI for gene therapies therefore requires strategies to overcome these challenges.
GenAI algorithms learn from pre-existing datasets.
Combined with the data-driven nature of genetics, clinical trials, and drug development, this creates a promising synergy for gene therapy development.
Target identification and validation
Transformer-based models, together with analytical techniques such as graph neural networks (GNNs) and multi-omics data integration, are increasingly employed to generate novel target hypotheses. Variational autoencoders (VAEs) can integrate multi-omics data to uncover latent genetic patterns, such as mutations or gene expression profiles, and propose novel genetic targets or sequences that can correct mutations or modify disease pathways. Generative adversarial networks (GANs) can generate synthetic datasets that simulate target activation or inhibition in response to different treatments. These simulated datasets can help validate whether a proposed target influences disease progression or molecular pathways.
Gene editing (GED) in gene therapy
Gene editing based on CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)-Cas technology allows precise genome modifications. Advanced methodologies such as agentic GenAI can design new guide RNA (ribonucleic acid) sequences, which can be optimized for efficiency, specificity, minimal off-target effects, and patient-specific mutations. Clinical trials for cancers, amyloidosis, and hereditary vision loss demonstrate promising prospects for CRISPR-based, GenAI-driven gene editing.
In practice, there has been tangible and significant progress, with treatment options currently ranging from preclinical to late stages of clinical development. GenAI has been applied at various stages in the process, including optimization of guide RNA design, engineering of novel proteins, and enhancement of delivery systems in these treatments.
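To make the guide RNA design step concrete, the sketch below enumerates candidate 20-nucleotide spacers next to an NGG PAM (the motif recognized by the commonly used SpCas9 enzyme) on both strands of a DNA sequence. It is a simple rule-based scan, not a GenAI model: it shows only the kind of candidate list that a learned model would then score and rank. The function names (find_guides, count_mismatches) and the use of GC content as a descriptive feature are illustrative assumptions, not part of any published pipeline.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_guides(dna: str, guide_len: int = 20) -> list[dict]:
    """List candidate spacers followed by an NGG PAM on either strand.

    'start' is the 0-based offset of the spacer on the strand that was
    scanned (for '-', an offset into the reverse complement).
    """
    dna = dna.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    guides = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            spacer, pam = match.group(1), match.group(2)
            guides.append({
                "strand": strand,
                "start": match.start(),
                "spacer": spacer,
                "pam": pam,
                "gc": gc_fraction(spacer),
            })
    return guides


def count_mismatches(spacer: str, site: str) -> int:
    """Number of mismatched positions between a spacer and an equal-length site."""
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    return sum(a != b for a, b in zip(spacer, site))
```

Calling find_guides on a target region returns one dictionary per candidate site. A downstream model, or a simple filter on the gc field and on count_mismatches against known off-target sites, would then narrow the list. Any thresholds used for such filtering would be assumptions to be validated experimentally.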
Development of targeted gene delivery vehicles
Despite advancements in gene delivery, the lack of highly efficient viral and non-viral vector systems that can transport genes across multiple cellular barriers remains a challenge. The ability of GenAI to evaluate the effectiveness of gene delivery vehicles, optimize gene delivery parameters, and model cells and intracellular organelles is being studied to develop highly efficient and non-toxic gene delivery agents. GenAI is also being explored to understand the physical properties of viral and non-viral carriers and their nucleic acid transfection processes, with the aim of achieving gene delivery without the adverse events associated with traditional transgenesis.
Gene sequence optimization and protein design
Transformer-based models can optimize gene sequences by selecting codons more compatible with the host organism, thereby improving protein expression and yield. Furthermore, GenAI can design de novo protein sequences that fold into desired 3D structures with specific functions, such as binding or catalysis, or that possess properties such as better affinity, thermostability, solubility, or resistance to degradation. Such advances can potentially streamline design and production, improving efficiency in commercial biomanufacturing.
Compared with manual methods, GenAI algorithms can enhance precision, speed, and efficiency, and offer predictive modeling for better protein sequences.
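The idea of selecting host-compatible codons can be illustrated with a much simpler baseline than a transformer: count how often each codon appears in a set of reference coding sequences from the host, then back-translate the protein using the most frequent codon for each amino acid. The sketch below assumes the standard genetic code and a caller-supplied list of host coding sequences. The function names are hypothetical, and the approach ignores factors a learned model might capture, such as mRNA secondary structure or codon context.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def codon_counts(reference_cds: list[str]) -> Counter:
    """Count in-frame codons across host coding sequences."""
    counts = Counter()
    for cds in reference_cds:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skip codons with ambiguous bases
                counts[codon] += 1
    return counts


def preferred_codons(counts: Counter) -> dict[str, str]:
    """Map each amino acid (and '*' for stop) to its most frequent codon.

    Codons absent from the reference count as zero; ties keep the codon
    that comes first in table order.
    """
    best: dict[str, str] = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    return best


def optimize(protein: str, best: dict[str, str]) -> str:
    """Back-translate a protein using the preferred codon per residue.

    Raises KeyError for residue letters outside the standard code.
    """
    return "".join(best[aa] for aa in protein.upper())


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, useful as a round-trip check."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))
```

A round-trip check such as translate(optimize(protein, best)) == protein confirms that the optimized sequence still encodes the intended protein. This is a minimum sanity test before any expression experiment, not a substitute for one.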
Incorporating GenAI technologies into gene therapies offers exciting opportunities.
However, it also presents complex technical, financial, ethical, social, and regulatory challenges.
Technical considerations
Sequencing errors, missing annotations, or inconsistent datasets affect the accuracy of generated data and guide RNA (gRNA) rankings. Data biases toward well-studied genes and ethnicities can skew outputs. Transformer models with very large numbers of parameters are prone to overfitting, making generalization challenging. Models may also struggle to accurately capture biological factors such as cell-specific chromatin accessibility, RNA secondary structures, epigenetic variations, and complex biological pathways, potentially resulting in unintended mutations. Even minor inaccuracies in the generated data can have serious implications in gene therapies. These challenges can be overcome by:
Financial considerations
The significant upfront investment, computational infrastructure, specialized software, and expert talent required for AI model development may limit its widespread adoption. Additional considerations include the cost and time involved in collecting and analyzing patient data, genomic profiling and sequencing, integrating GenAI in trial designs, and complying with regulatory standards. Moreover, AI models demand continuous training and updates to remain accurate, necessitating ongoing investments in computational resources and personnel. The cost can be reduced with strategies such as:
Ethical, social, and regulatory considerations
AI applications must address concerns such as patient privacy, transparency in decision-making, and management of algorithmic biases that can perpetuate inequalities in healthcare recommendations for under-represented populations. AI explainability, risk assessment, and human oversight are critical regulatory requirements, without which AI models could remain black boxes, potentially leading to biased or harmful decisions. The EU GDPR establishes a right to explanation for AI-based decisions, encompassing risk assessment, human oversight, documentation of decision-making logic, and clear explanations. The US FDA AI-ML framework requires AI systems in drug development to be transparent and auditable. Further considerations include:
Tackling these challenges is crucial to fully harnessing the potential of GenAI in gene therapies.
This calls for a well-organized strategy, collaborative efforts, and responsible AI practices from the early stages of drug discovery.
Realizing the full potential of GenAI in gene therapies requires a structured and coordinated approach, beginning as early as drug discovery. Critical to this process are high-quality, diverse training datasets, rigorous validation of generated constructs, cost-balancing strategies, and early engagement with regulatory bodies.
An integrated model that combines domain expertise, business insights, and regulatory strategies will be essential. Ultimately, such a framework can accelerate therapeutic breakthroughs at reasonable costs. With responsible implementation, continued investment, and adherence to ethical AI practices, GenAI can significantly enhance the effectiveness, accessibility, and sustainability of precision medicine.
The authors express their gratitude to Dr. Sonali Sahu and Mr. Rajaram Rane for their insights on the scientific aspects, and to Mr. Kamlesh Mhashilkar for his input on GenAI, all of which supported the development of this article.