
Next Generation Genome Sequencing

The Human Genome Project, an effort supported by a global consortium of scientists, demystified the human genetic code by producing a comprehensive blueprint of a human being's genetic makeup. It has also resulted in rapid advances in Next Generation Sequencing (NGS) technologies, producing a flood of data.


To allow researchers to analyze and mine genomic data significantly faster, TCS offers an Accelerated NGS Data Analysis Platform for building automated analysis pipelines, powered by the in-memory SAP HANA platform.

The TCS Advantage

  • As one of an elite group of SAP Global Solutions Partners, TCS delivers solutions that address the strategic, tactical and operational aspects of the life sciences supply chain.
  • With more than two decades of experience in working with global life sciences companies in diverse geographies, TCS helps organizations in their transformation journey, leveraging its people, platforms, products and services across the value chain.
  • TCS' R&D labs, Technology Excellence Groups (TEG) and Process Excellence Groups (PEG) constantly work in collaboration with project teams to provide better solutions and deployments. Customers also benefit from our Co-Innovation™ (COIN) labs, which bring the best minds together.

Our Solution

Working with the Center for Computational Biology at the University of California at Berkeley, TCS has developed new methods for the rapid interpretation of genome variation data. The Next Generation Sequencing Analysis platform, built on SAP HANA, automates parts of the read assembly, variant calling and variant annotation processes using a pipeline-based approach to clean data and dramatically speed up analysis processes. Our solution includes:

  • Automated Read Assembly: The read assembler generates read assembly maps of single patient samples as Sequence Alignment/Map (SAM) files. The read-assembly process deploys an ultrafast read aligner to align short DNA sequences, or reads, to the baseline data in the Human Genome. The process runs on high-speed Hadoop clusters; Hadoop is a stable, open-source technology designed to manage large data sets.
  • Variant Calling: The variant calling process is enabled by R, another stable, open-source technology widely used among statisticians and data miners. The platform comes with custom-developed R programs to analyze SAM files and extract variations in the patient or individual genome sequence.
  • Variant Annotation: The variant annotation process maps variant calls against all known conditions and diseases to define patterns and trends. The annotation uses 11 different sources of data for this purpose, and we are adding more data sources.
  • SAP HANA Platform: The key to the solution is the computing speed made possible by SAP HANA's in-memory database. It consolidates and stores comprehensive read assembly data (SAM data) and variant calls with annotation data, which other programs and R scripts then access for further analysis and reporting. DNA testing and analysis that could otherwise take days can be processed in minutes.
  • SAP Business Objects Data Services: By integrating genomic data into powerful business intelligence tools, such as SAP Business Objects, researchers are able to gain detailed biological insights into their DNA data.
  • Hadoop: The Hadoop distributed computing framework provides MapReduce, an effective and easy-to-use method for parallelizing many bioinformatics data analysis algorithms.
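
The MapReduce idea behind the read-assembly and Hadoop components can be sketched in a few lines. The example below is an illustration only, not the platform's implementation: it uses plain Python in place of a Hadoop cluster, and the sample records and the function names (map_record, reduce_counts, count_mapped_reads) are hypothetical. It relies only on the standard SAM layout, in which header lines start with "@", each alignment line is tab-separated, the second field is a bitwise FLAG (0x4 marks an unmapped read) and the third field is the reference name.

```python
from collections import Counter
from functools import reduce

# Toy SAM content, invented for illustration (not real patient data).
SAM_LINES = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t101\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t16\tchr1\t205\t60\t8M\t*\t0\t0\tTTGACCAA\tIIIIIIII",
    "read3\t0\tchr2\t17\t42\t8M\t*\t0\t0\tGGGTTTCC\tIIIIIIII",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\tNNNNNNNN\tIIIIIIII",
]

UNMAPPED_FLAG = 0x4  # SAM FLAG bit: the read has no alignment


def map_record(line):
    """Map step: turn one SAM line into a partial count per reference."""
    if line.startswith("@"):
        return Counter()  # header line, carries no alignment
    fields = line.rstrip("\n").split("\t")
    flag, reference = int(fields[1]), fields[2]
    if flag & UNMAPPED_FLAG:
        return Counter()  # unmapped reads are not counted
    return Counter({reference: 1})


def reduce_counts(left, right):
    """Reduce step: merge two partial counts."""
    return left + right


def count_mapped_reads(lines):
    """Apply map to every line, then fold the partial counts together."""
    return reduce(reduce_counts, map(map_record, lines), Counter())


if __name__ == "__main__":
    for reference, n in sorted(count_mapped_reads(SAM_LINES).items()):
        print(reference, n)
```

Because each line is mapped independently and the merge step is associative, the same two functions could in principle be distributed across many workers, which is the property a Hadoop cluster exploits for large SAM files.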


Benefits

  • This in-memory analytics solution is a key enabler for life sciences organizations, including pharmaceutical and diagnostics companies, that use NGS technologies to identify new markers useful in clinical trials and diagnostics.
  • Automates clinical interpretation of patient/individual genetic variations through annotation and reporting.
  • Enables scientists to identify new markers within a patient population.
  • Reduces query lag when a researcher queries part of the input data or ancillary public or prior annotation data.
  • Reduces time delays in exploratory research through faster response.
  • Reduces the total time taken for analysis.
