S. Cenk Sahinalp, Ph.D.

Senior Investigator

Cancer Data Science Laboratory


Building 10, Room 6N119
Bethesda, MD 20892



Research Topics

A key focus of my lab is the discovery and interpretation of large-scale (especially structural) genomic and transcriptomic variants in tumor samples. Our algorithmic methods for genomic structural variation discovery, including VariationHunter, CommonLAW, DeStruct and NovelSeq, were the first with the ability to handle novel insertions, deletions, inversions and duplications in repeat regions of the human genome. More recently, I have been interested in applying the algorithmic technology we developed for structural variant discovery to the exact genotyping of high-copy-number, structurally variant genes, e.g., those involved in drug metabolism, for which my group has developed the Cypiripi and Aldy methods. My group has also contributed to the identification and quantification of transcriptomic aberrations, in particular gene fusions, as well as genic inversions, duplications and deletions in cancer samples. Leading computational methods that we developed include DeFuse, NFuse, Comrad, MiStrVar and SVICT, the last of which can handle circulating cell-free tumor DNA data. My recent interests include modeling tumor evolution and heterogeneity through both bulk and single-cell sequencing (CITUP, CTP-Single, Remix-T and BSCITE) and network-aided, integrative analysis of genomic and transcriptomic sequence data from tumor samples (Hit’nDrive and cdCAP). Finally, I have an ongoing interest in what I would like to call “algorithmic infrastructure” for genomics, including (i) read mapping, especially for reads from repetitive regions of the genome or reads with high error rates (examples include mrFAST, mrsFAST, drFAST and lordFAST), (ii) genomic data compression (SCALCE, DeeZ and AssemblTrie) and (iii) secure/privacy-preserving computing (PrivStrat and SkSES).


S. Cenk Sahinalp received his B.Sc. in Electrical Engineering from Bilkent University, Ankara, Turkey and his Ph.D. in Computer Science from the University of Maryland, College Park. His Ph.D. research was on parallel and serial algorithms for string/sequence processing. After a brief postdoctoral fellowship at Bell Labs, Murray Hill, he worked as a Computer Science professor at the University of Warwick, UK, Case Western Reserve University, Simon Fraser University (where he was a Canada Research Chair in Computational Genomics and a Senior Scientist at the Vancouver Prostate Centre) and most recently at Indiana University, Bloomington.

Sahinalp’s research has focused on combinatorial algorithms and data structures (especially for strings/sequences) and their applications to biomolecular sequence analysis, especially in the context of cancer. In the past decade, his lab has developed several algorithmic methods for efficient and effective use of high-throughput sequencing data for better characterization of the structure, evolution and heterogeneity of cancer genomes. He has (co)trained more than two dozen Ph.D. students and postdocs, many of whom now hold independent academic and research positions in the U.S., Canada and elsewhere. He is also actively engaged in the computational biology community: he organized RECOMB 2011 in Vancouver, BC, chaired the program committee of RECOMB 2017 in Hong Kong, founded RECOMB-Seq, and currently serves on the steering committee of RECOMB.

This page was last updated on March 5th, 2020