Extending the genetic code: identifying new DNA bases



We all learn about DNA’s four basic components: the bases adenine (A), thymine (T), guanine (G), and cytosine (C). However, many organisms have small changes to their DNA’s molecular structure that result in variations of these building blocks. One of these modified bases, 5-methylcytosine (5mC), has long been known to exist in mammalian DNA, but it is removed from the DNA when eggs and sperm are created and when they come together to form an embryo. How this removal and other changes to the traditional DNA bases occur remained unknown.


IRP researchers led by senior investigator L. Aravind, Ph.D., used a combination of comparative genomics, genetic sequencing, and structural analyses to identify a diverse set of previously unrecognized enzymes that cause various changes in the base molecules of DNA.


This work resulted in the discovery of several new DNA bases that encode a layer of information above and beyond that specified by the four classical bases A, T, G, and C. This ‘epigenetic’ information has profound implications for understanding normal development and how cells take on specialized functions, as well as for understanding certain cancers and other diseases.


Iyer LM, Zhang D, Aravind L. Adenine methylation in eukaryotes: Apprehending the complex evolutionary history and functional potential of an epigenetic modification. Bioessays. 2016 Jan;38(1):27-40.

Iyer LM, Zhang D, Burroughs AM, Aravind L. Computational identification of novel biochemical systems involved in oxidation, glycosylation and other complex modifications of bases in DNA. Nucleic Acids Res. 2013 Sep;41(16):7635-7655.

Tahiliani M, Koh KP, Shen Y, Pastor WA, Bandukwala H, Brudno Y, Agarwal S, Iyer LM, Liu DR, Aravind L, Rao A. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science. 2009 May 15;324(5929):930-935.

This page was last updated on Tuesday, June 13, 2023