Marc C. Nicklaus, Ph.D.

Senior Scientist

Chemical Biology Laboratory


Building 376, Room 207
Frederick, MD 21702-1201


Research Topics

Computer-Aided Drug Design. The Computer-Aided Drug Design (CADD) Group is a research unit within the Chemical Biology Laboratory (CBL) that employs, analyzes, and develops computer-based methods to aid in the drug discovery, design, and development projects of the CBL and other researchers at the NIH. We split our efforts about evenly between support-type projects and research projects initiated and conducted by CADD staff members. We are implementing many projects, and making available resources developed by the CADD Group, in a Web-based manner. This offers three advantages: (1) it frees all users, including the group members themselves, from platform constraints and the concomitant expenses for specific software and hardware, (2) it makes resources and results immediately available for sharing among all collaborators regardless of their location, and (3) it helps, without additional effort, to further the mission of the NCI as a publicly funded institution by providing data and services directly to the (scientific) public. The following research areas and projects form part of the portfolio of the CADD Group.

Synthetically Accessible Virtual Inventory (SAVI). Aggregated libraries on the order of 100 million on-the-shelf compounds are commercially available as screening samples for in silico screening in computer-aided drug design. Still, this represents only a microscopically small fraction of the drug-like small-molecule space, estimated to comprise on the order of 10^60 possible structures or even more. To tap into this huge chemical space, we have created the SAVI database of 1.75 billion compounds predicted to be easily synthesizable. They were generated by a set of transforms based on an adaptation and extension of the CHMTRN/PATRAN programming languages, which describe chemical synthesis expert knowledge and originally stem from the LHASA project. The chemoinformatics toolkit CACTVS was used to apply a total of 53 transforms to about 150,000 readily available building blocks from Enamine. Only single-step, two-reactant syntheses were calculated for this database, even though the technology can execute multi-step reactions. The ability to incorporate scoring systems in CHMTRN allowed us to subdivide the database into sets according to their predicted synthesizability, with the most-synthesizable class comprising 1.09 billion synthetic products. Properties calculated for all SAVI products show that the database should be well-suited for drug discovery. It has been made available for free download.

Tautomerism. The CADD Group has been doing significant research on tautomerism, the existence of multiple possible forms of the same molecule that are capable of interconverting via an intramolecular movement of atoms and reconfiguration of bonds. We have collected about 90 different transforms of tautomeric interconversions, comprising prototropic, ring-chain, and valence tautomerism. The majority of these rules were extracted from experimental literature. A web tool has been created for users to apply these rules to their molecules. We have analyzed these rules against an aggregated set of over 400 million (non-unique) structures as to their occurrence rates, mutual overlap in coverage, and recapitulation of the rules’ enumerated tautomer sets by the InChI chemical identifier. These results form the scientific background of the IUPAC InChI project tasked with redesigning the handling of tautomerism for an InChI version 2.

Chemical Identifier Resolver (CIR). CIR works as a resolver for many different chemical structure identifiers (e.g., chemical names, InChI, SMILES) and allows one to convert a given structure identifier into a full structure representation or into another structure identifier, including references to particular databases in which the corresponding structure or structure identifier occurs. CIR offers a simple-to-use application programming interface (API) based on URLs requested via HTTP. This allows easy linking of CIR and its content to other scientific web services and program packages. CIR currently provides access to 120 million structure records.
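As an illustration of what a URL-based API of this kind can look like from a client's point of view, the following minimal Python sketch builds a request URL and fetches the plain-text response. The base URL, the path layout (identifier followed by the requested representation), and the representation names "smiles" and "stdinchikey" are assumptions not stated on this page, and the helper names cir_url and resolve are hypothetical; check the service's own documentation before relying on them.

```python
from urllib.parse import quote
from urllib.request import urlopen

# ASSUMPTION: base URL of the public CIR service; it is not given on this page.
CIR_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier: str, representation: str) -> str:
    """Build a request URL of the assumed form BASE/<identifier>/<representation>."""
    # Percent-encode both parts so that characters such as '#', '/' or spaces
    # (common in SMILES strings and chemical names) survive inside a URL path.
    return f"{CIR_BASE}/{quote(identifier, safe='')}/{quote(representation, safe='')}"


def resolve(identifier: str, representation: str, timeout: float = 10.0) -> str:
    """Send the HTTP GET request and return the response body as stripped text."""
    with urlopen(cir_url(identifier, representation), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


if __name__ == "__main__":
    # Building the URL needs no network access; calling resolve() does.
    print(cir_url("aspirin", "smiles"))
```

Because the whole request is encoded in the URL, the same lookup can be issued from a browser, a command-line HTTP client, or any other program, which is what makes linking such a service into other tools straightforward.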

Enhanced NCI Database Browser. The Enhanced NCI Database Browser can be used to search the 250,000-compound Open NCI Database. This dataset is the publicly available part of the half-million structure collection assembled by the NCI's Developmental Therapeutics Program during the program's 50+ years of screening compounds against cancer and, more recently, AIDS. Visit the CADD Group's home page or the Enhanced NCI Database Browser service for more information.

Fundamentals of Protein-Ligand Interactions. The non-covalent binding of a drug to the binding site of an enzyme (or other biomacromolecule) is the fundamental process of most drug actions. In spite of the vast body of experimental data available on protein-ligand complexes, mostly obtained by X-ray crystallography, there are still open questions about how this binding process occurs at the atomic and quantitative energetic level. One of the issues is the range of conformational energies one can expect to find for small-molecule ligands bound to proteins, which we found to be higher than generally assumed. This has led us to broader questions regarding X-ray crystallographic methodologies, such as whether quantum-mechanical refinement (or re-refinement) of protein-ligand structures may improve structural quality in various ways.

HIV Integrase. A long-standing interest of our group has been HIV integrase (IN) as a drug development target. This enzyme catalyzes the integration of the viral DNA into the human DNA, which is an essential step in the viral replication cycle. Only a handful of approved drugs so far are based on IN inhibition. We have been utilizing all available experimental results, be they structural, mechanistic, or biochemical, to model and better understand inhibition of IN by small molecules.

Among our main collaborators are Wolf-Dietrich Ihlenfeldt, Xemistry, Germany; Vladimir Poroikov, Russian Academy of Medical Sciences, Moscow; Philip Judson, Lhasa Ltd.; and Raul Cachau, Leidos, FNLCR.


Dr. Nicklaus received his Ph.D. in applied physics from the Eberhard-Karls-Universität, Tübingen, Germany, and then served as a postdoctoral fellow in the Molecular Modeling Section of what was then called the Laboratory of Medicinal Chemistry, NCI. He became a staff fellow in 1998 and a Senior Scientist in 2002. In 2000, he founded the Computer-Aided Drug Design (CADD) Group, which he has headed ever since.

Selected Publications

  1. Sitzmann M, Weidlich IE, Filippov IV, Liao C, Peach ML, Ihlenfeldt WD, Karki RG, Borodina YV, Cachau RE, Nicklaus MC. PDB ligand conformational energies calculated quantum-mechanically. J Chem Inf Model. 2012;52(3):739-56.

  2. Sitzmann M, Ihlenfeldt WD, Nicklaus MC. Tautomerism in large databases. J Comput Aided Mol Des. 2010;24(6-7):521-51.

  3. Nicklaus MC, Neamati N, Hong H, Mazumder A, Sunder S, Chen J, Milne GW, Pommier Y. HIV-1 integrase pharmacophore: discovery of inhibitors through three-dimensional database searching. J Med Chem. 1997;40(6):920-9.

  4. Nicklaus MC, Wang S, Driscoll JS, Milne GW. Conformational changes of small molecules binding to proteins. Bioorg Med Chem. 1995;3(4):411-28.

  5. Ihlenfeldt WD, Voigt JH, Bienfait B, Oellien F, Nicklaus MC. Enhanced CACTVS browser of the Open NCI Database. J Chem Inf Comput Sci. 2002;42(1):46-57.

This page was last updated on May 4th, 2021