Our group investigates biological problems at multiple levels of detail in order to build a quantitative understanding of biology. At the microscopic level, we aim to establish a solid foundation for the quantitative understanding of biomolecular interactions. At a more coarse-grained level, we develop and employ computational approaches with sound statistical foundations to enhance the separation of information from noise in massive biological data sets, thereby paving the way for discovering, and placing constraints on, higher organizational principles in biology. A major goal of our group is to foster a solid connection between medical research and fundamental scientific research.
Molecular Interactions (MI)
Our studies at the microscopic level have concentrated on the most important component of biomolecular interactions, namely electrostatics. These studies aim to provide an accurate description of electrostatic interactions among biomolecules. This effort has resulted in a new electrostatics formulation for systems involving complicated dielectric media. The new formulation permits, for the first time, a controllable approximation for the calculation of electrostatic energy and forces [1-3]. Consequently, one can readily estimate the magnitude of the error in each computed quantity, and one can improve the accuracy as much as desired by incorporating more of the prescribed correction terms in the computation. We are also investigating the quantum mechanical effects that govern molecular binding and interactions.
Molecular/Information Networks (MN)
The advent of the genomic era has enabled the rapid accumulation of information, including DNA/protein sequence data, protein/RNA structural data, and biomolecular interaction data. These valuable, and often redundant, data allow researchers to mine relevant information at various organizational levels, ranging from determining active sites in protein domains to uncovering relations among functional pathways and even whole-cell organization. However, the same data, combined in different ways, can also serve as the basis of two conflicting claims. To avoid errors introduced through additional annotations, we have developed a method, called information flow, to detect the information transduction modules responsible for propagating information from one node in the network to another. When applied to a protein-protein interaction network, this method highlights the nodes involved in the relevant biological pathways connecting the two specified nodes [4]. This framework is also applicable to information filtering in any community network, such as recommendation systems [5-7]. We are currently constructing other means of extracting important information from a generic interaction network in a meaningful way.
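To make the idea of information propagating from one node to another concrete, the following is a minimal sketch. It assumes a generic absorbing random walk on a small undirected network and is not the group's published algorithm. The function name `expected_visits`, the `damping` parameter, and the four-node `toy_network` are all hypothetical and introduced only for illustration. A walker starts at a source node, moves to a uniformly chosen neighbor at each step, and stops when it reaches the sink; nodes that accumulate many expected visits lie on routes connecting the two.

```python
def expected_visits(adjacency, source, sink, damping=1.0, n_steps=2000):
    """Expected number of visits to each non-sink node by a random walk
    that starts at `source` and is absorbed at `sink`.

    Illustrative assumption: at each step the walker survives with
    probability `damping` and then moves to a uniformly chosen neighbor.
    """
    if source == sink:
        raise ValueError("source and sink must be different nodes")

    # Probability mass currently sitting on each transient node.
    mass = {source: 1.0}
    visits = {node: 0.0 for node in adjacency if node != sink}

    for _ in range(n_steps):
        next_mass = {}
        for node, m in mass.items():
            visits[node] += m  # count the current visit
            neighbors = adjacency[node]
            share = damping * m / len(neighbors)
            for nb in neighbors:
                if nb != sink:  # mass that reaches the sink is absorbed
                    next_mass[nb] = next_mass.get(nb, 0.0) + share
        mass = next_mass
    return visits


# Hypothetical toy network: a path A - B - C with a dead-end branch B - D.
toy_network = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B"],
}

undamped = expected_visits(toy_network, "A", "C")
damped = expected_visits(toy_network, "A", "C", damping=0.5)
for node in sorted(undamped):
    print(node, round(undamped[node], 3), round(damped[node], 3))
```

In this toy example, lowering `damping` reduces the expected visits to the dead-end node D relative to the nodes on the direct route, which illustrates how a dissipation-like parameter can focus attention on the nodes most relevant to connecting the two endpoints.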
Mass Spectrometry, Statistics, and Proteomics (MS)
At a more macroscopic level, we are interested in several topics where robust statistical analyses have proven valuable. In the realm of sequence alignment, we have worked on improving statistical accuracy and retrieval efficiency through a series of developments [8-16]. In bioinformatics and proteomics, we have invested substantial effort in developing tools with robust statistical foundations for extracting biologically relevant information. For example, we have developed computational tools for peptide/protein identification from tandem mass spectrometry (MS/MS) data [17-20] and methods to improve statistical significance assignment [cite] in this area. We have also integrated existing knowledge, such as protein modifications and their accompanying disease associations, into our peptide searches. Our goal in this general direction is to enhance the separation of information from noise in massive biological data sets, thereby placing constraints on higher organizational principles in biology that are yet to be discovered.
Dr. Yu received a B.S. in physics from National Taiwan University and a Ph.D. in theoretical physics from Columbia University. He did his postdoctoral research at Case Western Reserve University. Prior to establishing the QMBP Group at the NCBI, he was a tenured Associate Professor at Florida Atlantic University.
- Yu YK, Hwa T. Statistical significance of probabilistic sequence alignment and related local hidden Markov models. J Comput Biol. 2001;8(3):249-82.
- Yu YK, Wootton JC, Altschul SF. The compositional adjustment of amino acid substitution matrices. Proc Natl Acad Sci U S A. 2003;100(26):15688-93.
- Doerr TP, Yu YK. Electrostatics of charged dielectric spheres with application to biological systems. Phys Rev E Stat Nonlin Soft Matter Phys. 2006;73(6 Pt 1):061902.
- Alves G, Ogurtsov AY, Yu YK. RAId_aPS: MS/MS analysis with multiple scoring functions and spectrum-specific statistics. PLoS One. 2010;5(11):e15438.
- Stojmirović A, Yu YK. Information flow in interaction networks II: channels, path lengths, and potentials. J Comput Biol. 2012;19(4):379-403.
Related Scientific Focus Areas
Biomedical Engineering and Biophysics
This page was last updated on Friday, May 10, 2013