LBNL
Baylor College of Medicine
Houston Medical School, University of Texas.
Wadsworth Center, NYSDH
National Institutes of Health




Program Objective


Figure: tubulin, building block of microtubule

Broad Aim

The program project focuses on highly parallel data processing technology that will result in (1) greatly increased throughput and (2) significantly higher resolution in electron cryo-microscopy of single particles.  The goal is to ensure that computation is not a major rate-limiting factor in the completion of single-particle projects at resolutions better than ~12Å.  One of our objectives is to automate the boxing of particle images so that data sets as large as 10⁵ or 10⁶ particles can be selected with minimal human time and effort. Another objective is to ensure that all of the computational work needed for particle alignment, 3-D reconstruction and refinement of the reconstruction is routinely completed with turnaround times ranging from a few hours to a few tens of hours.

It is our ultimate goal that single-particle cryo-EM should routinely produce structures at atomic resolution.  In many cases, atomic-resolution models of large, multi-protein complexes will be obtained by fitting already-known atomic structures of the component proteins into a moderate-resolution cryo-EM density map of the complex.  The computational technology that we propose is intended to allow the resolution of such EM density maps to extend routinely to ~8-12Å.  In cases where the resolution currently extends to ~8-12Å, the new computational technology would make it possible for the resolution to improve further, to better than ~5Å (at which β-sheet can be easily distinguished from α-helix).  In the future, as the quality of EM images continues to improve, the high-throughput computational technology developed should routinely produce density maps at ~3.5Å (which would allow direct chain-tracing).


Background and Significance

High-resolution electron microscopy has grown to become an important new technology within the field of structural biology. Very high-resolution images have been obtained for two-dimensional crystals, and three-dimensional density maps have been obtained at a resolution of 7-9Å (or even better) for objects with a high degree of internal symmetry.  However, when objects have relatively little or no internal symmetry, a much larger number of single particles must be used to build up the signal-to-noise ratio at high resolution.
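The dependence on particle number can be illustrated with a short calculation. This is only a sketch under a textbook assumption that is not stated in the text above: the images are perfectly aligned and their noise is independent, so the amplitude signal-to-noise ratio of an average of N images grows as the square root of N. The function names and the example values are hypothetical.

```python
import math


def snr_after_averaging(single_image_snr: float, n_images: int) -> float:
    """Amplitude SNR of an average of n aligned images.

    Assumes independent noise in every image (an idealization), so the
    signal adds coherently while the noise grows only as sqrt(n).
    """
    if n_images < 1:
        raise ValueError("n_images must be at least 1")
    return single_image_snr * math.sqrt(n_images)


def images_needed(single_image_snr: float, target_snr: float) -> int:
    """Smallest number of images whose average reaches target_snr."""
    if single_image_snr <= 0 or target_snr <= 0:
        raise ValueError("SNR values must be positive")
    return math.ceil((target_snr / single_image_snr) ** 2)


if __name__ == "__main__":
    # Hypothetical values: halving the per-image SNR quadruples the
    # number of images required to reach the same target.
    for snr in (0.5, 0.25):
        print(snr, images_needed(snr, target_snr=8.0))
```

Under this assumption the required number of images scales with the inverse square of the per-image signal-to-noise ratio, which is one way to see why weak high-resolution signal demands very large data sets.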

The current need for state-of-the-art cryo-EM technology is fueled by the fact that attention in biology is turning more and more to larger structures and assemblies.  Determining the atomic structure of these machines, motors and subcellular structures is becoming increasingly difficult for X-ray crystallography, which is why EM is becoming a very important technology in the field of structural biology.  However, encouraging as it is to have such an advanced technology, its rate of scientific throughput is quite slow, and the routinely achieved resolution is not as high as it could be.  The reason is that the current state of computational technology cannot cope with the task of determining the translational and rotational alignment parameters for the large number of individual particles needed to achieve both high throughput and high resolution.

This is why the software resulting from the program project would be a great asset to researchers involved in cryo-EM.  Our goal is to develop computational technology that will make it possible to obtain high-resolution density maps from EM images of large, isolated macromolecular particles, and to do so at a high rate of throughput. We want to develop versions of single-particle software that will take full advantage of modern, affordable and, most importantly, highly parallel machine architectures.  Parallel machines are able to process large amounts of data simultaneously, thus completing the computing tasks required for cryo-EM in a realistic amount of time.

Currently, very large data sets are necessary (though not sufficient) in order to achieve high resolution.  About 10⁵ asymmetric units are needed to reach a resolution of ~8-12Å, and roughly 10⁶ particles are needed in order to reach 3-5Å.  Not only is the data set extremely large, but the beginning stages of the reconstruction involve human effort to identify and box images of single particles, which further slows down the process (the rate of this particle selection process is usually a mere 10⁴ particles per day or less).  As the demand for larger data sets grows, this becomes a very big problem. Besides the issue of slow selection, there is also the problem of determining the relative position and rotational alignment of each particle.
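The figures quoted above can be turned into a rough estimate of the manual effort involved. The sketch below is illustrative only: the function names are hypothetical, the default rate is the upper bound of 10⁴ particles per day given above (so the results are minimum times), and the conversion from asymmetric units to particles assumes that a particle with n-fold symmetry contributes n asymmetric units.

```python
import math


def particles_for_asymmetric_units(n_units: int, symmetry_order: int = 1) -> int:
    """Particles needed to supply n_units asymmetric units.

    Assumes each particle contributes symmetry_order asymmetric units;
    symmetry_order=1 means a particle with no internal symmetry.
    """
    if symmetry_order < 1:
        raise ValueError("symmetry_order must be at least 1")
    return math.ceil(n_units / symmetry_order)


def manual_boxing_days(n_particles: int, particles_per_day: int = 10**4) -> float:
    """Days of manual particle selection at the given daily rate."""
    if particles_per_day < 1:
        raise ValueError("particles_per_day must be at least 1")
    return n_particles / particles_per_day


if __name__ == "__main__":
    for n in (10**5, 10**6):
        print(f"{n} particles: at least {manual_boxing_days(n):.0f} days of boxing")
```

At that rate, 10⁵ particles take at least 10 days of manual selection and 10⁶ particles at least 100 days, which is why automated boxing is one of the stated objectives.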

Our strategy for developing optimized software that processes large amounts of data in a relatively short time, and yields high resolution, is to first implement pilot versions of the desired code on multiprocessor clusters that are based on commodity PC hardware.  The ultimate goal is to develop computational technology that will improve both resolution and throughput when calculations are run on machines that are affordable (a) for individual laboratories, (b) as shared instrumentation, or (c) as dedicated machines, run for the community as multi-user facilities.


To learn more about our individual projects, go to Research Projects