Fast and Reliable Analysis of Molecular Motion (DPES-ScIMAP)
The analysis of molecular motions starting from extensive sampling of molecular configurations remains an important and challenging task in computational biology. Existing methods require a significant amount of time to extract the most relevant motion information from such data sets.
This work builds upon the ScIMAP (Scalable Isomap) method [Das et al.], which uses proximity relations and dimensionality reduction and has been shown to reliably extract from simulation data a few parameters that capture the main linear and/or nonlinear modes of motion of a molecular system. Applying ScIMAP to the motions of large molecules remains computationally expensive, however, because ScIMAP depends on exact nearest neighbors.
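To make the role of exact nearest neighbors concrete, the following is a minimal sketch of the generic Isomap-style proximity step: build an exact k-nearest-neighbor graph, then approximate geodesic distances by shortest paths in that graph. It is not ScIMAP's implementation. The function names `knn_graph` and `geodesic_from` are hypothetical, and plain Euclidean distance is assumed as a stand-in for whatever distance is used between molecular configurations.

```python
import heapq
import math

def knn_graph(points, k):
    """Exact k-nearest-neighbor graph by brute force.

    Every point is compared against every other point, so the number of
    distance evaluations grows quadratically with the number of samples.
    """
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        ranked = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in ranked[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph symmetric
    return graph

def geodesic_from(graph, source):
    """Shortest-path (Dijkstra) distances from `source` to all reachable nodes."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return best

# Toy data: 50 points on a unit semicircle. The graph distance between the
# two ends follows the arc, while the straight-line distance cuts across it.
points = [(math.cos(math.pi * i / 49), math.sin(math.pi * i / 49)) for i in range(50)]
graph = knn_graph(points, k=2)
end_to_end = geodesic_from(graph, 0)[49]
```

In this toy case `end_to_end` is close to the arc length rather than the chord length, which is the kind of nonlinear structure that graph-based distances are meant to preserve. The brute-force neighbor search in `knn_graph` is the step whose cost grows fastest with the number of samples.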
The main contribution of this work is a practical tool for molecular motion analysis that significantly reduces the computational bottleneck of ScIMAP while maintaining its overall accuracy. The proposed DPES-ScIMAP (Distance-based Projection onto Euclidean Space ScIMAP) method, motivated by DPES, projects the molecular configurations onto a low-dimensional Euclidean space and computes approximate nearest neighbors in the projected space.
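The projection idea can be illustrated with a short sketch. It assumes a pivot-based scheme: a few configurations are chosen as pivots, each configuration is mapped to the vector of its distances to those pivots, and neighbors are then ranked in that low-dimensional space. The pivot-selection heuristic (greedy farthest-point), the function names `select_pivots`, `project`, and `approx_knn`, and the use of Euclidean distance as the default `dist` are all assumptions made for illustration, not details taken from the text above.

```python
import math
import random

def select_pivots(points, m, dist=math.dist, seed=0):
    """Pick m pivot indices greedily, each far from the pivots chosen so far.

    This farthest-point heuristic is an assumption of this sketch.
    """
    rng = random.Random(seed)
    pivots = [rng.randrange(len(points))]
    # min_dist[i] is the distance from point i to its closest pivot so far.
    min_dist = [dist(p, points[pivots[0]]) for p in points]
    while len(pivots) < m:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        pivots.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], dist(p, points[nxt]))
    return pivots

def project(points, pivots, dist=math.dist):
    """Map each point to an m-dimensional vector of distances to the pivots."""
    return [[dist(p, points[j]) for j in pivots] for p in points]

def approx_knn(points, k, m, dist=math.dist):
    """Approximate k nearest neighbors, ranked in the projected space.

    Only n * m evaluations of the original `dist` are needed for the
    projection; all pairwise comparisons use cheap m-dimensional vectors.
    """
    proj = project(points, select_pivots(points, m, dist), dist)
    n = len(points)
    neighbors = []
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(proj[i], proj[j]),
        )
        neighbors.append(ranked[:k])
    return neighbors

# Toy data: 20 points along a line embedded in 10 dimensions.
points = [[float(i)] + [0.0] * 9 for i in range(20)]
neighbors = approx_knn(points, k=2, m=3)
```

The sketch ranks neighbors by brute force in the projected space to stay short. Because the projected points live in an ordinary low-dimensional Euclidean space, a standard spatial index could take the place of that loop.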
Results on the characterization of protein folding reactions reveal that the folding landscapes emerging from the application of DPES-ScIMAP and ScIMAP are practically indistinguishable. The extracted reaction coordinates identify important features of the folding landscape, including the folded and unfolded states, the transition-state ensemble, and, in the case of CV-N, on-route intermediate ensembles.
In many instances, using DPES-ScIMAP instead of ScIMAP reduces the computational time required to analyze the simulation data from several CPU months to just a few CPU hours.
- Plaku E, Stamati H, Clementi C, and Kavraki LE (2007): "Fast and Reliable Analysis of Molecular Motion Using Proximity Relations and Dimensionality Reduction." Proteins: Structure, Function, and Bioinformatics, vol. 67(4), pp. 897--907
- Plaku E and Kavraki LE (2008): "Quantitative Analysis of Nearest-Neighbors Search in High-Dimensional Sampling-Based Motion Planning." Springer Tracts in Advanced Robotics, vol. 47, pp. 3--18
- Plaku E and Kavraki LE (2007): "Nonlinear Dimensionality Reduction Using Approximate Nearest Neighbors." SIAM International Conference on Data Mining, pp. 180--191