Protein Docking by a Stochastic Optimization Method on the
Euclidean Group Using Semi-Definite Underestimation

Yang Shen

Graduate Program in Systems Engineering, Boston University

e-mail: yangshen@bu.edu

Ioannis Ch. Paschalidis

Center for Information and Systems Engineering, Department of Manufacturing Engineering, and Department of Electrical and Computer Engineering, Boston University, 15 St. Mary’s Street, Brookline, MA 02446

e-mail: yannisp@bu.edu, url: http://ionia.bu.edu/.

Pirooz Vakili

Center for Information and Systems Engineering, and Department of Manufacturing Engineering, Boston University

e-mail: vakili@bu.edu

Sandor Vajda

Department of Biomedical Engineering, Boston University, 44 Cummington Street, Boston, MA 02215,
e-mail: va jda@bu.edu

Proteins interact with each other and other chemical entities in the cell to form complexes. These interactions play a central role in a number of vital functions of the cell such as metabolic control, gene regulation and signal transduction. Meanwhile, many drugs are designed to enhance or suppress the functions of target proteins. Hence, one of the fundamental and challenging problems in computational structural biology is the protein docking problem: determining the 3-dimensional structure of a bound complex in atomic detail from the atomic coordinates of the unbound component molecules.

Formulated as an optimization problem, the final stages of protein docking can be viewed as optimizing a very noisy funnel-like function of binding Gibbs free energy. The conformational space is the space of rigid body motions, i.e., the (special) Euclidean group SE(3), with side chain flexibility. All previous approaches use either systematic sampling or Monte Carlo, neither exploiting the specific shape of the free energy surface. To that end, we have recently introduced a stochastic global optimization method, called Semi-Definite programming based Underestimation (SDU), that constructs a convex quadratic under-estimator to the funnel-like free energy based on a sample set of energy function evaluations and uses the quadratic under-estimator to guide future sampling. We find that the parameterization of SE(3) has a significant impact on the shape of the binding energy funnels and the resulting efficiency of SDU. We introduce a parameterization inspired by the way complexes associate in nature. Our algorithm reduces the number of the costly energy function evaluations by more than 20 times, compared to state-of-the-art Monte Carlo-based algorithms for the same purpose. In addition to the methodology, we will present applications of the algorithm to a benchmark set of protein complexes and the recent targets from CAPRI (Critical Assessment of PRediction of Interactions), the first community-wide docking experiment.