Abstract—Hydrogen bonds (H-bonds) play a key role in both the formation and stabilization of protein structures. However, H-bonds vary greatly in stability: different local interactions may reinforce or weaken an H-bond. This paper describes inductive learning methods to train a protein-independent probabilistic model of H-bond stability from molecular dynamics (MD) simulation trajectories. The training data describe H-bond occurrences at successive times along these trajectories by the values of 32 attributes. The trained model takes the form of a regression tree. Experimental results demonstrate that such models can predict H-bond stability quite well. In particular, their performance is roughly 20% better than that of models based on H-bond energy alone. In addition, they can accurately identify a large fraction of the least stable H-bonds in a given conformation. The paper discusses several extensions that may yield further improvements.
Index Terms—Molecular dynamics, machine learning, regression tree.
I. Chikalov and M. Moshkov are with the Mathematical and Computer Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia (e-mail: igor.chikalov@kaust.edu.sa).
P. Yao and J. C. Latombe are with the Computer Science Department, Stanford University, Stanford, CA 94305, USA (e-mail: latombe@cs.stanford.edu).
Cite: Igor Chikalov, Mikhail Moshkov, Peggy Yao, and Jean-Claude Latombe, "Modelling Hydrogen Bond Stability by Regression Trees," International Journal of Machine Learning and Computing, vol. 2, no. 3, pp. 213-218, 2012.