Abstract—To predict protein-protein docking sites in a
massive protein dataset, we built a cloud-computing-based
pipeline. The pipeline conforms to Elastic MapReduce, and its
implementation has three components. First, the cloud
computing layer is built on the open-source Hadoop platform.
Second, the pipeline combines several existing protein-protein
docking site prediction methods. Third, the pipeline uses
networked computing resources to predict protein-protein
docking sites through distributed data processing services.
The results show that our method greatly improves the
performance of protein-protein docking site prediction.
Index Terms—Cloud-computing; protein docking site;
pipeline.
The authors are with the Department of Systems and Computer Science,
Howard University, Washington, DC 20059, USA (e-mail:
hli@scs.howard.edu; jeanclaudetounkara@gmail.com;
chunmei@scs.howard.edu).
Cite: Hui Li, Jean-Claude Tounkara, and Chunmei Liu, "Prediction of Protein-Protein Docking Sites Based on a Cloud-Computing Pipeline," International Journal of Machine Learning and Computing, vol. 2, no. 6, pp. 798-801, 2012.