Protein complexes are key to understanding the behavior of a cellular system, because many biological functions are carried out by them. Over the past decade, the main strategy for identifying protein complexes from high-throughput network data has been to extract near-cliques or highly dense subgraphs from a single protein-protein interaction (PPI) network. Although the amount of experimental PPI data has grown significantly in recent years, most PPI networks still contain many false positive interactions and are missing many true interactions (false negatives) because of the limitations of high-throughput experiments. The false negatives in particular restrict the search space of conventional protein complex identification approaches. Automatically identifying protein complexes therefore remains one of the most challenging tasks in systems biology.
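To make the effect of false negatives on density-based detection concrete, the following minimal sketch computes the edge density of a toy four-protein complex before and after two interactions are lost. The protein labels, the `density` helper, and the choice of which edges go missing are illustrative assumptions, not data or code from this work.

```python
from itertools import combinations


def density(nodes, edges):
    """Fraction of possible node pairs in `nodes` that are joined by an edge.

    `edges` is a set of frozensets, each holding the two endpoints of an
    undirected interaction.
    """
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(
        1 for u, v in combinations(sorted(nodes), 2)
        if frozenset((u, v)) in edges
    )
    return present / (n * (n - 1) / 2)


# Hypothetical complex of four proteins, fully connected (a clique).
complex_nodes = ["A", "B", "C", "D"]
true_edges = {frozenset(pair) for pair in combinations(complex_nodes, 2)}

# The same complex after two interactions go undetected (false negatives).
observed_edges = true_edges - {frozenset(("A", "B")), frozenset(("C", "D"))}

full = density(complex_nodes, true_edges)          # all 6 pairs present
observed = density(complex_nodes, observed_edges)  # only 4 of 6 pairs present
```

With two of six interactions missing, the complex no longer forms a clique, so a method that only accepts very dense subgraphs could overlook it.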
We propose a new algorithm, NEOComplex (NECC- and Ortholog-based Complex identification by multiple network alignment), which integrates functional orthology information obtainable from different types of multiple network alignment (MNA) approaches to expand the search space of protein complex detection. As part of our approach, we also define a new edge clustering coefficient that assigns weights to interaction edges in PPI networks so that protein complexes can be identified more accurately. This coefficient is based on the intuition that functional information is captured not only in the common neighbors of two interacting proteins but also in the common neighbors of those common neighbors. Our results show that the algorithm outperforms well-known protein complex identification tools in balancing precision and recall on three eukaryotic species: human, yeast, and fly. By drawing on MNAs of these species, the proposed approach can tolerate edge loss in PPI networks and can even discover sparse protein complexes, which have traditionally been difficult to predict.

© 2016 Cheng-Yu Ma, Yi-Ping Phoebe Chen, Bonnie Berger, Chung-Shou Liao
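The intuition behind the edge clustering coefficient described above can be illustrated with a small sketch. The abstract does not give the exact formula, so everything here is an assumption: `edge_clustering_coefficient` uses one common normalization from the literature (shared neighbors divided by the smaller of the two endpoint degrees minus one), and `second_level_common_neighbors` is a hypothetical helper showing one possible reading of "common neighbors of the common neighbors". Neither should be taken as the definition used by NEOComplex.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Nodes adjacent to both u and v."""
    return adj[u] & adj[v]


def edge_clustering_coefficient(adj, u, v):
    """Assumed form: shared neighbors / min(deg(u) - 1, deg(v) - 1)."""
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        return 0.0
    return len(common_neighbors(adj, u, v)) / denom


def second_level_common_neighbors(adj, u, v):
    """Hypothetical second-level support for edge (u, v).

    For every common neighbor w of u and v, collect the common neighbors
    of (u, w) and of (v, w), then drop u, v, and the first-level set.
    """
    first = common_neighbors(adj, u, v)
    second = set()
    for w in first:
        second |= common_neighbors(adj, u, w) | common_neighbors(adj, v, w)
    return second - first - {u, v}


# Toy PPI network with made-up protein labels.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"),
             ("A", "D"), ("C", "D"), ("B", "E")]
adj = build_adjacency(toy_edges)

ecc_ab = edge_clustering_coefficient(adj, "A", "B")
support_ab = second_level_common_neighbors(adj, "A", "B")
```

In this toy network the edge between A and B has a single shared neighbor, C, while D is reached only through the second-level step. A weighting scheme that counts such indirect support could still score the edge highly when some direct interactions are missing.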