Tertiary structure is indispensable in revealing the biological functions of
proteins. De novo prediction of
protein tertiary structure depends on protein fold recognition. This study
proposes a novel method for predicting protein fold types that takes the primary
sequence as input. The proposed method, PFP-RFSM, employs
a random forest classifier and a comprehensive feature representation, including
both sequence and predicted structure descriptors. In particular, we
propose a method for generating features based on sequence motifs, and these
features are employed in protein fold prediction for the first time. PFP-RFSM and ten
representative protein fold predictors are evaluated on a benchmark dataset
consisting of 27 fold types. Experiments demonstrate that PFP-RFSM outperforms
all existing protein fold predictors and improves success rates by 2% to 14%.
The results suggest that sequence motifs are effective for the classification and
analysis of protein sequences.
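
As a rough illustration of what a motif-based feature representation might look like, the following minimal sketch turns a primary sequence into one numeric feature per motif. It is not the PFP-RFSM procedure: the abstract does not say how motifs are derived or encoded, so the motif list `EXAMPLE_MOTIFS`, the function name `motif_features`, and the choice of length-normalized overlapping match counts are all assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical motifs written as regular expressions. The motif set used by
# PFP-RFSM and the way it was derived are not given in the abstract.
EXAMPLE_MOTIFS = [r"G.G..G", r"C..C", r"[ST].[RK]"]


def motif_features(sequence: str, motifs: List[str] = EXAMPLE_MOTIFS) -> List[float]:
    """Return one feature per motif: overlapping match count divided by length."""
    seq = sequence.upper()
    if not seq:
        # An empty sequence has no matches; avoid dividing by zero.
        return [0.0] * len(motifs)
    features = []
    for motif in motifs:
        # Wrapping the motif in a lookahead counts overlapping occurrences too.
        count = len(re.findall(f"(?=({motif}))", seq))
        # Normalize by length so long and short sequences are comparable
        # (an assumed design choice, not taken from the paper).
        features.append(count / len(seq))
    return features
```

Vectors of this kind could be concatenated with other sequence and predicted structure descriptors and passed to a random forest implementation such as scikit-learn's `RandomForestClassifier`; which implementation and settings the authors used is not stated here.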