%0 Journal Article
%T grDNA-Prot: Prediction of DNA-Binding Proteins Based on Physicochemical Properties of Amino Acids and Support Vector Machine
%A 张艳萍
%A 倪建威
%A 高雅
%A 陈鹏丞
%A 李旭涛
%J Hans Journal of Computational Biology
%P 1-11
%@ 2164-5434
%D 2021
%I Hans Publishing
%R 10.12677/HJCB.2021.111001
%X DNA-binding proteins play an important role in various intra- and extracellular activities. This paper proposes a new method for predicting DNA-binding proteins, grDNA-Prot, which describes protein sequence information using the composition frequencies of the 20 amino acids and a cylindrical graphical representation based on 531 physicochemical property indices of amino acids from the AAindex database. Three feature selection methods are used to select the optimal features, and a support vector machine (SVM) model for identifying DNA-binding proteins is built with 5-fold cross-validation. To verify the effectiveness of the method, its accuracy is compared with that of other methods on an independent test dataset. The results indicate that hydrophobicity (H), physicochemical properties (P), and alpha and turn properties (A) are the main amino acid physicochemical properties that distinguish DNA-binding proteins from non-DNA-binding proteins.
%K DNA-Binding Proteins
%K Physicochemical Properties
%K Graphical Representation
%K Feature Selection
%K SVM
%U http://www.hanspub.org/journal/PaperInformation.aspx?PaperID=41110