Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2012
Using text data mining techniques for understanding the p53 gene expression regulatory network
Abstract:
To understand the p53 network, the relationship between the p53 gene and its downstream (target) genes was studied by text data mining. Noncommercial software written in Perl 5.10 was used to mine PubMed records on the p53 gene together with the human gene ontology, and the p53 network was constructed by linkage clustering analysis. The results show that the frequency distribution of the target genes has a certain correlation with the gene ontology across all of the text, and that low-frequency genes account for a significantly smaller proportion of the text-mining results than high-frequency genes. This demonstrates that the distribution of genes in the p53 network is closely related to gene frequency, and that text size has an important influence on the accuracy of text data mining.
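The pipeline described above (count gene mentions in a text collection, then group genes by linkage clustering) can be illustrated with a minimal sketch. It is written in Python rather than the Perl 5.10 used in the study, and everything in it is an assumption made for illustration: the four toy sentences stand in for PubMed abstracts, the gene list is hypothetical, and the abstract does not state which distance or linkage criterion was used, so Jaccard distance on co-mention sets and single linkage are swapped in here.

```python
import re
from itertools import combinations

# Hypothetical toy corpus standing in for PubMed abstracts (not real data).
ABSTRACTS = [
    "TP53 activates CDKN1A and MDM2 after DNA damage.",
    "MDM2 binds TP53 and promotes its degradation.",
    "BAX is induced by TP53 during apoptosis; CDKN1A arrests the cell cycle.",
    "CDKN1A and MDM2 are both TP53 targets.",
]
# Hypothetical gene vocabulary; a real run would use a curated gene list.
GENES = ["TP53", "MDM2", "CDKN1A", "BAX"]


def gene_sets(abstracts, genes):
    """Map each gene to the set of abstract indices that mention it."""
    hits = {g: set() for g in genes}
    for i, text in enumerate(abstracts):
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        for g in genes:
            if g in tokens:
                hits[g].add(i)
    return hits


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; assumed distance, not taken from the paper."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def single_linkage(hits):
    """Agglomerative single-linkage clustering.

    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    clusters = [frozenset([g]) for g in sorted(hits)]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Single linkage: distance between the closest pair of members.
            d = min(jaccard_distance(hits[x], hits[y]) for x in c1 for y in c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1), sorted(c2), d))
    return merges


hits = gene_sets(ABSTRACTS, GENES)
frequencies = {g: len(s) for g, s in hits.items()}  # abstracts per gene
merges = single_linkage(hits)

print(frequencies)
for left, right, dist in merges:
    print(left, "+", right, "at distance", round(dist, 3))
```

In this toy corpus the rarely mentioned gene joins the cluster last and at the largest distance, which is one way a gene's frequency in the text can shape its position in a mined network. With so few documents a single extra sentence would change the merge order, a small-scale reminder of why text size matters for accuracy.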