PLOS ONE  2012 

Systems Biological Approach of Molecular Descriptors Connectivity: Optimal Descriptors for Oral Bioavailability Prediction

DOI: 10.1371/journal.pone.0040654

Background Poor oral bioavailability is a major cause of drug candidate failure: approximately 50% of drugs in development fail because of unfavorable oral bioavailability. In silico prediction of oral bioavailability (%F) from physicochemical properties is therefore highly needed. Although many computational models have been developed to predict oral bioavailability, their accuracy remains low, with a significant number of false positives. In this study, we present an oral bioavailability model based on a systems biological approach, using a machine learning algorithm coupled with an optimal discriminative set of physicochemical properties.
Results The models were developed from 247 computationally derived physicochemical descriptors of 2279 molecules, of which 969, 605 and 705 molecules correspond to the oral bioavailability, human intestinal absorption (HIA) and Caco-2 permeability data sets, respectively. Partial least squares discriminant analysis showed that 49 descriptors for HIA and 50 descriptors for Caco-2 are the major contributors to classification into groups. Of these, 47 descriptors were common to HIA and Caco-2, which suggests that they play a vital role in classifying oral bioavailability. To determine the best machine learning algorithm, 21 classifiers were compared on the bioavailability data set of 969 molecules with 47 descriptors. Each molecule in the data set was represented by a set of 47 physicochemical properties, with its functional relevance labeled as (+bioavailability/−bioavailability) to indicate good-bioavailability/poor-bioavailability molecules. The best-performing algorithm was the logistic algorithm. The correlation-based feature selection (CFS) algorithm was also applied, which confirmed that these 47 descriptors are the fundamental descriptors for oral bioavailability prediction.
Conclusion The logistic algorithm with the 47 selected descriptors predicted oral bioavailability with an accuracy of more than 71%. Overall, the method captures the fundamental molecular descriptors that can be used as an entity to facilitate prediction of oral bioavailability.
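
The correlation-based feature selection (CFS) step mentioned in the Results can be illustrated with a minimal sketch. It assumes the commonly cited CFS merit heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where k is the number of descriptors in a subset, r_cf is the mean descriptor-class correlation and r_ff is the mean descriptor-descriptor correlation. Using the absolute Pearson correlation as the correlation measure is an assumption made here for simplicity, and the function names and toy descriptor values are hypothetical; none of this is taken from the paper's data or implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def cfs_merit(columns, labels):
    """CFS merit of a descriptor subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(columns)
    if k == 0:
        return 0.0
    # Mean absolute descriptor-class correlation (relevance).
    r_cf = sum(abs(pearson(col, labels)) for col in columns) / k
    # Mean absolute descriptor-descriptor correlation (redundancy).
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Hypothetical toy data: 1 = +bioavailability, 0 = -bioavailability.
labels = [1, 1, 1, 0, 0, 0]
d_relevant = [5.0, 4.8, 5.2, 1.0, 1.2, 0.9]     # tracks the class label
d_redundant = [10.0, 9.6, 10.4, 2.0, 2.4, 1.8]  # d_relevant scaled by 2
d_noise = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]        # uncorrelated with the label

merit_single = cfs_merit([d_relevant], labels)
merit_with_redundant = cfs_merit([d_relevant, d_redundant], labels)
merit_with_noise = cfs_merit([d_relevant, d_noise], labels)
```

In this toy case, adding a perfectly redundant descriptor leaves the merit unchanged and adding an irrelevant one lowers it, which is the behavior that lets CFS favor compact descriptor sets that are relevant to the class but not redundant with each other.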

