BMC Bioinformatics 2007
Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms

Abstract: Using a new definition of molecular contact, small ligands contained in the 2005 PDB edition were identified and processed. The database was enriched in molecular properties. In particular, an automated typing of ligand atoms was performed. A filtering procedure was applied to select a non-redundant dataset of complexes. Data mining was performed to obtain information on the frequencies of different types of atomic contacts. Docking simulations were run with the program DOCK.

We compiled a large database of small ligand-protein complexes, enriched with different calculated properties, that currently contains more than 6000 non-redundant structures. As an example to demonstrate the value of the new database, we derived a new set of chemical matching rules to be used in the context of the program DOCK, based on contact frequencies between ligand atoms and points representing the protein surface, and proved their enhanced efficiency with respect to the default set of rules included in that program.

The new database constitutes a valuable resource for the development of knowledge-based docking algorithms and for testing docking programs on large sets of protein-ligand complexes. The new chemical matching rules proposed in this work significantly increase the success rate in DOCKing simulations. The database developed in this work is available at http://cimlcsext.cim.sld.cu:8080/screeningbrowser/.

Improving our understanding of protein-ligand interactions at the molecular level plays an important role in the discovery process of new drug candidates. The Protein Data Bank (PDB) [1] is the main source of structural information on protein-ligand complexes.
It is constantly being improved through the addition of online tools and links to complementary datasets.

Current virtual screening methodologies for the in silico discovery of new leads rely on databases of chemical complexes with structural information as well as chemical, physical and biological properties, when