|
Molecular diversity techniques for chemical databases

Keywords: computer-based methods, chemical databases, biological screening, combinatorial libraries, dissimilarity-based compound selection, molecules, near-maximally dissimilar, combinatorial algorithm, dataset.

Abstract: There is much current interest in computer-based methods for selecting structurally diverse subsets of chemical databases, e.g., for inclusion in biological screening programmes or for the construction of combinatorial libraries. This paper summarises recent work in Sheffield on dissimilarity-based compound selection, which seeks to identify a maximally dissimilar subset of the molecules in a database. In practice, the approach seeks a near-maximally dissimilar subset, since identifying the most dissimilar subset exactly requires a combinatorial algorithm that considers all possible subsets of a given dataset.
|
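To illustrate why an approximate method is attractive, the following is a minimal sketch of one common greedy heuristic for dissimilarity-based selection, usually called MaxMin. The abstract does not name the algorithm used, so the choice of MaxMin, the Tanimoto-based dissimilarity, the representation of molecules as sets of "on" fingerprint bits, and the names `tanimoto_dissimilarity`, `maxmin_select` and `seed_index` are all assumptions made for illustration only.

```python
from math import comb


def tanimoto_dissimilarity(a: frozenset, b: frozenset) -> float:
    """Return 1 - Tanimoto coefficient for two sets of 'on' fingerprint bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_select(fingerprints, k, seed_index=0):
    """Greedy MaxMin selection (an assumed heuristic, not the paper's stated method).

    Starting from one seed molecule, repeatedly add the molecule whose
    dissimilarity to its nearest already-selected molecule is largest.
    Returns the indices of the selected molecules in the order chosen.
    """
    n = len(fingerprints)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of molecules")

    selected = [seed_index]
    # min_dist[i]: dissimilarity from molecule i to its nearest selected molecule.
    # Selected molecules are marked with -1.0 so they are never chosen again.
    min_dist = [
        tanimoto_dissimilarity(fp, fingerprints[seed_index]) for fp in fingerprints
    ]
    min_dist[seed_index] = -1.0

    while len(selected) < k:
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist[nxt] = -1.0
        for i in range(n):
            if min_dist[i] >= 0.0:
                d = tanimoto_dissimilarity(fingerprints[i], fingerprints[nxt])
                min_dist[i] = min(min_dist[i], d)
    return selected


# Toy, made-up fingerprints (not real molecules).
toy_fps = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 4}),
    frozenset({7, 8, 9}),
    frozenset({1, 2, 3, 4}),
    frozenset({7, 8, 10}),
]

picked = maxmin_select(toy_fps, k=3)
n_subsets = comb(len(toy_fps), 3)  # subsets an exhaustive search would examine
```

The greedy loop performs on the order of n·k dissimilarity calculations, whereas an exhaustive search for the exact optimum would have to score every one of the `comb(n, k)` candidate subsets, which grows far too quickly for databases of realistic size. The result of the greedy procedure depends on the seed and is not guaranteed to be the maximally dissimilar subset.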