|
BMC Bioinformatics 2006
A hierarchical Naïve Bayes Model for handling sample heterogeneity in classification problems: an application to tissue microarrays

Abstract: We propose an extension of the well-known Naïve Bayes classifier that accounts for biological heterogeneity in a probabilistic framework, relying on Bayesian hierarchical models. The model, which can be efficiently learned from the training dataset, exploits a closed-form classification equation and thus incurs no additional computational cost with respect to the standard Naïve Bayes classifier. We validated the approach on several simulated datasets, comparing its performance with that of the Naïve Bayes classifier. Moreover, we demonstrated that explicitly dealing with heterogeneity can improve classification accuracy on a tissue microarray (TMA) prostate cancer dataset.

The proposed Hierarchical Naïve Bayes classifier can be conveniently applied to problems where within-sample heterogeneity must be taken into account, such as TMA experiments and biological contexts where several measurements (replicates) are available for the same biological sample. The new approach performs better than the standard Naïve Bayes model, in particular when the within-sample heterogeneity differs between classes.

The biomedical sciences are fraught with uncertainty. The sources of this uncertainty are manifold. Devices used to monitor biological processes vary in resolution. Gaps in the understanding of basic biology compound this problem. Biological diversity or heterogeneity may make predictions difficult. Finally, uncertainty may be due to unpredictable sources of noise, which can be inside or outside the biological system itself.

In molecular biology uncertainty is ubiquitous; for example, tissue heterogeneity makes it difficult to compare a tissue sample composed of pure tumor cell populations with one composed of tumor and other non-tumoral elements such as supporting structural tissues (i.e. stroma) and vessels.
However, in molecular biology, one rarely can examine an entire tumor and biopsies are taken with the assumption that they represent a portion of t
|