%0 Journal Article
%T Independent Principal Component Analysis for biologically meaningful dimension reduction of large biological data sets
%A Fangzhou Yao
%A Jeff Coquery
%A Kim-Anh Lê Cao
%J BMC Bioinformatics
%D 2012
%I BioMed Central
%R 10.1186/1471-2105-13-24
%X We propose Independent Principal Component Analysis (IPCA) that combines the advantages of both PCA and ICA. It uses ICA as a denoising process of the loading vectors produced by PCA to better highlight the important biological entities and reveal insightful patterns in the data. The result is a better clustering of the biological samples on graphical representations. In addition, a sparse version is proposed that performs an internal variable selection to identify biologically relevant features (sIPCA). On simulation studies and real data sets, we showed that IPCA offers a better visualization of the data than ICA and with a smaller number of components than PCA. Furthermore, a preliminary investigation of the list of genes selected with sIPCA demonstrates that the approach is well able to highlight relevant genes in the data with respect to the biological experiment. IPCA and sIPCA are both implemented in the R package mixomics dedicated to the analysis and exploration of high dimensional biological data sets, and on mixomics' web-interface. With the development of high throughput technologies, such as microarray and next generation sequencing, the exploration of high throughput data sets is becoming a necessity to unveil the relevant information contained in the data. Efficient exploratory tools are therefore needed, not only to assess the quality of the data, but also to give a comprehensive overview of the system, extract significant information and cope with the high dimensionality.
Indeed, many statistical approaches fail or perform poorly for two main reasons: the number of samples (or observations) is much smaller than the number of variables (the biological entities that are measured) and the data are extremely noisy. In this study, we are interested in the application of unsupervised approaches to discover novel biological mechanisms and reveal insightful patterns while reducing the dimension in the data. Amongst the different categories of unsupervised a
%U http://www.biomedcentral.com/1471-2105/13/24