%0 Journal Article
%T TF-Cluster: A pipeline for identifying functionally coordinated transcription factors via network decomposition of the shared coexpression connectivity matrix (SCCM)
%A Jeff Nie
%A Ron Stewart
%A Hang Zhang
%A James A Thomson
%A Fang Ruan
%A Xiaoqi Cui
%A Hairong Wei
%J BMC Systems Biology
%D 2011
%I BioMed Central
%R 10.1186/1752-0509-5-53
%X We developed a computational pipeline called TF-Cluster for identifying functionally coordinated TFs in two steps: (1) Construction of a shared coexpression connectivity matrix (SCCM), in which each entry represents the number of shared coexpressed genes between two TFs. This sparse and symmetric matrix embodies a new concept of coexpression networks in which genes are associated in the context of other shared coexpressed genes; (2) Decomposition of the SCCM using a novel heuristic algorithm termed "Triple-Link", which searches for the highest connectivity in the SCCM and then uses the two connected TFs as a primer for growing a TF cluster according to a number of linking criteria. We applied TF-Cluster to microarray data from human stem cells and Arabidopsis roots, and then demonstrated that many of the resulting TF clusters contain functionally coordinated TFs that, based on existing literature, accurately represent a biological process of interest. TF-Cluster can be used to identify a set of TFs controlling a biological process of interest from gene expression data. Its high accuracy in recognizing true positive TFs involved in a biological process makes it extremely valuable in building core GRNs controlling a biological process. The pipeline, implemented in Perl, can be installed on various platforms. Identifying the TFs potentially involved in a biological process is critical to unveiling regulatory mechanisms. 
Applications in which identifying a short list of potentially crucial transcription factors is important include the reprogramming of somatic cells to a pluripotent state [1,2], the transdifferentiation of cells via forced TF expression [3], and the genetic engineering of plants for increased productivity and adaptability [4]. Except for TF-finder [5], there are currently no methods or software specifically tailored to identifying TFs from expression data. Although some well-performing network construction methods, for instance CLR [6], NIR [7], and ARACNE [8], can be used to identify TF
%U http://www.biomedcentral.com/1752-0509/5/53
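
The first step described in the abstract, building the SCCM, can be illustrated with a minimal Python sketch. This is not the authors' Perl pipeline. It assumes, purely for illustration, that a TF's coexpressed genes are its `top_k` most Pearson-correlated genes; the function names, the `top_k` parameter, and the dictionary layout are all hypothetical. The last function only locates the highest-connectivity TF pair that Triple-Link would use as a primer; the linking criteria for growing a cluster are not specified in the abstract, so they are not sketched here.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no coexpression signal
    return cov / (var_x * var_y) ** 0.5


def coexpressed_genes(tf_profile, gene_profiles, top_k):
    """Assumed rule: the top_k genes most correlated with the TF."""
    ranked = sorted(
        gene_profiles,
        key=lambda gene: pearson(tf_profile, gene_profiles[gene]),
        reverse=True,
    )
    return set(ranked[:top_k])


def build_sccm(tf_profiles, gene_profiles, top_k):
    """Symmetric matrix: entry [a][b] counts genes coexpressed with both TFs."""
    coexp = {
        tf: coexpressed_genes(profile, gene_profiles, top_k)
        for tf, profile in tf_profiles.items()
    }
    tfs = sorted(coexp)
    sccm = {tf: {other: 0 for other in tfs} for tf in tfs}
    for a, b in combinations(tfs, 2):
        shared = len(coexp[a] & coexp[b])
        sccm[a][b] = sccm[b][a] = shared
    return sccm


def seed_pair(sccm):
    """TF pair with the highest shared connectivity (the cluster primer)."""
    return max(combinations(sorted(sccm), 2), key=lambda p: sccm[p[0]][p[1]])
```

Calling `build_sccm` on dictionaries that map TF and gene names to expression profiles returns a nested dictionary, and `seed_pair` returns the two TF names sharing the most coexpressed genes. A nested dictionary is used only to keep the sketch dependency-free; because the abstract notes that the matrix is sparse, a sparse matrix representation would likely suit real microarray data better.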