%0 Journal Article
%T Verbumculus and the Discovery of Unusual Words
%A Alberto Apostolico
%A Fang-Cheng Gong
%A Stefano Lonardi
%J Journal of Computer Science and Technology
%D 2004
%I
%X Measures relating word frequencies and expectations have long been of interest in bioinformatics studies. With sequence data becoming massively available, exhaustive enumeration of such measures has become conceivable, yet it poses a significant computational burden even when limited to words of bounded maximum length. In addition, displaying the huge tables that may result from these counts poses practical problems of visualization and inference. VERBUMCULUS is a suite of software tools for the efficient and fast detection of over- or under-represented words in nucleotide sequences. The inner core of VERBUMCULUS rests on subtly interwoven properties of statistics, pattern matching, and combinatorics on words that make it possible to limit drastically, and a priori, the set of over- or under-represented candidate words of all lengths in a given sequence, thereby making it more feasible to both detect and visualize such words in a fast and practically useful way. This paper is devoted to the descri
%K Verbumculus
%K unusual words
%K subword statistics
%K pattern discovery
%K regulatory elements
%K suffix trees
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=F57FEF5FAEE544283F43708D560ABF1B&aid=77C0596D637E9944CFD7AF977D43C6A8&yid=D0E58B75BFD8E51C&vid=2A8D03AD8076A2E3&iid=CA4FD0336C81A37A&sid=2B25C5E62F83A049&eid=2B25C5E62F83A049&journal_id=1000-9000&journal_name=计算机科学技术学报&referenced_num=2&reference_num=52