%0 Journal Article
%T The Sample-Seperators Based Distributing Scheme of the External Bucket Sort Algorithm
桶外排序算法的抽样分点分发策略
%A YANG Lei
%A HUANG Hui
%A SONG Tao
%A
杨磊
%A 黄辉
%A 宋涛
%J 软件学报
%D 2005
%I
%X Two ways to sort externally are Multi-Line Merging Sort and Bucket Sort, both with two passes. The Bucket Sort burdens the CPU less and is more efficient, while its usage is restricted heavily by the High-Bit scheme that distributes records into subfiles: the keys have to be integers; the sizes of subfiles may vary too much; the number of subfiles cannot be chosen freely. Based on statistical theory, this paper presents a sample-seperators scheme to broaden the ussage of bucket sort algorithm. A brief discussion on the convergance of sample-seperator estimation is given and the probability to avoid memory overflow is calculated. This scheme enables the bucket sort algorithm to be applied in the SheenkSort system to win the 2003 PennySort (the Indy category) competition.
%K external sort
%K bucket sort
%K multi-line merging
%K distributing scheme
%K sample-separators
%K PennySort
外排序
%K 桶排序
%K 多路归并
%K 分发策略
%K 抽样分点
%K PennySort
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=FDAB523CE84C22F6&yid=2DD7160C83D0ACED&vid=7801E6FC5AE9020C&iid=94C357A881DFC066&sid=5DD21DF25EF52D4A&eid=7004BE6E41AAF52C&journal_id=1000-9825&journal_name=软件学报&referenced_num=3&reference_num=14