A Rough Set Based Algorithm for Partitioning Massive Data*

pp. 249-256

Keywords: rough set, data partitioning, distributed processing

Abstract:

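Handling massive data has long been an important problem in data mining. Many parallel and serial algorithms have been proposed for massive data, but they generally fail to resolve the tension between speed and accuracy. Because distributed computing has clear advantages for data processing, this paper splits an original massive dataset into many independent small datasets that can be processed in a distributed manner. We first define the optimal partition based on the characteristics of rough sets, and then propose a massive-data partitioning algorithm that searches for this optimal partition. Experiments show that a distributed processing scheme combined with the proposed partitioning algorithm processes massive data quickly, and that its accuracy remains relatively high compared with algorithms that process the entire dataset.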
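The abstract does not give the partitioning algorithm or its definition of an optimal partition, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a decision table stored as a list of dictionaries and splits it by the value of one condition attribute. The function names (`equivalence_classes`, `positive_region_size`, `split_by_attribute`) and the toy table are hypothetical. The sketch illustrates one standard rough set property: when the split attribute belongs to the condition attribute set, every indiscernibility class falls entirely inside one block, so the positive region of the whole table can be computed block by block.

```python
from collections import defaultdict


def equivalence_classes(rows, attrs):
    """Group row indices that agree on every attribute in attrs."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    return list(classes.values())


def positive_region_size(rows, cond_attrs, dec_attr):
    """Count rows whose condition class maps to a single decision value."""
    size = 0
    for cls in equivalence_classes(rows, cond_attrs):
        if len({rows[i][dec_attr] for i in cls}) == 1:
            size += len(cls)
    return size


def split_by_attribute(rows, attr):
    """Split the table into independent blocks, one per value of attr."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[row[attr]].append(row)
    return dict(blocks)


# Hypothetical toy decision table; "play" is the decision attribute.
table = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "no"},  # conflicts with the row above
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
cond = ["outlook", "windy"]

whole = positive_region_size(table, cond, "play")
blocks = split_by_attribute(table, "outlook")
per_block = sum(positive_region_size(b, cond, "play") for b in blocks.values())

print(whole, per_block)  # prints "3 3"
```

Each block could then be handed to a separate worker. Splitting on an attribute outside the condition set, or splitting rows arbitrarily, would not in general preserve the positive region, which is presumably why the choice of partition matters.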
