
Data Association Algorithm Based on Unequal-Length Sequence Similarity Mining

DOI: 10.13195/j.kzyjc.2014.0388, PP. 1033-1038

Keywords: data association, sequence similarity, unequal length, sliding window, optimal-matching weight enhancement


Abstract:

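To address the problem of associating sequence data of unequal length, this paper proposes a similarity measure for unequal-length sequences based on a sliding window with optimal-matching weight enhancement. The shorter sequence is used as a sliding window that traverses the longer sequence, producing a set of sliding similarities. These similarities are used to form optimal weights, and their weighted combination gives the similarity of the two unequal-length sequences. Association decisions on the sequence data are then made according to the similarity values. This overcomes the limitation of truncation-based similarity measures, which reflect only the local similarity of the truncated sequences. Simulation experiments verify the effectiveness of the proposed algorithm for associating unequal-length sequence data, and the influence of factors such as sequence length and measurement error on the similarity measure and on association performance is discussed.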
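The sliding-window idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the per-window similarity function, the optimal-weight formula, or the decision rule, so everything specific here is an assumption made for illustration: the per-window similarity `1 / (1 + RMSE)`, weights proportional to the window similarities themselves (so better-matching windows count more), the decision threshold of 0.5, and all function names. It is not the authors' method.

```python
import math


def window_similarity(a, b):
    """Similarity of two equal-length sequences, in (0, 1]; 1 means identical.

    Assumption: 1 / (1 + RMSE). The paper's per-window measure is not
    given in the abstract.
    """
    rmse = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
    return 1.0 / (1.0 + rmse)


def sliding_similarities(short, long):
    """Slide the shorter sequence over the longer one, one step at a time."""
    m = len(short)
    return [window_similarity(short, long[i:i + m])
            for i in range(len(long) - m + 1)]


def weighted_similarity(seq_a, seq_b):
    """Combine the sliding similarities into a single similarity value.

    Assumption: each weight is the window's similarity divided by the sum
    of all window similarities. This stands in for the paper's
    "optimal-matching weight enhancement", whose formula is not shown.
    """
    if not seq_a or not seq_b:
        raise ValueError("sequences must be non-empty")
    short, long = sorted((seq_a, seq_b), key=len)
    sims = sliding_similarities(short, long)
    total = sum(sims)
    weights = [s / total for s in sims]
    return sum(w * s for w, s in zip(weights, sims))


def associate(track, candidates, threshold=0.5):
    """Return the index of the most similar candidate, or None.

    Assumption: a simple best-score rule with a hypothetical threshold.
    """
    scores = [weighted_similarity(track, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] >= threshold else None


if __name__ == "__main__":
    track = [1.0, 2.0]
    candidates = [[10.0, 11.0, 12.0], [0.0, 1.0, 2.0, 5.0]]
    print(associate(track, candidates))
```

Because every window of the longer sequence contributes to the result, the score depends on the whole sequence rather than on a single truncated segment, which is the shortcoming of truncation that the abstract highlights.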

