- 2017
A Motif Discovery Algorithm for Uncertain Data Streams
Abstract:
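Drawing on the idea of sequence pattern discovery in bioinformatics, this paper proposes a motif discovery algorithm for uncertain data streams based on MEME (multiple expectation-maximization for motif elicitation). Tailored to the characteristics of uncertain data streams, the algorithm introduces a simplified computation method for the uncertain sliding window and improves the symbolization strategy of SAX (symbolic aggregate approximation). Its feasibility is verified on a set of uncertain data streams from an air and missile defense intelligence sensor network, its accuracy is tested by planting different numbers of motifs, and its effectiveness is verified by comparison with existing algorithms under the condition that every tuple has an existence probability of 1.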
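For readers unfamiliar with SAX, the following is a minimal sketch of the standard symbolization step (z-normalization, piecewise aggregate approximation, then mapping to letters through equiprobable Gaussian breakpoints). It is not the improved strategy proposed in this paper, whose details the abstract does not give. The function names `paa` and `sax`, the four-letter alphabet, and the requirement that the series length be divisible by the number of segments are illustrative assumptions.

```python
import math
from bisect import bisect_left
from statistics import NormalDist


def paa(series, segments):
    """Piecewise aggregate approximation: mean of each equal-width chunk.

    Assumption for this sketch: len(series) is divisible by `segments`.
    """
    n = len(series)
    if segments <= 0 or n % segments != 0:
        raise ValueError("series length must be a positive multiple of segments")
    width = n // segments
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(segments)]


def sax(series, segments, alphabet="abcd"):
    """Standard SAX word for `series` (hypothetical helper, not the paper's variant)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    # z-normalize; a constant series is mapped to all zeros
    normalized = [0.0] * n if std == 0 else [(x - mean) / std for x in series]

    # Breakpoints that split the standard normal into equiprobable regions
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # Each PAA value gets the letter of the region it falls into
    return "".join(alphabet[bisect_left(breakpoints, v)]
                   for v in paa(normalized, segments))


if __name__ == "__main__":
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], segments=4))
```

A steadily rising series is mapped to letters in increasing order, and a falling one to letters in decreasing order, so repeated shapes in a stream turn into repeated substrings that a sequence motif finder such as MEME can search for.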