%0 Journal Article
%T Time-series data aggregation index
%A 黄向东
%A 郑亮帆
%A 邱明明
%A 张金瑞
%A 王建民
%J 清华大学学报(自然科学版)
%D 2016
%R 10.16511/j.cnki.qhdxxb.2016.21.032
%X Time-series data is key to new industrial development, and aggregation over such data has become one of its main application scenarios. Traditional relational databases cannot support massive volumes of time-series data, while existing NoSQL databases aggregate time-series data inefficiently and slowly. This paper presents an efficient index mechanism for time-series aggregation that combines the ideas of a synopsis table and a segment tree, and implements a query algorithm based on it. The query algorithm introduces the synopsis table into NoSQL to shrink the set of data to be queried, and builds a synopsis forest over the synopsis table so that, in the worst case, the set of data to be queried is further reduced to lb n times the number of indexes. The algorithm also locates the series of index data to be queried directly by computation, which avoids the recursive traversal of ordinary tree structures and substantially reduces disk overhead. Comparison experiments against general index mechanisms verify the usability and efficiency of the proposed index mechanism.
%K index
%K aggregate operation
%K time-series data
%K synopsis table
%K segment tree
%U http://jst.tsinghuajournals.com/CN/Y2016/V56/I3/229
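
The abstract's core idea (per-bucket synopses plus a tree whose nodes are located by arithmetic rather than recursion) can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes integer timestamps, fixed-width buckets, a sum-only aggregate, and a single flat array-backed segment tree in place of the paper's synopsis forest. The names `SynopsisIndex`, `bucket_width`, and `query_sum` are hypothetical.

```python
from bisect import bisect_left


class SynopsisIndex:
    """Sum aggregation over time-series points (illustrative assumption,
    not the paper's implementation)."""

    def __init__(self, points, bucket_width):
        # points: list of (timestamp, value), sorted by non-negative integer timestamp.
        self.points = points
        self.times = [t for t, _ in points]
        self.width = bucket_width
        self.n = (self.times[-1] // bucket_width + 1) if points else 0
        # Flat tree: leaves (one synopsis per bucket) sit at tree[n .. 2n-1];
        # internal node i aggregates its children 2i and 2i+1.
        self.tree = [0.0] * (2 * self.n)
        for t, v in points:
            self.tree[self.n + t // bucket_width] += v
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def _raw_sum(self, start, end):
        # Scan raw points with start <= timestamp < end (partial buckets only).
        lo = bisect_left(self.times, start)
        hi = bisect_left(self.times, end)
        return sum(v for _, v in self.points[lo:hi])

    def _bucket_sum(self, lo, hi):
        # Sum of whole buckets [lo, hi): node positions are computed
        # directly from indices, so no recursive traversal is needed.
        total = 0.0
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:
                total += self.tree[lo]
                lo += 1
            if hi & 1:
                hi -= 1
                total += self.tree[hi]
            lo >>= 1
            hi >>= 1
        return total

    def query_sum(self, start, end):
        """Sum of values with start <= timestamp < end."""
        start = max(start, 0)
        end = min(end, self.n * self.width)
        if start >= end:
            return 0.0
        first_full = -(-start // self.width)  # first bucket fully inside the range
        last_full = end // self.width         # one past the last full bucket
        if first_full >= last_full:
            return self._raw_sum(start, end)
        return (self._raw_sum(start, first_full * self.width)
                + self._bucket_sum(first_full, last_full)
                + self._raw_sum(last_full * self.width, end))


# Usage: index 95 points in buckets of width 10, then aggregate a range.
points = [(t, t % 7) for t in range(95)]
index = SynopsisIndex(points, 10)
print(index.query_sum(13, 87))
```

In this sketch a query touches raw points only in the two partial buckets at the ends of the range; the whole buckets in between are answered from a logarithmic number of tree nodes.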