%0 Journal Article
%T An Application of Improved Gap-BIDE Algorithm for Discovering Access Patterns
%A Xiuming Yu
%A Meijing Li
%A Taewook Kim
%A Seon-phil Jeong
%A Keun Ho Ryu
%J Applied Computational Intelligence and Soft Computing
%D 2012
%I Hindawi Publishing Corporation
%R 10.1155/2012/593147
%X Discovering access patterns from web log data is a typical sequential pattern mining application, and many access pattern mining algorithms have been proposed. In this paper, we propose an improved version of the Gap-BIDE algorithm to extract user access patterns from web log data. Compared with the previous Gap-BIDE algorithm, the proposed algorithm adds a step that builds a large event set: before generating candidate patterns, it finds the frequent events by discarding the infrequent events that do not occur continuously in an access time. In the experiments, we compare the previous access pattern mining algorithm with the proposed one; the results show that our approach is very efficient at discovering access patterns in large databases.
1. Introduction
The web has become an important channel for conducting business transactions and e-commerce. It also provides a convenient means for people to communicate with each other worldwide. With the rapid development of web technology, the web has become an important and preferred platform for distributing and acquiring information. The data collected automatically by web and application servers represents the navigational behavior of web users, and such data is called web log data. Web mining is a technology for discovering and extracting useful information from web log data. Because of the tremendous growth of information sources, the increasing interest of various research communities, and the recent interest in e-commerce, the area of web mining has become vast and more interesting. It deals with data related to the web, such as data hidden in web contents, data presented on web pages, and data stored on web servers. Based on the kind of data, there are three categories of web mining: web content mining, web structure mining, and web usage mining [1]. Web usage data includes the data from web server access logs, proxy server logs, and browser logs.
It is also known as web access patterns. Web usage mining tries to discover the access patterns from web log files. Web access tracking can be defined as web page history [2]; the mining task is a process of extracting interesting patterns from web access logs. There are many techniques for mining web usage data, including statistical analysis [3], association rules [4], sequential patterns [5–7], classification [8–10], and clustering [11–13]. Access pattern mining is a popular form of sequential pattern mining, which extracts frequent subsequences from a sequence database [14]. Further, discovering access patterns is an
%U http://www.hindawi.com/journals/acisc/2012/593147/