Computer Science (计算机科学), 2004
Analysis of Browsing Behaviour in Web Log Based on Time Density
Abstract:
To address the problem of choosing the threshold for session identification in Web log mining, a frequency analysis method based on time intervals is introduced. First, a user's visit frequency, taken with respect to a scale parameter of the time interval, is defined as a random vector, and a cut-tail (tail truncation) algorithm for this random vector is given. Second, a frequency-user IP relevance matrix is constructed, in which rows correspond to frequency and columns to user IP, and each element is the corresponding user's visit frequency on the time interval. Users with different IPs are then classified by measuring the similarity between column vectors, and their browsing behaviour is discussed. Finally, parametric estimation and testing of the session identification threshold are given by further sampling-
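
As a rough illustration of the matrix construction and column comparison described in the abstract, the following Python sketch bins the gaps between consecutive requests of each IP and compares users by cosine similarity. The abstract does not give the exact definitions, so the bin width, the clamping of long gaps into the last bin (a crude stand-in for the cut-tail step), the choice of cosine similarity, and the function names `interval_frequency_matrix` and `cosine_similarity` are all assumptions made for this example, not the authors' method.

```python
import math
from collections import defaultdict


def interval_frequency_matrix(log, bin_width, n_bins):
    """Build a (time-interval bin) x (user IP) relative-frequency matrix.

    log: iterable of (ip, timestamp_in_seconds) pairs.
    Returns (ips, matrix), where matrix[i][j] is the share of user j's
    inter-request gaps that fall into bin i.
    Assumption: bins are equal-width; the paper's scale parameter may differ.
    """
    times = defaultdict(list)
    for ip, ts in log:
        times[ip].append(ts)

    ips = sorted(times)
    matrix = [[0.0] * len(ips) for _ in range(n_bins)]
    for j, ip in enumerate(ips):
        stamps = sorted(times[ip])
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        for gap in gaps:
            # Assumption: gaps beyond the last bin are clamped into it,
            # standing in for the tail truncation mentioned in the abstract.
            i = min(int(gap // bin_width), n_bins - 1)
            matrix[i][j] += 1.0 / len(gaps)
    return ips, matrix


def cosine_similarity(matrix, j, k):
    """Cosine similarity between columns j and k (assumed similarity measure)."""
    col_j = [row[j] for row in matrix]
    col_k = [row[k] for row in matrix]
    dot = sum(a * b for a, b in zip(col_j, col_k))
    norm = math.sqrt(sum(a * a for a in col_j)) * math.sqrt(sum(b * b for b in col_k))
    return dot / norm if norm else 0.0


# Hypothetical log: users "a" and "b" click every few seconds, "c" every few minutes.
sample_log = [
    ("a", 0), ("a", 10), ("a", 20),
    ("b", 0), ("b", 12), ("b", 25),
    ("c", 0), ("c", 500), ("c", 1000),
]
ips, freq = interval_frequency_matrix(sample_log, bin_width=60, n_bins=5)
sim_ab = cosine_similarity(freq, ips.index("a"), ips.index("b"))
sim_ac = cosine_similarity(freq, ips.index("a"), ips.index("c"))
```

In this made-up log, users with similar request rhythms end up with similar columns, so a clustering step or a similarity cutoff over the columns could group them. How the session threshold is then estimated and tested is not specified in the abstract and is not attempted here.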