Similarity Matrix Based Session Clustering by Sequence Alignment Using Dynamic Programming

  •  K. Duraiswamy    
  •  V. Valli Mayil    


With the rapidly increasing popularity of the WWW, websites play a crucial role in conveying knowledge to end users. Every request to a website, or transaction on the server, is recorded in a file called the server log file. Providing Web administrators with meaningful information about user access behavior (also called click stream data) has become a necessity for improving the quality of Web information and service performance. As such, the hidden knowledge obtained from mining Web server traffic data and user access patterns (called Web Usage Mining) can be used directly for marketing and management of E-business, E-services, E-searching, E-education and so on.

Categorizing visitors or users based on their interaction with a website is a key problem in Web usage mining. The click streams generated by different users often follow distinct patterns. Clustering these access patterns yields knowledge that can help a recommender system find the learning patterns of users in an E-learning system, find groups of visitors with similar interests, provide customized content in a site manager, categorize customers in E-shopping, and so on.

Given session information, this paper focuses on a method for finding session similarity by sequence alignment using dynamic programming, and proposes a similarity matrix as a model for representing session similarity measures. The work follows the Agglomerative Hierarchical Clustering method to cluster the similarity matrix in order to group similar sessions, and the clustering process is depicted in a dendrogram.
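
The pipeline described above can be illustrated with a minimal sketch. The abstract does not state the paper's scoring scheme, normalization, or linkage criterion, so everything here is an assumption made for illustration: Needleman-Wunsch global alignment with match = 2, mismatch = -1 and gap = -1; similarity taken as the alignment score divided by the best possible score for the longer session, with negative values clipped to 0; and average linkage for the agglomerative step. The function names and the sample sessions are hypothetical.

```python
from itertools import combinations

def alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Global alignment score of two page sequences (Needleman-Wunsch).
    The scoring values are illustrative assumptions, not the paper's."""
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # aligning a prefix of a against nothing
        dp[i][0] = i * gap
    for j in range(1, cols):          # aligning a prefix of b against nothing
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = dp[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            dp[i][j] = max(diag, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[-1][-1]

def similarity(a, b, match=2):
    """Assumed normalization: score / best possible score, clipped to [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return max(0.0, alignment_score(a, b, match=match) / (match * longest))

def similarity_matrix(sessions):
    """Symmetric matrix of pairwise session similarities."""
    n = len(sessions)
    sim = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = similarity(sessions[i], sessions[j])
    return sim

def agglomerative(sim, n_clusters):
    """Average-linkage agglomerative clustering on a similarity matrix.
    Returns the final clusters and the merge history (what a dendrogram draws)."""
    clusters = [[i] for i in range(len(sim))]
    merges = []

    def link(c1, c2):
        return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the highest average similarity.
        x, y = max(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        merges.append((clusters[x], clusters[y], link(clusters[x], clusters[y])))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters, merges

# Hypothetical sessions: each is the ordered list of pages one visitor requested.
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "cart", "checkout"],
    ["blog", "about"],
    ["blog", "about", "contact"],
]
sim = similarity_matrix(sessions)
clusters, merges = agglomerative(sim, n_clusters=2)
print(clusters)
```

With these sample sessions, the two shopping sessions end up in one cluster and the two blog sessions in the other. The merge history records which clusters were joined and at what similarity, which is the information a dendrogram plots; a different scoring scheme or linkage rule could change both the matrix and the resulting grouping.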

This work is licensed under a Creative Commons Attribution 4.0 License.
  • ISSN(Print): 1913-8989
  • ISSN(Online): 1913-8997
  • Started: 2008
  • Frequency: quarterly
