Sequence Pattern Mining in Data Streams

H. M. Hijawi, M. H. Saheb

Abstract


Sequential pattern mining in a data stream environment is an interesting data mining problem. Finding sequential patterns in static databases has been studied extensively in past years; however, mining sequential patterns in data streams is still an active field of research. This paper introduces a new greedy sequence pattern mining algorithm for data streams, which is used to find strongly supported sequences. The proposed algorithm builds on the sequence tree, which is used to find sequential patterns in static databases. It divides the stream into patches, or windows, and each patch updates the sequence tree built from the previous windows. An example is introduced to explain how the algorithm works. We also show the efficiency and effectiveness of the proposed algorithm on a synthetic dataset and demonstrate how it is suited to the data stream environment. We show experimentally that the proposed algorithm is more efficient than the PrefixSpan algorithm in CPU time for patterns with support below 30%, and in memory usage for patterns with support below 60%.
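
The window-by-window update described above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a plain prefix tree in which each node counts the sequences containing a pattern, and it enumerates every subsequence up to a fixed length instead of applying a greedy strategy. The names SequenceTree, max_len, update, and frequent are hypothetical and chosen only for illustration.

```python
from itertools import combinations


class SequenceTree:
    """Illustrative prefix tree: each node counts sequences containing a pattern."""

    def __init__(self, max_len=3):
        self.max_len = max_len  # assumed cap on pattern length (not from the paper)
        self.root = {}          # item -> [count, children]
        self.total = 0          # number of sequences seen so far

    def _patterns(self, seq):
        # Collect the distinct ordered subsequences of seq, up to max_len items.
        found = set()
        for k in range(1, self.max_len + 1):
            for idx in combinations(range(len(seq)), k):
                found.add(tuple(seq[i] for i in idx))
        return found

    def update(self, window):
        # Fold one window of the stream into the tree built from earlier windows.
        for seq in window:
            self.total += 1
            for pattern in self._patterns(seq):
                children = self.root
                for item in pattern:
                    node = children.setdefault(item, [0, {}])
                    children = node[1]
                node[0] += 1  # count each pattern once per sequence

    def frequent(self, min_support):
        # Return {pattern: count} for patterns whose support meets min_support.
        threshold = min_support * self.total
        result = {}
        stack = [((), self.root)]
        while stack:
            prefix, children = stack.pop()
            for item, (count, grandchildren) in children.items():
                # An extension cannot be more frequent than its prefix,
                # so branches below the threshold are skipped entirely.
                if count >= threshold:
                    pattern = prefix + (item,)
                    result[pattern] = count
                    stack.append((pattern, grandchildren))
        return result


if __name__ == "__main__":
    tree = SequenceTree(max_len=3)
    stream = [
        [("a", "b", "c"), ("a", "c")],  # first window
        [("a", "b"), ("b", "c")],       # second window
    ]
    for window in stream:
        tree.update(window)
    print(tree.frequent(0.5))
```

Because each window only increments counts in the existing tree, earlier windows never need to be rescanned, which is the property that makes a tree of this kind attractive for streams. The exhaustive enumeration used here grows quickly with sequence length, so it is suitable only as an illustration.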



DOI: http://dx.doi.org/10.5539/cis.v8n3p64

Computer and Information Science   ISSN 1913-8989 (Print)   ISSN 1913-8997 (Online)
Copyright © Canadian Center of Science and Education
