Semantic Clustering for Large-Scale Documents
- Ming Liu
Abstract
With the explosion of information, clustering large-scale document collections has become increasingly important. This paper proposes a novel document clustering algorithm (CLCL) to address this problem. The algorithm first constructs lexical chains from the feature space to reflect the different topics that the input documents contain, and these lexical chains can be used to separate the documents into clusters. This separation, however, is too rough, so the idea of self-organizing mapping is used to optimize the cluster partition. To agglomerate semantically similar documents into one cluster, the influence of similar features is also considered. Experiments demonstrate that, because the effects of semantic similarities between documents are taken into account, CLCL performs better than traditional document clustering algorithms.
- DOI:10.5539/cis.v3n1p91
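The two-stage idea in the abstract (a rough partition from lexical chains, then a refinement of that partition) can be illustrated with a minimal sketch. This is not the paper's CLCL implementation: the chains are hand-written sets of related terms rather than chains built from a feature space, the refinement is a simplified competitive-learning update (a self-organizing-map-style step with no neighbourhood function), and all names (`rough_partition`, `refine`, `chains`, the sample documents) are hypothetical.

```python
from collections import Counter
import math


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def rough_partition(docs, chains):
    """Stage 1: assign each document to the chain sharing the most terms with it."""
    labels = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        scores = [sum(counts[t] for t in chain) for chain in chains]
        # Ties (including "no chain term found") fall to the first chain.
        labels.append(max(range(len(chains)), key=scores.__getitem__))
    return labels


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def refine(docs, labels, n_clusters, epochs=5, rate=0.5):
    """Stage 2 (assumed stand-in for the SOM step): pull the best-matching
    prototype toward each document, then reassign documents to prototypes."""
    vectors = [dict(Counter(tokenize(d))) for d in docs]
    protos = []
    for k in range(n_clusters):
        # Initialise each prototype as the mean vector of its rough cluster.
        members = [v for v, lab in zip(vectors, labels) if lab == k]
        proto = Counter()
        for v in members:
            for t, c in v.items():
                proto[t] += c / len(members)
        protos.append(dict(proto))
    for _ in range(epochs):
        for v in vectors:
            best = max(range(n_clusters), key=lambda k: cosine(v, protos[k]))
            p = protos[best]
            for t in set(p) | set(v):
                old = p.get(t, 0.0)
                p[t] = old + rate * (v.get(t, 0.0) - old)
    return [max(range(n_clusters), key=lambda k: cosine(v, protos[k]))
            for v in vectors]


# Hypothetical sample data: two topics that share the ambiguous word "bank".
docs = [
    "The bank raised interest rates on loans",
    "Loans and credit depend on interest rates",
    "The river bank flooded after heavy rain",
    "Heavy rain caused the river to flood",
]
chains = [
    {"interest", "loans", "credit", "rates"},
    {"river", "rain", "flood", "flooded"},
]

rough = rough_partition(docs, chains)
refined = refine(docs, rough, n_clusters=len(chains))
print(rough, refined)
```

In this toy setting the rough partition is already clean, so the refinement leaves it unchanged. The second stage matters when chain overlap alone gives ties or weak evidence, because the prototypes compare whole term vectors rather than chain terms only.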