Construction of suffix tree using key phrases for document using down-top incremental conceptual hierarchical text clustering approach

Balaji Dhashanamoorthi *

Master of Engineering, Control and Instrumentation, CEG, Anna University, Chennai, India.
 
Research Article
International Journal of Science and Research Archive, 2022, 06(01), 294–307.
Article DOI: 10.30574/ijsra.2022.6.1.0143
Publication history: 
Received on 23 May 2022; revised on 26 June 2022; accepted on 29 June 2022
 
Abstract: 
With the development of World Wide Web technologies, the number of documents in use increases day by day. Document clustering, which plays a vital role in data mining, was introduced to make such documents easier to access. Organizing unstructured and unlabeled documents is a major problem that keeps growing in size and complexity, and handling unorganized documents is expensive. This work addresses the challenges raised by the continuing growth of unstructured and unlabeled documents. Document clustering is one of the most powerful methods for organizing unstructured documents, and numerous clustering methods are available. We propose a phrase-based clustering algorithm that builds on the suffix tree document clustering (STDC) model. The algorithm uses the STDC model to represent documents accurately and to measure the similarity between documents. The proposed algorithm achieves more than 90% accuracy.
 
Keywords: 
Suffix tree document clustering; Retrieval System; Document; Clustering Algorithm; Suffix Tree; Document Clustering
 