- Turkish Journal of Electrical Engineering and Computer Science
- Volume:25 Issue:3
Using latent semantic analysis for automated keyword extraction from large document corpora
Authors : Tuğba Önal SÜZEK
Pages : 1784-1794
Article Type : Research Paper
Abstract : In this study, we describe a keyword extraction technique that uses latent semantic analysis (LSA) to identify semantically important single topic words, or keywords. We compare our method against two other automated keyword extractors, tf-idf (term frequency-inverse document frequency) and MetaMap, using human-annotated keywords as a reference. Our results suggest that the LSA-based keyword extraction method performs comparably to the other techniques. Therefore, in an incremental update setting, the LSA-based method can be preferred over existing keyword extraction methods for extracting keywords from text descriptions in big data.
Keywords : Bioinformatics, text mining, information retrieval
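
The abstract does not spell out how the LSA scoring works, so the following is only a minimal illustrative sketch of one common way to rank terms with LSA, not the method used in the paper. It assumes a tf-idf-weighted term-document matrix, a truncated singular value decomposition, and a score equal to the length of each term's vector in the reduced space. The function name `lsa_keywords`, the parameters `k` and `top_n`, and the tokenizer are all hypothetical choices made for this example.

```python
import re
from collections import Counter

import numpy as np


def lsa_keywords(docs, k=2, top_n=3):
    """Rank vocabulary terms by the length of their rank-k LSA vectors.

    Illustrative assumption only: the scoring rule is a generic LSA
    heuristic, not the procedure described in the article.
    """
    # Lowercase and split on runs of ASCII letters (a deliberately simple tokenizer).
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: i for i, t in enumerate(vocab)}

    # Term-document matrix of raw counts (rows: terms, columns: documents).
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokenized):
        for term, c in Counter(doc).items():
            counts[index[term], j] = c

    # tf-idf weighting: damp terms that occur in many documents.
    df = (counts > 0).sum(axis=1)
    idf = np.log(len(docs) / df) + 1.0
    weighted = counts * idf[:, None]

    # Truncated SVD: keep the k largest singular values and their term vectors.
    u, s, _ = np.linalg.svd(weighted, full_matrices=False)
    k = min(k, len(s))
    term_vectors = u[:, :k] * s[:k]

    # Score each term by the Euclidean norm of its vector in the latent space.
    scores = np.linalg.norm(term_vectors, axis=1)
    order = np.argsort(-scores)[:top_n]
    return [(vocab[i], float(scores[i])) for i in order]
```

Calling `lsa_keywords(["gene gene gene protein", "protein cell"], k=2, top_n=3)` returns a list of `(term, score)` pairs sorted by descending score. In a real corpus, `k` would be much smaller than the number of documents, and stop-word removal would normally precede the weighting step.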