IAD Index of Academic Documents
Journal : Turkish Journal of Electrical Engineering and Computer Science
Volume : 24, Issue : 5

A new dictionary-based preprocessor that uses radix-190 numbering

Authors : METE ERAY ŞENERGİN, ERHAN ALİRİZA İNCE
Pages : 4465-4480
Article Type : Research Paper
Abstract : Several studies in the literature have shown that placing a preprocessor in front of a standard postcompressor helps achieve higher compression gains on natural-language text files. Since then, much research has gone into preprocessors that improve the gain attained by such concatenated systems. With the same goal in mind, this paper proposes a new word-based preprocessor named METEHAN190 (M190) and contrasts its performance with four other state-of-the-art preprocessors. The experiments used source files from the Wall Street Journal (WSJ) archive and from the Calgary, Canterbury, Gutenberg, and Pizza and Chili corpora. The postcompressors adopted were the Prediction by Partial Matching compressor using method D (PPMD) and the Monstrous PPM II compressor (PPMonstr). In all three experiments, WRT and M190 achieved the two highest compression gains. For small text and transcription files from the Calgary corpus, M190 outperformed all other preprocessors, including WRT. On the other hand, average encoding and decoding times show that the semistatic byte-oriented methods are much faster than the static dictionary-based methods that encode words with characters.
Keywords : Lossless text compression, preprocessing, postcompressor, dictionary, semistatic byte-oriented preprocessors, METEHAN190
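
The abstract does not describe how M190 assigns codes, so the following is only a minimal sketch of what radix-190 numbering of dictionary indices could look like, not the authors' implementation. It assumes each dictionary word is identified by a non-negative integer index and that the index is written as a sequence of base-190 digits, most significant first. The names `RADIX`, `to_radix190`, `from_radix190`, and the toy dictionary are hypothetical and do not come from the paper.

```python
RADIX = 190  # number of distinct digit symbols, taken from the paper's title


def to_radix190(index: int) -> list[int]:
    """Return the base-190 digits of a non-negative index, most significant first."""
    if index < 0:
        raise ValueError("index must be non-negative")
    digits = [index % RADIX]
    index //= RADIX
    while index:
        digits.append(index % RADIX)
        index //= RADIX
    return digits[::-1]


def from_radix190(digits: list[int]) -> int:
    """Inverse of to_radix190: rebuild the integer index from its digits."""
    value = 0
    for d in digits:
        if not 0 <= d < RADIX:
            raise ValueError(f"digit {d} is outside 0..{RADIX - 1}")
        value = value * RADIX + d
    return value


# Hypothetical toy dictionary: word -> index (assumed to be frequency-ranked).
toy_dictionary = {"the": 0, "of": 1, "compression": 190, "preprocessor": 36100}

for word, idx in toy_dictionary.items():
    digits = to_radix190(idx)
    assert from_radix190(digits) == idx  # the mapping is reversible
    print(word, idx, digits)
```

Under these assumptions, one digit distinguishes 190 words and two digits distinguish 190 × 190 = 36,100 words, so a frequency-ranked dictionary would give the most common words the shortest codes. How the digits are mapped to actual characters or bytes is left open here because the abstract does not specify it.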



