Impact of Metadata on Full-text Information Retrieval Performance: An Experimental Research on a Small Scale Turkish Corpus

Capkin C.

TURKISH LIBRARIANSHIP, vol.30, pp.678-701, 2016 (Journal Indexed in ESCI)

  • Volume: 30 Issue: 4
  • Publication Date: 2016
  • Page Numbers: pp.678-701


Information institutions use text-based information retrieval systems to store, index and retrieve metadata, full-text, or both metadata and full-text (hybrid) content. The aim of this research was to evaluate the impact of these content types on information retrieval performance. For this purpose, metadata (MIR), full-text (FIR) and hybrid (HIR) content information retrieval systems were developed with the default Lucene information retrieval model for a small-scale Turkish corpus. To evaluate the performance of these three systems, "precision - recall" and "normalized recall" tests were conducted. Experimental findings showed no significant difference between MIR and FIR in mean average precision (MAP) performance. On the other hand, the MAP performance of HIR was significantly higher than that of MIR and FIR. When information retrieval performance was evaluated from a user-centered perspective, the "normalized recall" performances of MIR and HIR were significantly higher than that of FIR. Additionally, there were no significant differences between the systems in the mean number of relevant documents retrieved. Processing different types of content, such as metadata and full-text, had advantages and disadvantages for information retrieval systems in terms of term management. Hybrid content processing (HIR) brought these advantages together, and information retrieval performance improved.
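
The two measures named in the abstract can be illustrated with a minimal Python sketch. It is not the author's evaluation code. The function names and the toy document identifiers are hypothetical, and the normalized recall shown is one textbook definition (Rocchio's, which assumes a ranking over the whole collection); the abstract does not say which variant the study used.

```python
from typing import Dict, Sequence, Set, Tuple


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Average of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs: Dict[str, Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision over queries; `runs` maps query id to (ranking, relevant set)."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs.values()) / len(runs)


def normalized_recall(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Rocchio-style normalized recall (assumed variant).

    Assumes `ranking` orders the entire collection, so every relevant
    document has a rank. 1.0 is the ideal ordering, 0.0 the worst.
    """
    ranks = [i for i, doc_id in enumerate(ranking, start=1) if doc_id in relevant]
    n = len(ranks)       # number of relevant documents
    total = len(ranking) # collection size
    if n == 0 or n == total:
        raise ValueError("normalized recall is undefined when no or all documents are relevant")
    ideal_rank_sum = n * (n + 1) // 2  # ranks 1..n in the ideal ordering
    return 1.0 - (sum(ranks) - ideal_rank_sum) / (n * (total - n))


# Toy data: one ranking of a five-document collection, judged for two queries.
ranking = ["d1", "d2", "d3", "d4", "d5"]
runs = {
    "q1": (ranking, {"d1", "d3"}),
    "q2": (ranking, {"d2"}),
}
map_score = mean_average_precision(runs)
rnorm_q1 = normalized_recall(ranking, {"d1", "d3"})
```

Average precision rewards relevant documents that appear early in the list, while this form of normalized recall compares the observed ranks of relevant documents against the best and worst possible orderings, which is one reason two systems can rank differently under the two measures.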