Labadain-Avaliadór : A Test Collection for Tetun Ad-hoc Text Retrieval

The Labadain-Avaliadór dataset is a test collection developed for the ad-hoc retrieval task. It comprises 59 topics, 33,550 documents, and 5,900 query-document relevance judgments (qrels), with an average of 36.76 relevant documents per query. The queries are sourced from real-world search activity, specifically from two channels: Google Search Console logs for Timor News and internal search logs from the Timor News website. The document collection is derived from the Labadain-30k+ dataset.
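The figures above imply 100 judgments per topic on average (5,900 / 59). The sketch below shows one way such statistics could be recomputed from a qrels file. It assumes the common four-column TREC layout (`query_id iteration doc_id relevance`) and treats any grade above 0 as relevant. Neither assumption is confirmed by this page, and the function name `qrels_stats` and the sample rows are hypothetical.

```python
from collections import defaultdict
from io import StringIO


def qrels_stats(lines):
    """Return (num_queries, num_judgments, avg_relevant_per_query).

    Assumes TREC-style rows: query_id iteration doc_id relevance.
    Assumes a grade above 0 means "relevant" (not confirmed by the source).
    """
    judged = defaultdict(int)    # judgments per query
    relevant = defaultdict(int)  # relevant judgments per query
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed rows
        query_id, _iteration, _doc_id, grade = parts
        judged[query_id] += 1
        if int(grade) > 0:
            relevant[query_id] += 1
    num_queries = len(judged)
    num_judgments = sum(judged.values())
    avg_relevant = sum(relevant.values()) / num_queries if num_queries else 0.0
    return num_queries, num_judgments, avg_relevant


# Toy rows invented for illustration; these are not from the dataset.
SAMPLE = """\
q1 0 doc_a 1
q1 0 doc_b 0
q1 0 doc_c 2
q2 0 doc_a 0
q2 0 doc_d 1
"""

print(qrels_stats(StringIO(SAMPLE)))  # (2, 5, 1.5)
```

If the real qrels file follows this layout, passing an open file handle to `qrels_stats` should yield 59 queries and 5,900 judgments. The average would match 36.76 only if the relevance threshold assumed here is the one the authors used.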

Additional Information

Field Value
Author Gabriel de Jesus, Sérgio Nunes
Last Updated April 21, 2025, 09:03 (UTC)
Created March 28, 2025, 10:03 (UTC)
Citation de Jesus, G., & Nunes, S. (2025). Labadain-Avaliadór : A Test Collection for Tetun Ad-hoc Text Retrieval [Data set]. INESC TEC. https://doi.org/10.25747/2K6S-E518
DOI https://doi.org/10.25747/2K6S-E518
Language Tetun
Spatial Coverage Timor-Leste