Labadain-Avaliadór : A Test Collection for Tetun Ad-hoc Text Retrieval Task

The Labadain-Avaliadór dataset is a test collection for ad-hoc text retrieval in Tetun. It comprises 59 topics, 33,550 documents, and 5,900 query-document relevance judgments (qrels), with an average of 36.76 relevant documents per query. The queries are drawn from real-world search activity through two channels: Google Search Console logs for Timor News and internal search logs from the Timor News website. The document collection is derived from the Labadain-30k+ dataset.
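The collection statistics quoted above (judgments per query, relevant documents per query) can be recomputed from a qrels file. The following is a minimal sketch, not the authors' tooling: it assumes the qrels use the common TREC four-column layout (`query_id iteration doc_id grade`) and that any grade of 1 or higher counts as relevant. This record confirms neither assumption. The function name `summarize_qrels` and the sample rows are hypothetical.

```python
from collections import defaultdict


def summarize_qrels(lines, min_relevant_grade=1):
    """Summarize qrels given as TREC-style lines (assumed format).

    Each line is assumed to be: query_id iteration doc_id grade
    """
    judged = defaultdict(int)    # judgments per query
    relevant = defaultdict(int)  # relevant judgments per query
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        query_id, _iteration, _doc_id, grade = parts
        judged[query_id] += 1
        if int(grade) >= min_relevant_grade:
            relevant[query_id] += 1

    n_queries = len(judged)
    total_judged = sum(judged.values())
    total_relevant = sum(relevant.values())
    return {
        "queries": n_queries,
        "judgments": total_judged,
        "avg_judged_per_query": total_judged / n_queries if n_queries else 0.0,
        "avg_relevant_per_query": total_relevant / n_queries if n_queries else 0.0,
    }


# Hypothetical sample rows, not taken from the dataset.
sample = [
    "q1 0 doc-a 1",
    "q1 0 doc-b 0",
    "q2 0 doc-a 2",
    "q2 0 doc-c 1",
]
stats = summarize_qrels(sample)
```

Applied to the real qrels, and if the format assumption holds, the same function should report 59 queries and 5,900 judgments, which works out to 100 judged documents per query on average.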

Additional Information

Authors: Gabriel de Jesus, Sérgio Nunes
Last Updated: April 17, 2025, 10:09 (UTC)
Created: March 28, 2025, 10:03 (UTC)
Citation: de Jesus, G., & Nunes, S. (2025). Labadain-Avaliadór : A Test Collection for Tetun Ad-hoc Text Retrieval Task [Data set]. INESC TEC. https://doi.org/10.25747/2K6S-E518
DOI: https://doi.org/10.25747/2K6S-E518
Language: Tetun
Spatial Coverage: Timor-Leste