-
Interface Element Frequencies in Search Engine Results Pages (SERPs) Across...
This dataset contains the data produced for the dissertation "User Interface Variations in Search Engine Results Pages Across Types of Search Queries and Search Engines". The... -
Labadain-ZSRunS: Sparse and Zero-Shot Dense Retrieval Runs with...
1. Overview Labadain-ZSRunS is a dataset consisting of run files produced by classical sparse and zero-shot dense retrieval models, resulting from experiments on Tetun ad-hoc... -
Labadain-Stopwords: A Curated List of 160 Tetun Stopwords
Labadain-Stopwords is a curated list of 160 Tetun stopwords, compiled from the Labadain-30k+ dataset and validated by native speakers. It is well-suited for various Tetun... -
Labadain-30k+: A Monolingual Tetun Document-Level Audited Dataset
Labadain-30k+ is a monolingual Tetun dataset containing 33,550 documents spanning from June 2001 to September 2023, excluding the years 2004 and 2005, for which no documents are... -
Twitter profiles with related topics and websites
This dataset contains two files created for the dissertation "A Social Media Tool for Domain-Specific Information Retrieval - A Case Study in Human Trafficking" by Tito Griné...
This list can also be accessed via the API (see the API documentation).