Datasets

Labadain-ZSRunS: Sparse and Zero-Shot Dense Retrieval Runs with...

1. Overview: Labadain-ZSRunS is a dataset of run files produced by classical sparse and zero-shot dense retrieval models, resulting from experiments on Tetun ad-hoc...
- ZIP
Interface Element Frequencies in Search Engine Results Pages (SERPs) Across...

This dataset contains the data produced for the dissertation "User Interface Variations in Search Engine Results Pages Across Types of Search Queries and Search Engines". The...
- CSV
- HTML
Labadain-Stopwords: A Curated List of 160 Tetun Stopwords

Labadain-Stopwords is a curated list of 160 Tetun stopwords, compiled from the Labadain-30k+ dataset and validated by native speakers. It is well-suited for various Tetun...
- TXT
Labadain-30k+: A Monolingual Tetun Document-Level Audited Dataset

Labadain-30k+ is a monolingual Tetun dataset containing 33,550 documents spanning from June 2001 to September 2023, excluding the years 2004 and 2005, for which no documents are...
- TXT
- PYTHON
Twitter profiles with related topics and websites

This dataset contains two files created for the dissertation "A Social Media Tool for Domain-Specific Information Retrieval - A Case Study in Human Trafficking" by Tito Griné...
- CSV

You can also access this registry using the API (see the API documentation).

5 datasets found