Labadain-Stopwords: A Curated List of 160 Tetun Stopwords

Labadain-Stopwords is a curated list of 160 Tetun stopwords, compiled from the Labadain-30k+ dataset and validated by native speakers. It is intended for Tetun information retrieval and natural language processing tasks. The list is distributed as plain text with one word per line, so it can be loaded directly into most projects and applications.
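Because the list is plain text with one word per line, loading it takes only a few lines of code. The following is a minimal sketch in Python, not part of the dataset itself. The function names `load_stopwords` and `remove_stopwords` and the file name `labadain_stopwords.txt` are assumptions made for illustration, as are the UTF-8 encoding and the choice to lowercase words before comparing them.

```python
from pathlib import Path


def load_stopwords(path):
    """Read a one-word-per-line file into a set, skipping blank lines.

    Assumes the file is UTF-8 encoded; words are lowercased so that
    lookups are case-insensitive.
    """
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def remove_stopwords(tokens, stopwords):
    """Return the tokens whose lowercase form is not in the stopword set."""
    return [token for token in tokens if token.lower() not in stopwords]


# Hypothetical usage; the file name is a placeholder for wherever the
# downloaded list is stored:
#   stopwords = load_stopwords("labadain_stopwords.txt")
#   content_tokens = remove_stopwords(text.split(), stopwords)
```

Storing the words in a set keeps each membership check constant-time on average, which matters when filtering a large document collection.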

Data and Resources

Additional Information

Field Value
Authors Gabriel de Jesus, Sérgio Nunes
Last updated April 21, 2025, 09:04 (UTC)
Created March 28, 2025, 10:03 (UTC)
Citation de Jesus, G., & Nunes, S. (2025). Labadain-Stopwords: A Curated List of 160 Tetun Stopwords [Data set]. INESC TEC. https://doi.org/10.25747/PG2V-KX70
DOI https://doi.org/10.25747/PG2V-KX70
Language Tetun
Spatial Coverage Timor-Leste