- SIGARRA News Corpus
  This dataset was taken from the SIGARRA information system at the University of Porto (UP). Every organic unit has its own domain and produces academic news. We collected a...
- Hate speech dataset annotated for Portuguese
  Portuguese Hate Speech Twitter Dataset is a dataset of Twitter messages manually annotated for Hate Speech using a hierarchical structure of classes. 5,668 messages were...
- HAREM NER Models for OpenNLP, Stanford CoreNLP, spaCy, NLTK
  Pre-trained models for named entity recognition in Portuguese, using the categories, types and subtypes of the Second HAREM dataset as entity classes.
- SIGARRA News Corpus NER Models for OpenNLP, Stanford CoreNLP, spaCy, NLTK
  Pre-trained models for named entity recognition in Portuguese, using the following entity classes: Hora (Hour), Evento (Event), Organizacao (Organization), Curso (Course),...
This list can also be accessed through the API (see the API documentation).