- SIGARRA News Corpus
  This dataset was taken from the SIGARRA information system at the University of Porto (UP). Every organic unit has its own domain and produces academic news. We collected a...
- Hate speech dataset annotated for Portuguese
  The Portuguese Hate Speech Twitter Dataset is a collection of Twitter messages manually annotated for hate speech using a hierarchical structure of classes. 5,668 messages were...
- HAREM NER Models for OpenNLP, Stanford CoreNLP, spaCy, NLTK
  Pre-trained models for named entity recognition in Portuguese, using the categories, types and subtypes of the Second HAREM dataset as entity classes.
- SIGARRA News Corpus NER Models for OpenNLP, Stanford CoreNLP, spaCy, NLTK
  Pre-trained models for named entity recognition in Portuguese, using the following entity classes: Hora (Hour), Evento (Event), Organizacao (Organization), Curso (Course),...