- ArchOnto - January, 2022
  A linked data model for archives composed of five different ontologies - CIDOC CRM (base ontology developed in the museums context), DataObject, N-ary (based in CIDOC...
- Typewritten Digital Representations of Portuguese Cultural Heritage Documents...
  The dataset has typewritten Portuguese documents extracted from the Arquivo Nacional da Torre do Tombo (https://digitarq.arquivos.pt/). It includes records from two fonds of the...
- Evolution of Web search engine interfaces through SERP screenshots and HTML c...
  This dataset was extracted for a study on the evolution of Web search engine interfaces since their appearance. The well-known list of “10 blue links” has evolved into richer...
- Manual Transcriptions of Typewritten Digital Representations of Portuguese Cu...
  The dataset includes manual transcriptions of typewritten digital representations of Portuguese cultural heritage documents from the 20th century, extracted from the Arquivo...
- Immersive Learning Thematic Network Data
  Information for a database where practices, strategies and uses of Immersive Learning are connected with works in the field. These Practices, Strategies and Uses were found...
- Methods and Tools for Causal Discovery and Causal Inference
  Nowadays ML models are used in decision-making processes in real-world problems, by learning a function that maps the observed features with the decision outcomes. However these...
- Wikipedia information quality comparison between idioms
  Source code and dataset from the first part of my Master Dissertation - "Avaliação da qualidade da Wikipédia enquanto fonte de informação em saúde" (Wikipedia quality assessment...
- Images annotated according to their content: a study on the description of da...
  In research data management, data description is a key task so that all datasets that exist in different projects are properly interpreted and understood. However, in many...
- SIGARRA News Corpus
  This dataset was taken from the SIGARRA information system at the University of Porto (UP). Every organic unit has its own domain and produces academic news. We collected a...
- UrbanSense environmental monitoring
  The UrbanSense project is the environmental monitoring part of the Smart City initiative at the city of Porto, Portugal. This dataset contains observational data collected at 23...
- HAREM NER Models for OpenNLP, Stanford CoreNLP, spaCy, NLTK
  Pre-trained models for named entity recognition in Portuguese, using the categories, types and subtypes of the Second HAREM dataset as entity classes.
- SIGARRA News Corpus NER Models for OpenNLP, Stanford CoreNLP, spaCy, NLTK
  Pre-trained models for named entity recognition in Portuguese, using the following entity classes: Hora (Hour), Evento (Event), Organizacao (Organization), Curso (Course),...