- Semantic representation of the Registos de Baptismos da Paróquia de Aldoar...
This dataset comprises mappings of archival records from the National Archives of Portugal to the RiC-O (Records in Contexts Ontology) framework, namely the baptism registries...
- Wikipedia and Simple Wikipedia Lead Section Pairs for Nine Categories
The dataset (categorized_dataset folder) contains 9 files in .csv format, each a collection of 10,000 lead section pairs sourced from Wikipedia (https://www.wikipedia.org/) and...
- Metadata and Analysis of Clinical Information Extraction Publications Using...
This dataset contains all the data collected on all the papers analyzed in our publication, entitled "Harnessing Large Language Models for Clinical Information Extraction: A...
- Immersive Learning Thematic Network Data
Information for a database where practices, strategies and uses of Immersive Learning are connected with works in the field. These Practices, Strategies and Uses were found...
- Matrix profile analysis of Dansgaard-Oeschger events in palaeoclimate time series
This dataset includes all the datafiles and computational notebooks required to reproduce the work reported in the paper “Characterisation of Dansgaard-Oeschger events in...
- Tribunal do Santo Ofício in ArchOnto - Extension of archival records through...
This dataset comprises mappings of representations of archive records in ArchOnto, DBpedia, and Wikidata. These manual representations demonstrate how archive entities are...
- Consensual ArchOnto representation of 13 Portuguese Historical Archival...
The dataset contains archival descriptions represented in the ArchOnto model (https://rdm.inesctec.pt/dataset/cs-2022-004) of 13 records from the 20th century with typewritten...
- ArchOnto - January, 2022
A linked data model for archives composed of five different ontologies - CIDOC CRM (base ontology developed in the museums context), DataObject, N-ary (based in CIDOC...
- Images annotated according to their content: a study on the description of...
Data description is a fundamental step in Research Data Management (RDM). When it comes to images, the challenge is increased, as they have characteristics that differentiate...
- Evolution of Web search engine interfaces through SERP screenshots and HTML...
This dataset was extracted for a study on the evolution of Web search engine interfaces since their appearance. The well-known list of “10 blue links” has evolved into richer...
- Wikipedia information quality comparison between idioms
Source code and dataset from the first part of my Master Dissertation - "Avaliação da qualidade da Wikipédia enquanto fonte de informação em saúde" (Wikipedia quality assessment...
- SIGARRA News Corpus
This dataset was taken from the SIGARRA information system at the University of Porto (UP). Every organic unit has its own domain and produces academic news. We collected a...
- UrbanSense environmental monitoring
The UrbanSense project is the environmental monitoring part of the Smart City initiative at the city of Porto, Portugal. This dataset contains observational data collected at 23...
- HAREM NER Models for OpenNLP, Stanford CoreNLP, spaCy, NLTK
Pre-trained models for named entity recognition in Portuguese, using the categories, types and subtypes of the Second HAREM dataset as entity classes.
- SIGARRA News Corpus NER Models for OpenNLP, Stanford CoreNLP, spaCy, NLTK
Pre-trained models for named entity recognition in Portuguese, using the following entity classes: Hora (Hour), Evento (Event), Organizacao (Organization), Curso (Course),...
The registry can also be accessed through the API (see the API Documentation).