- Wikipedia and Simple Wikipedia Lead Section Pairs for Nine Categories
  The dataset (categorized_dataset folder) contains 9 files in .csv format, each a collection of 10,000 lead section pairs sourced from Wikipedia (https://www.wikipedia.org/) and...
- Metadata and Analysis of Clinical Information Extraction Publications Using...
  This dataset contains all the data collected on all the papers analyzed in our publication, entitled "Harnessing Large Language Models for Clinical Information Extraction: A...
- Manual Transcriptions of Typewritten Digital Representations of Portuguese...
  The dataset includes manual transcriptions of typewritten digital representations of Portuguese cultural heritage documents from the 20th century, extracted from the Arquivo...
- Matrix profile analysis of Dansgaard-Oeschger events in palaeoclimate time series
  This dataset includes all the datafiles and computational notebooks required to reproduce the work reported in the paper “Characterisation of Dansgaard-Oeschger events in...
- Automatic Quality Assessment of Wikipedia Articles - A Systematic Literature...
  This is the result dataset related to the article entitled "Automatic Quality Assessment of Wikipedia Articles - A Systematic Literature Review", which is a systematic...
- Images annotated according to their content: a study on the description of...
  Data description is a fundamental step in Research Data Management (RDM). When it comes to images, the challenge is increased, as they have characteristics that differentiate...
- Evolution of Web search engine interfaces through SERP screenshots and HTML...
  This dataset was extracted for a study on the evolution of Web search engine interfaces since their appearance. The well-known list of “10 blue links” has evolved into richer...
- UrbanSense environmental monitoring
  The UrbanSense project is the environmental monitoring part of the Smart City initiative at the city of Porto, Portugal. This dataset contains observational data collected at 23...