Wikipedia and Simple Wikipedia Lead Section Pairs for Nine Categories
The dataset (categorized_dataset folder) contains 9 files in .csv format, each a collection of 10,000 lead section pairs sourced from Wikipedia (https://www.wikipedia.org/) and...
Metadata and Analysis of Clinical Information Extraction Publications Using...
This dataset contains the data collected on all the papers analyzed in our publication, entitled "Harnessing Large Language Models for Clinical Information Extraction: A...
Typewritten Digital Representations of Portuguese Cultural Heritage...
The dataset contains typewritten Portuguese documents extracted from the Arquivo Nacional da Torre do Tombo (https://digitarq.arquivos.pt/). It includes records from two fonds of the...
Manual Transcriptions of Typewritten Digital Representations of Portuguese...
The dataset includes manual transcriptions of typewritten digital representations of Portuguese cultural heritage documents from the 20th century, extracted from the Arquivo...
Immersive Learning Thematic Network Data
Information for a database where practices, strategies and uses of Immersive Learning are connected with works in the field. These Practices, Strategies and Uses were found...
Matrix profile analysis of Dansgaard-Oeschger events in palaeoclimate time series
This dataset includes all the data files and computational notebooks required to reproduce the work reported in the paper “Characterisation of Dansgaard-Oeschger events in...
Automatic Quality Assessment of Wikipedia Articles - A Systematic Literature...
This is the result dataset related to the article entitled "Automatic Quality Assessment of Wikipedia Articles - A Systematic Literature Review", which is a systematic...
Wikipedia information quality comparison between idioms
Source code and dataset from the first part of my master's dissertation, "Avaliação da qualidade da Wikipédia enquanto fonte de informação em saúde" (Wikipedia quality assessment...
Solar power forecasting: measurements and numerical weather predictions
This dataset contains hourly power measurements from a 16.32 kW peak PV power plant located in the north of Portugal (Smart Grid and Electric Vehicle Lab – SGEVL, at INESC TEC),...
SIGARRA News Corpus
This dataset was taken from the SIGARRA information system at the University of Porto (UP). Every organic unit has its own domain and produces academic news. We collected a...