Data of real-time prediction of Wikipedia articles' quality

This dataset contains data produced for the dissertation "Real-time prediction of Wikipedia articles' quality". The project was conducted by student Pedro Miguel Moás (up201705208@edu.fe.up.pt) at FEUP, University of Porto, for the Master in Informatics and Computing Engineering. Our end goal is to provide Wikipedia users with a reliable and transparent tool for automatically assessing the quality of Wikipedia articles. That way, readers will know beforehand whether an article is worth reading, while editors can easily detect existing flaws in the articles they encounter. We thus propose creating an extension for the Google Chrome browser that uses machine learning to predict, in real time, the quality of Wikipedia articles.

The readme file provides the dataset structure.

Data and Resources

Additional Info

Author: Pedro Miguel Moás, Carla Teixeira Lopes
Last Updated: May 27, 2024, 12:44 (UTC)
Created: June 27, 2022, 12:53 (UTC)
Citation: Moás, P. M., Teixeira Lopes, C. (2022). Data of real-time prediction of Wikipedia articles' quality [Data set]. INESC TEC. https://doi.org/10.25747/2RDD-RC08
DOI: https://doi.org/10.25747/2rdd-rc08
dc.Created: June 2022
dc.File.Size: 2.631 Gb
dc.Type: ML Training datasets and reports, diverse extracted Wikipedia information