Text2Story Lusa

The Text2Story Lusa dataset contains 357 news articles published in European Portuguese by the Lusa news agency, mostly between October and December 2020. Each article is provided as a plain-text (.txt) file that includes the publication date, location, headline, and content. A JSON file containing all the news articles is also included. This dataset was initially developed in the context of the project "Text2Story: Extracting journalistic narratives from text and representing them in a narrative modeling language" / NORTE-01-0145-FEDER-03185.
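As a rough illustration of working with the bundled JSON file, the Python sketch below loads the articles and counts them by one field. The JSON schema is not documented here, so everything about it is an assumption: the file is assumed to hold a top-level array of article objects, and the key name `location` is hypothetical. Check the actual file and adjust before relying on this.

```python
import json
from collections import Counter
from pathlib import Path


def load_articles(json_path):
    """Load all articles from the dataset's JSON file.

    Assumption: the file holds a top-level JSON array with one object
    per article. The real layout may differ.
    """
    with Path(json_path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of article objects")
    return data


def count_by_field(articles, field="location"):
    """Count articles per value of `field`.

    The default key name "location" is hypothetical. Articles that lack
    the key are counted under None.
    """
    return Counter(article.get(field) for article in articles)
```

For example, `count_by_field(load_articles("text2story_lusa.json"))` would return a `Counter` of articles per location, where `text2story_lusa.json` is a placeholder file name rather than the name used in the distribution.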

To request access to this dataset, please fill out the request form (Text2Story Lusa - Request Form) and send it to joao.a.castro@inesctec.pt

If you use this resource, please use the following citations (paper and dataset):

Nunes, S., Jorge, A., Amorim, E., Sousa, H., Leal, A., Silvano, P., Cantante, I., & Campos, R. (2024). Text2Story Lusa: A Dataset for Narrative Analysis in European Portuguese News Articles. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024).

Nunes, S., Jorge, A., Leal, A., Amorim, E., Sousa, H., Cantante, I., Silvano, P., & Campos, R. (2023). Text2Story Lusa [Data set]. INESC TEC. https://doi.org/10.25747/ET95-BX90

Data and Resources

Additional Information

Field: Value
Author: Sérgio Nunes, Alípio Jorge, António Leal, Evelin Amorim, Hugo Sousa, Inês Cantante, Purificação Silvano, Ricardo Campos
Last Updated: April 29, 2024, 13:22 (UTC)
Created: May 16, 2023, 10:32 (UTC)
Citation: Nunes, S., Jorge, A., Leal, A., Amorim, E., Sousa, H., Cantante, I., Silvano, P., & Campos, R. (2023). Text2Story Lusa [Data set]. INESC TEC. https://doi.org/10.25747/ET95-BX90
Contributor: Lusa, Agência de Notícias de Portugal, S.A.
Creation Date: 2023-01-07
DOI: https://doi.org/10.25747/et95-bx90
File Size: 400 KB
Format: JSON
Language: European Portuguese
Temporal Coverage: 2020 - 2021
Type: News articles in text format