Text2Story Lusa

The Text2Story Lusa dataset contains 357 news articles in European Portuguese, published by the Lusa news agency mostly between October 2020 and December 2020. Each article is provided as a plain-text file (.txt) that includes the publication date, location, headline, and content. A JSON file containing all news articles is also included. This dataset was initially developed in the context of the project "Text2Story: Extracting journalistic narratives from text and representing them in a narrative modeling language" / NORTE-01-0145-FEDER-03185.
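
Once access has been granted, the files could be read along the following lines. This is only a minimal sketch: the internal layout of the JSON file and the names of the .txt files are not documented on this page, so the code assumes the JSON holds either a list of article objects or a dictionary mapping identifiers to article objects. The function names `load_articles` and `load_txt_articles` are hypothetical helpers, not part of the dataset.

```python
import json
from pathlib import Path


def load_articles(json_path):
    """Load all articles from the dataset's JSON file.

    Assumption: the file holds either a list of article objects or a
    dict mapping identifiers to article objects; the real layout may differ.
    """
    with Path(json_path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    if isinstance(data, dict):
        # Keep only the article objects, dropping the identifiers.
        return list(data.values())
    return list(data)


def load_txt_articles(directory):
    """Read every .txt file in a directory into {file stem: raw text}."""
    return {
        path.stem: path.read_text(encoding="utf-8")
        for path in sorted(Path(directory).glob("*.txt"))
    }
```

Both helpers read the files as UTF-8 so that Portuguese diacritics are preserved; adjust the encoding if the delivered files use a different one.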

To request access to this dataset, please fill out the request form (Text2Story Lusa - Request Form) and send it to joao.a.castro@inesctec.pt

If you use this resource, please use the following citations (paper and dataset):

Nunes, S., Jorge, A., Amorim, E., Sousa, H., Leal, A., Silvano, P., Cantante, I., & Campos, R. (2024). Text2Story Lusa: A Dataset for Narrative Analysis in European Portuguese News Articles. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024).

Nunes, S., Jorge, A., Leal, A., Amorim, E., Sousa, H., Cantante, I., Silvano, P., & Campos, R. (2023). Text2Story Lusa [Data set]. INESC TEC. https://doi.org/10.25747/ET95-BX90

Data and Resources

Additional Information

Field Value
Author Sérgio Nunes, Alípio Jorge, António Leal, Evelin Amorim, Hugo Sousa, Inês Cantante, Purificação Silvano, Ricardo Campos
Last Updated April 29, 2024, 13:22 (UTC)
Created May 16, 2023, 10:32 (UTC)
Citation Nunes, S., Jorge, A., Leal, A., Amorim, E., Sousa, H., Cantante, I., Silvano, P., & Campos, R. (2023). Text2Story Lusa [Data set]. INESC TEC. https://doi.org/10.25747/ET95-BX90
Contributor Lusa, Agência de Notícias de Portugal, S.A.
Creation Date 2023-01-07
DOI https://doi.org/10.25747/et95-bx90
File Size 400 KB
Format JSON
Language European Portuguese
Temporal Coverage 2020 - 2021
Type News articles in text format