SIGARRA News Corpus
Data and Resources
- sigarra_news_corpus-1000-20170302T1422 (CSV): Comma-separated file with the following columns: news id, title, subtitle,...
- sigarra-news-corpus (ZIP): Annotated news in the standoff format. Each directory represents an organic...
- sigarra-news-corpus (XML): Merged version of the individually annotated news articles, in XML format...
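As a quick way to inspect the CSV resource, the following minimal sketch reads the header row and the first few records without assuming the full column list, which is truncated in the description above. The helper name `preview_csv` and the inline sample data are hypothetical and only illustrate the approach; they are not taken from the corpus.

```python
import csv
import io


def preview_csv(stream, limit=3):
    """Return the header and up to `limit` rows (as dicts) from a CSV text stream."""
    reader = csv.reader(stream)
    header = next(reader, [])  # empty list if the stream has no rows
    rows = []
    for row in reader:
        if len(rows) >= limit:
            break
        rows.append(dict(zip(header, row)))
    return header, rows


# Made-up placeholder data, NOT actual corpus content. Only the first three
# column names come from the resource description; the rest are unknown here.
sample = io.StringIO(
    "news id,title,subtitle\n"
    "1,Example title,Example subtitle\n"
)
header, rows = preview_csv(sample)
print(header)
print(rows)
```

To try this on the downloaded file, pass a file object opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved locally. The UTF-8 encoding is an assumption and may need adjusting.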
Additional Info
Field | Value |
---|---|
Source | https://sigarra.up.pt |
Author | André Pires |
Last Updated | February 19, 2020, 15:37 (UTC) |
Created | June 13, 2017, 14:35 (UTC) |
DOI | https://doi.org/10.25747/s5jn-q370 |
dc.Contributor | José Devezas, Sérgio Nunes |
dc.Coverage.Spatial | Porto |
dc.Coverage.Temporal | 2016-12-14 to 2017-03-01 |
dc.Date | 2017 |
dc.Format | *.csv; *.xml; *.zip |
dc.Format.Extent | 4.22 MB |
dc.Language | PT |
dc.Publisher | INESC TEC |
dc.Relation | Master's thesis: PIRES, André (2017). Named entity recognition on Portuguese web text. Porto: Faculdade de Engenharia da Universidade do Porto. http://hdl.handle.net/10216/106094 |
dc.Type | Entity Annotated News |