Dataset Construction Times
This folder details the results of the experiments that measured feature computation times. It contains one folder for each of the 6 quality classes (FA, GA, B, C, Start, Stub), all organized in the same manner. Each experiment measured the feature computation times of 500 random articles of each quality class and then averaged them to reduce the impact of outliers. Each report therefore shows the time spent, in seconds, in each stage of feature calculation.

- FA/GA/B/C/Start/Stub:
  - titles.txt: Article titles for each experiment
  - clean_wiki.txt: Time measurements for cleaning wikitext
  - content.txt: Time measurements for calculating content features
  - history.txt: Time measurements for calculating history features
  - readability.txt: Time measurements for calculating readability features
  - revs.txt: Time measurements for fetching the article's revision history
  - style.txt: Time measurements for calculating style features
  - syllables.txt: Time measurements for estimating syllable counts
  - tokenizer.txt: Time measurements for running the word and sentence tokenizers
  - wikitext.txt: Time measurements for fetching the article's wikitext
  - total.txt: Total time measurements
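The folder layout described above lends itself to a small summary script. The following is a minimal sketch, not the authors' tooling: it assumes the archive has been extracted to a local directory and that each timing file holds one measurement in seconds per line. The description does not specify the file format, so `read_times` would need adjusting if the files are laid out differently. The function names `read_times` and `average_stage_times` are hypothetical.

```python
from pathlib import Path
from statistics import mean

# Class and stage names taken from the folder description.
CLASSES = ["FA", "GA", "B", "C", "Start", "Stub"]
STAGES = [
    "clean_wiki", "content", "history", "readability", "revs",
    "style", "syllables", "tokenizer", "wikitext", "total",
]


def read_times(path):
    """Read timing values from one file.

    ASSUMPTION: one floating-point number (seconds) per non-empty line.
    """
    values = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:
            values.append(float(line))
    return values


def average_stage_times(root):
    """Return {class: {stage: mean seconds}} for every timing file found."""
    root = Path(root)
    result = {}
    for cls in CLASSES:
        for stage in STAGES:
            timing_file = root / cls / f"{stage}.txt"
            if not timing_file.is_file():
                continue  # skip files missing from the extracted archive
            times = read_times(timing_file)
            if times:
                result.setdefault(cls, {})[stage] = mean(times)
    return result
```

Calling `average_stage_times("path/to/extracted/folder")` would return a nested dictionary keyed by class and stage, which can then be printed or compared across quality classes.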
Additional Information
Field | Value |
---|---|
Data last updated | June 27, 2022 |
Metadata last updated | May 27, 2024 |
Created | June 27, 2022 |
Format | ZIP |
License | Creative Commons Attribution |
Datastore active | False |
Has views | False |
Id | e4827e0c-1e8d-485e-b498-42a78827fcfa |
Package id | 24f17b48-304f-4c07-8f5d-2c9b62e25730 |
Position | 4 |
State | active |
Url type | upload |