For Task 1 - Replicability and Task 2 - Reproducibility, we selected the following papers from the TREC 2017 and 2018 Common Core Tracks:
- [Grossman et al., 2017] Grossman, M. R., and Cormack, G. V. (2017). MRG_UWaterloo and Waterloo Cormack Participation in the TREC 2017 Common Core Track. In TREC 2017.
- [Benham et al., 2018] Benham, R., Gallagher, L., Mackenzie, J., Liu, B., Lu, X., Scholer, F., Moffat, A., and Culpepper, J. S. (2018). RMIT at the 2018 TREC CORE Track. In TREC 2018.
The following table reports, for each paper, the names of the runs to be replicated and/or reproduced, together with the datasets and topics to be used for the replicability and reproducibility tasks.
| Paper | Run Name | Replicability Task | Reproducibility Task |
| --- | --- | --- | --- |
| [Grossman et al., 2017] | WCrobust04 and WCrobust0405 | New York Times Annotated Corpus, with TREC 2017 Common Core Topics | TREC Washington Post Corpus, with TREC 2018 Common Core Topics |
| [Benham et al., 2018] | RMITFDA4 and RMITEXTGIGADA5 | TREC Washington Post Corpus, with TREC 2018 Common Core Topics | New York Times Annotated Corpus, with TREC 2017 Common Core Topics |
CENTRE@CLEF 2019 teams up with OSIRRC for Task 1 and Task 2; therefore, runs for these tasks can be submitted to both CENTRE@CLEF 2019 and OSIRRC. For further information, please have a look at the submission guidelines.
Task 3 - Generalizability is a new task and will work as follows:
- Training: participants need to run plain BM25 and, if they wish, also their own system on the test collection used for the TREC 2004 Robust Track (they are allowed to use the corpus, topics, and qrels). Participants need to identify features of the corpus and topics that allow them to predict each system's score with respect to Average Precision (AP).
- Validation: participants can use the test collection of the TREC 2017 Common Core Track (corpus, topics, and qrels) to validate their method and determine which set of features represents the best choice for predicting the AP score of each system. Note that the TREC 2017 Common Core Track topics are an updated version of the TREC 2004 Robust Track topics.
- Test (submission): participants need to use the test collection of the TREC 2018 Common Core Track (corpus and topics only). Note that the TREC 2018 Common Core Track topics are a mix of "old" and "new" topics, where the old topics were used in the TREC 2017 Common Core Track. Participants will submit a run for each system (BM25 and their own system) and an additional file (one for each system) containing the AP score predicted for each topic. The predicted score can be a single value or a value with a corresponding confidence interval.
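The prediction step above can be sketched as follows. This is a minimal, illustrative example only: the single feature (query length), the validation numbers, and the output layout are all assumptions, not part of the official submission format.

```python
import math

# Made-up (query_length, observed AP) pairs from a hypothetical validation
# collection; in the task these would come from TREC 2017 Common Core runs.
validation = [
    (3, 0.21), (5, 0.34), (4, 0.28), (7, 0.41), (2, 0.18), (6, 0.37),
]

# Closed-form simple linear regression: ap ~ a * query_length + b
n = len(validation)
mx = sum(x for x, _ in validation) / n
my = sum(y for _, y in validation) / n
sxx = sum((x - mx) ** 2 for x, _ in validation)
sxy = sum((x - mx) * (y - my) for x, y in validation)
a = sxy / sxx
b = my - a * mx

# Residual standard error gives a crude symmetric confidence interval.
residuals = [y - (a * x + b) for x, y in validation]
se = math.sqrt(sum(r * r for r in residuals) / (n - 2))

def predict_ap(query_length):
    """Return (predicted AP, lower bound, upper bound), clipped to [0, 1]."""
    ap = a * query_length + b
    return ap, max(0.0, ap - 1.96 * se), min(1.0, ap + 1.96 * se)

# Hypothetical test topics: one predicted AP (with interval) per topic.
for topic_id, qlen in [(401, 4), (402, 6)]:
    ap, lo, hi = predict_ap(qlen)
    print(f"{topic_id} {ap:.3f} [{lo:.3f}, {hi:.3f}]")
```

In practice participants would use richer corpus and topic features and a more principled estimator; the point here is only the shape of the pipeline, from validation fitting to per-topic predictions with an optional interval.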
Corpora:
The New York Times Annotated Corpus contains over 1.8 million articles written and published by the New York Times between January 1, 1987 and June 19, 2007. The text in this corpus is formatted in News Industry Text Format (NITF), which is an XML specification that provides a standardized representation for the content and structure of discrete news articles. The dataset is available here.
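As a hedged sketch of what reading one NITF-formatted article might look like, the snippet below parses an invented sample document with the standard library; the element names (`hedline`/`hl1` for the headline, `block`/`p` for the body text) follow common NITF usage, but the sample itself is not taken from the corpus.

```python
import xml.etree.ElementTree as ET

# Invented NITF-like article; structure is illustrative, not an exact
# reproduction of a New York Times Annotated Corpus document.
sample = """<nitf>
  <head><title>Example headline</title></head>
  <body>
    <body.head><hedline><hl1>Example headline</hl1></hedline></body.head>
    <body.content>
      <block class="full_text">
        <p>First paragraph of the article.</p>
        <p>Second paragraph.</p>
      </block>
    </body.content>
  </body>
</nitf>"""

root = ET.fromstring(sample)
headline = root.findtext(".//hedline/hl1")        # headline text
paragraphs = [p.text for p in root.iter("p")]     # all body paragraphs
print(headline)
print(len(paragraphs))
```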
The TREC Washington Post Corpus contains 608,180 news articles and blog posts from January 2012 through August 2017. The articles are stored in JSON format, and include title, byline, date of publication, kicker (a section header), article text broken into paragraphs, and links to embedded images and multimedia. The dataset is available here.
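A minimal sketch of reading one such article is shown below. The corpus is distributed with one JSON object per line; the field names used here (`title`, typed `contents` blocks with paragraph subtypes) reflect the published layout, but this particular record and its values are invented.

```python
import json

# Invented one-line record in the style of the TREC Washington Post Corpus.
line = json.dumps({
    "id": "example-doc-001",
    "title": "Example post title",
    "author": "Jane Doe",
    "contents": [
        {"type": "kicker", "content": "Opinions"},
        {"type": "sanitized_html", "subtype": "paragraph",
         "content": "First paragraph of the post."},
        {"type": "sanitized_html", "subtype": "paragraph",
         "content": "Second paragraph."},
    ],
})

doc = json.loads(line)
# Concatenate only the paragraph blocks, skipping kicker, images, etc.
text = " ".join(
    block["content"] for block in doc["contents"]
    if block.get("subtype") == "paragraph"
)
print(doc["title"], "->", len(text.split()), "words")
```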
The TREC 2004 Robust Corpus corresponds to the set of documents on TREC disks 4 and 5, minus the Congressional Record. This document set contains approximately 528,000 documents. The dataset is available here.
Topics and Qrels: