Initial Release
Pre-release
This initial release covers the following collections:
robust04 - TREC Disks 4&5. You should remove the READMEs from the corpus directories. The configuration uses an external library to support z-compressed files.
gov2 - the TREC GOV2 corpus. Could also be used for GOV, WT2G, WT10G.
cw09b - the TREC ClueWeb09 corpus.
cw12b - the TREC ClueWeb12 corpus.
core18 - the TREC Washington Post (WAPO) corpus. The configuration uses an extra Terrier plugin to support the WAPO format.
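The README removal recommended for robust04 could be scripted roughly as follows. This is only a sketch and not part of the release: the function names are hypothetical, and it assumes the README files can be recognised by a file name starting with "readme" (case-insensitive), which may not match every file on your copy of the disks. It defaults to a dry run so the matches can be checked before anything is deleted.

```python
from pathlib import Path


def find_readmes(corpus_dir):
    """List files under corpus_dir whose name starts with 'readme'.

    Assumption: README files are identifiable by this name prefix
    (case-insensitive). Adjust the check if your corpus differs.
    """
    root = Path(corpus_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.name.lower().startswith("readme")
    )


def remove_readmes(corpus_dir, dry_run=True):
    """Delete the files found by find_readmes and return their paths.

    With dry_run=True (the default) nothing is deleted.
    """
    matches = find_readmes(corpus_dir)
    for path in matches:
        if not dry_run:
            path.unlink()
    return matches
```

Calling `remove_readmes("/path/to/disk45")` (a placeholder path) returns the candidate files without touching them; pass `dry_run=False` once the list looks right.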
It also covers the following hooks: init, index, search, train.