
Initial Release

Pre-release
@ArthurCamara released this 13 Jun 12:53
· 59 commits to master since this release

This initial release covers the following collections:

  • robust04 - TREC Disks 4&5. Remove the README files from the corpus directories before indexing. The configuration uses an external library to read the z-compressed files.
  • gov2 - the TREC GOV2 corpus. The same configuration could also be used for GOV, WT2G, and WT10G.
  • cw09b - the TREC ClueWeb09 corpus.
  • cw12b - the TREC ClueWeb12 corpus.
  • core18 - the TREC Washington Post (WAPO) corpus. The configuration uses an extra Terrier plugin to support the WAPO format.
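For robust04, the README cleanup can be scripted. The following is a minimal sketch, not part of this release: it assumes that the files to remove are exactly those whose names start with "readme" (case-insensitive), and the function names `find_readmes` and `remove_readmes` are hypothetical. Check the listed files against your own copy of the corpus before deleting anything.

```python
from pathlib import Path


def find_readmes(corpus_dir):
    """Return files under corpus_dir whose name starts with 'readme'.

    Assumption: README files follow this naming pattern (case-insensitive).
    """
    root = Path(corpus_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.name.lower().startswith("readme")
    )


def remove_readmes(corpus_dir, dry_run=True):
    """List matching README files; delete them only when dry_run is False."""
    matches = find_readmes(corpus_dir)
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

Calling `remove_readmes("/path/to/disks45")` (a placeholder path) only lists the matches; passing `dry_run=False` deletes them. Defaulting to a dry run makes it harder to delete corpus files by accident.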

The release also supports the following hooks: init, index, search, train.