This repo includes the benchmark datasets, source code, evaluation toolkit, and experiment results for the SetRank framework, developed for entity-set-aware literature search.
The ./data/ folder contains two benchmark datasets used for evaluating literature search, namely S2-CS and TREC-BIO.
The ./code/ folder includes the baseline models and our proposed SetRank framework (including AutoSetRank). The model implementations depend heavily on ElasticSearch 5.4.0, an open-source search engine for indexing and full-text search. In addition, you need to install the required Python packages with the following command:
$ pip3 install -r requirements.txt
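Before building the index, you can verify that a local ElasticSearch instance is reachable. The sketch below is not part of the repository; it assumes the default host and port (localhost:9200) and the elasticsearch Python client:

# Minimal connectivity check for a local ElasticSearch 5.x instance.
# Assumes the default address localhost:9200; adjust it if your cluster runs elsewhere.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])
if es.ping():
    print("Connected to ElasticSearch", es.info()["version"]["number"])
else:
    print("Cannot reach ElasticSearch at localhost:9200")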
After creating the index, you can perform the search using the following commands:
$ cd ./code/SetRank
$ python3 setRank_TREC.py -query ../../data/S2-CS/s2_query.json -output ../../results/s2/setRank.run
The results will then be saved in "../../results/s2/setRank.run".
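You can also inspect the run file directly. The following sketch is not part of the repository; it assumes the file follows the standard six-column TREC run format (query_id, Q0, doc_id, rank, score, run_name) used by trec_eval-style tools:

# Print the top-3 ranked documents per query from a TREC-format run file.
# Assumes the standard six columns: query_id Q0 doc_id rank score run_name.
from collections import defaultdict

top_docs = defaultdict(list)
with open("../../results/s2/setRank.run") as fin:
    for line in fin:
        qid, _, doc_id, rank, score, _ = line.split()
        if int(rank) <= 3:
            top_docs[qid].append((int(rank), doc_id, float(score)))

for qid, docs in sorted(top_docs.items()):
    print(qid, sorted(docs))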
The ./pytrec_eval/ folder includes the original evaluation toolkit pytrec_eval and our customized scripts for performing model evaluation.
First follow the instructions in ./pytrec_eval/README.md to install this package, and then run the model evaluation using the following commands:
$ cd ./pytrec_eval/examples
$ ./eval.sh ../../results/s2/setRank.run setRank  ## the first argument is the path to the run file; the second names the saved evaluation result files
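Alternatively, you can call pytrec_eval directly from Python. The sketch below is not part of the repository, and the qrel path is a placeholder for the benchmark's relevance-judgment file:

# Evaluate a TREC-format run file against a qrel file with pytrec_eval.
import pytrec_eval

def read_qrel(path):
    # qrel format: query_id iteration doc_id relevance
    qrel = {}
    with open(path) as fin:
        for line in fin:
            qid, _, doc_id, rel = line.split()
            qrel.setdefault(qid, {})[doc_id] = int(rel)
    return qrel

def read_run(path):
    # run format: query_id Q0 doc_id rank score run_name
    run = {}
    with open(path) as fin:
        for line in fin:
            qid, _, doc_id, _, score, _ = line.split()
            run.setdefault(qid, {})[doc_id] = float(score)
    return run

qrel = read_qrel("path/to/qrels.txt")  # placeholder: the benchmark's relevance judgments
run = read_run("../../results/s2/setRank.run")
evaluator = pytrec_eval.RelevanceEvaluator(qrel, {"map_cut", "ndcg_cut"})
results = evaluator.evaluate(run)  # per-query metric values
for qid, metrics in sorted(results.items()):
    print(qid, metrics["ndcg_cut_20"])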
The ./results/ folder includes all the experiment results reported in our paper. Specifically, each file with suffix .run is a model output ranking file, and each file with suffix .eval.tsv contains the query-specific evaluation results. Note that in the paper we only report NDCG@{5,10,15,20}, while here we also release results for additional metrics, including MAP@{5,10,15,20} and success@{1,5,10}.
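For reference, NDCG@k (the main metric reported in the paper) can be computed from the graded relevance labels of a ranked list as in the sketch below. This is one standard formulation, not code from the repository; trec_eval's ndcg_cut may use a slightly different gain/discount convention:

import math

def dcg_at_k(relevances, k):
    # DCG@k = sum over the top-k positions of rel_i / log2(i + 1), with i starting at 1.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # Normalize by the DCG of the ideal (relevance-sorted) ranking.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Example: graded labels of the top-ranked documents for one query.
print(ndcg_at_k([3, 2, 0, 1, 2], k=5))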
If you use the datasets or the model implementation code, please cite our SIGIR paper:
@inproceedings{JiamingShen2018ess,
title={Entity Set Search of Scientific Literature: An Unsupervised Ranking Approach},
author={Shen, Jiaming and Xiao, Jinfeng and He, Xinwei and Shang, Jingbo and Sinha, Saurabh and Han, Jiawei},
publisher={ACM},
booktitle={SIGIR},
year={2018},
}
Furthermore, if you use the pytrec_eval toolkit, please also consider citing the original paper:
@inproceedings{VanGysel2018pytreceval,
title={Pytrec\_eval: An Extremely Fast Python Interface to trec\_eval},
author={Van Gysel, Christophe and de Rijke, Maarten},
publisher={ACM},
booktitle={SIGIR},
year={2018},
}