diff --git a/README.md b/README.md
index 5e33e3c..ae1b49c 100644
--- a/README.md
+++ b/README.md
@@ -20,6 +20,15 @@ This implementation refers to the project structure of [mulrel-nel](https://gith
 ## Data
 Download [data](https://drive.google.com/file/d/1xW-t80cKDMx3ZL-hrRUxlm6QMZIRvUyU/view) from here and unzip to the main folder (i.e. your-path/DCA).
 
+This data archive contains the following resource files:
+
+- **Dataset**: One in-domain dataset (AIDA-CoNLL) and five cross-domain datasets (MSNBC/AQUAINT/ACE2004/CWEB/WIKI). These datasets share the same data format.
+
+- **Type Embedding**: Used to compute type similarity between mention-entity pairs. These type embeddings are trained by a typing system called [NFETC](https://arxiv.org/abs/1803.03378).
+
+- **Wikipedia inLinks and outLinks**: Surface names of the inlinks and outlinks of a Wikipedia page (entity) are used to construct the **dynamic context**.
+
+
 ## Installation
 
 Requirements: Python 3.5 or 3.6, Pytorch 0.3, CUDA 7.5 or 8