Skip to content

Latest commit

 

History

History
114 lines (81 loc) · 3.69 KB

README.md

File metadata and controls

114 lines (81 loc) · 3.69 KB

Mapping natural language commands to web elements

Data

Due to its large size, the data is hosted outside Github: https://nlp.stanford.edu/projects/phrasenode/

You can download the dataset by running the script as follows.

bash download_dataset.sh

Setup

The code was developed in the following environment:

  • Python 2.7
  • Pytorch 0.4.1
  • CUDA 9

To install dependencies:

  • (Optional) Create a virtualenv / conda environment

    virtualenv.py -p python2.7 env
    source env/bin/activate
    
  • Python dependencies

    sudo apt-get install python-dev
    pip install -r requirements.txt
    

Alternatively, use the docker image ppasupat/phrasenode For latest image: docker pull ppasupat/phrasenode:1.06

Quick start

If you just want to see something happen:

export WEBREP_DATA=./data
./main.py configs/base.txt configs/model/encoding.txt configs/node-embedder/allan.txt -n testrun
  • This executes the main entrypoint script main.py with three config files.
  • The config files are specified in HOCON format. If multiple files are specified, they will be merged together (with later configs overwriting values from previous ones).
  • You can also add ad-hoc config strings using the -c option. These are applied last.
  • -n specifies the experiment directory name.

Configurations

Here are the configerations used in the final experiments:

  • base.txt: Used in all experiments
  • models/encoding.txt: The embedding-based method
  • models/alignment.txt: The alignment-based method
  • node-embedder/allan.txt: The node embedder as described in the paper
  • ablation/*.txt: Ablation

Note that the visual neighbor is off by default. To turn it on, use general/neighbors.txt.

Experiment management

All training runs are managed by the PhraseNodeTrainingRuns object. For example, to get training run #141, do this:

runs = PhraseNodeTrainingRuns()   # Note the final "s"
run = runs[141]  # a PhraseNodeTrainingRun object

A TrainingRun is responsible for constructing a model, training it, saving it and reloading it (see superclasses gtd.ml.TrainingRun and gtd.ml.TorchTrainingRun for details.)

The most important methods on PhraseNodeTrainingRun are:

  • __init__: the model, data storage, etc, are initialized
  • train: actual training of the model happens here

TensorBoard

Statistics are logged to TensorBoard. To view:

tensorboard --logdir=data/experiments

Demo Chrome extension

  • Start the server with

    export WEBREP_DATA=./data
    ./server.py data/experiments/0_testrun/config.txt -m data/experiments/0_testrun/checkpoints/20000.checkpoint/model
    

    where 0_testrun should be changed to the model's directory, and 20000 should be changed to the checkpoint number you want.

  • Install the unpacked Chrome extension in demo/phrasenode-demo

    • Follow the instruction from here to load the extension. (The manifest file should already be in demo/phrasenode-demo)
    • An extension button should now show up on the toolbar
    • On any web page, click on the extension button and enter a phrase in the prompt that pops up.
    • If the server does not throw an error, the selected element should be highlighted with a red border. Details can be viewed in the developer console.

Referenece

Panupong Pasupat, Tian-Shun Jiang, Evan Liu, Kelvin Guu, Percy Liang.
Mapping natural language commands to web elements.
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.

CodaLab: https://worksheets.codalab.org/worksheets/0x0097f249cd944284a81af331093c3579/