The dialogue bot is based on [1], which introduces Hybrid Code Networks (HCNs) that combine an RNN with domain-specific knowledge and system action templates.
Diagram 1. Dotted lines correspond to optional modules, black squares to trained models, and trapezoids to modules that depend on a dataset and must be provided by the software developer.
Here is a simple example of interaction with a trained dialogue bot (it can be downloaded with `deeppavlov/download.py`):
>> Hello, welcome to the Phystech restaurant system. You can ask for restaurants by area, price range or food type. How may I help you?
:: hi, i want some food
>> What part of town do you have in mind?
:: north of Dolgoprudny
>> Would you like something in the cheap, moderate, or expensive price range?
:: something chep, please
>> "Time to eat" is a nice place in the north of town and the prices are cheap.
:: what kind of food does it serve?
>> "Time to eat" serves russian food.
:: i like that! give me their address then
>> Sure, "Time to eat" is on Sobina Square, 1/4.
:: and phone number. In case i wanted anything for delivery.
>> The phone number of "Time to eat" is +7(965)173-37-33.
:: I won't need that, but give me their post code too.
>> The post code of "Time to eat" is 141700.
:: nice, thanks!
>> "Time to eat" is a nice place in the centre of town serving tasty creative food.
:: good bye
>> You are welcome!
:: exit
To use a go_bot model you should have:
- (optional, but recommended) a pretrained named entity recognition (NER) model
  - the config `deeppavlov/configs/ner/ner_dstc2.json` is recommended
- (optional, but recommended) a pretrained intent classifier model
  - the config `deeppavlov/configs/intents/intents_dstc2.json` is recommended
- (optional) downloaded English fastText embeddings trained on wiki (https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.en.zip)
  - fastText embeddings can be loaded via `python3 deeppavlov/download.py --all`
  - you can use any English embeddings of your choice, but edit the go_bot config accordingly
- a pretrained vocabulary of dataset utterance tokens
  - it can be trained alongside the go_bot model
- the pretrained goal-oriented bot model itself
  - the config `deeppavlov/configs/go_bot/gobot_dstc2.json` is recommended
  - the `slot_filler` section of go_bot's config should match the NER's configuration
  - the `intent_classifier` section of go_bot's config should match the classifier's configuration
  - double-check that the corresponding `load_path`s point to the NER and intent classifier model files
A go_bot config contains the following parameters:

- `name` — always equals `"go_bot"`
- `template_path` — map from actions to text templates for response generation
- `use_action_mask` — if `true`, an action mask is applied to the network output
- `word_vocab` — vocabulary of tokens from context utterances
  - `name` — `"default_vocab"` (for the vocabulary's implementation see `deeppavlov.core.data.vocab`)
  - `level` — `"token"`
  - `tokenize` — `true`
  - `save_path` — `"vocabs/token.dict"`
  - `load_path` — `"vocabs/token.dict"`
- `tokenizer` — one of the tokenizers from the `deeppavlov.models.tokenizers` module
  - `name` — tokenizer name
  - other arguments specific to your tokenizer
- `bow_encoder` — one of the bag-of-words encoders from the `deeppavlov.models.encoders.bow` module
  - `name` — encoder name
  - other arguments specific to your encoder
- `embedder` — one of the embedders from `deeppavlov.models.embedders`
  - `name` — embedder name (`"fasttext"` recommended, see `deeppavlov.models.embedders.fasttext_embedder`)
  - `mean` — must be set to `true`
  - other arguments specific to your embedder
- `tracker` — dialogue state tracker from `deeppavlov.models.trackers`
  - `name` — tracker name (`"default_tracker"` or `"featurized_tracker"` recommended)
  - `slot_vals` — list of slots that should be tracked
- `network` — recurrent network that handles dialogue policy management
  - `name` — `"go_bot_rnn"`
  - `save_path` — name of the file that the model will be saved to
  - `load_path` — name of the file that the model will be loaded from
  - `learning_rate` — learning rate during training
  - `dropout_rate` — rate for the dropout layer applied to input features
  - `hidden_dim` — hidden state dimension
  - `dense_size` — LSTM input size
  - `obs_size` — input feature size (must be set to the number of `bow_embedder` features, `embedder` features and `intent_classifier` features, plus context features (=2), plus `tracker` state size, plus action size)
  - `action_size` — output action size
- `slot_filler` — model that predicts slot values for a given utterance
  - `name` — slot filler name (`"dstc_slotfilling"` recommended, for implementation see `deeppavlov.models.ner`)
  - other slot filler arguments
- `intent_classifier` — model that outputs an intent probability distribution for a given utterance
  - `name` — intent classifier name (`"intent_model"` recommended, for implementation see `deeppavlov.models.classifiers.intents`)
  - other classifier arguments
- `debug` — whether to display debug output (defaults to `false`) (optional)
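To make the layout concrete, here is a minimal sketch of a go_bot section written as a Python dict (equivalent to what `read_json` would return for a JSON config). Every value not listed above (the tokenizer and encoder names, all paths and numeric values) is a placeholder assumption, not a value taken from a shipped config.

```python
# Illustrative go_bot section as a Python dict; all paths, component names and
# numbers below are placeholders, not values from an official DeepPavlov config.
go_bot_section = {
    "name": "go_bot",
    "template_path": "dstc2-templates.txt",        # placeholder: action -> response template map
    "use_action_mask": False,
    "word_vocab": {
        "name": "default_vocab",
        "level": "token",
        "tokenize": True,
        "save_path": "vocabs/token.dict",
        "load_path": "vocabs/token.dict",
    },
    "tokenizer": {"name": "your_tokenizer"},       # assumption: any registered tokenizer
    "bow_encoder": {"name": "your_bow_encoder"},   # assumption: any registered BoW encoder
    "tracker": {
        "name": "featurized_tracker",
        "slot_vals": ["area", "food", "pricerange"],   # example slots to track
    },
    "network": {
        "name": "go_bot_rnn",
        "save_path": "go_bot/model",
        "load_path": "go_bot/model",
        "learning_rate": 0.002,
        "dropout_rate": 0.1,
        "hidden_dim": 128,
        "dense_size": 64,
        "obs_size": 530,   # sum of bow, embedder, intent, 2 context, tracker and action sizes
        "action_size": 45,
    },
    "debug": False,
}
```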
For a working exemplary config see `deeppavlov/configs/go_bot/gobot_dstc2.json` (model without embeddings).
A minimal model without `slot_filler`, `intent_classifier` and `embedder` is configured in `deeppavlov/configs/go_bot/gobot_dstc2_minimal.json`.
A full model (with fastText embeddings) is configured in `deeppavlov/configs/go_bot/gobot_dstc2_all.json`.
- To infer from a pretrained model with config path equal to `path/to/config.json`:

  ```python
  from deeppavlov.core.commands.infer import build_model_from_config
  from deeppavlov.core.common.file import read_json

  CONFIG_PATH = 'path/to/config.json'
  model = build_model_from_config(read_json(CONFIG_PATH))

  # the first call with an empty utterance returns the bot's greeting
  utterance = ""
  while utterance != 'exit':
      print(">> " + model([utterance])[0])
      utterance = input(':: ')
  ```
- To interact via the command line use the `deeppavlov/deep.py` script:

  ```bash
  cd deeppavlov
  python3 deep.py interact path/to/config.json
  ```
To be used for training, your config json file should include the following parameters:

- `dataset_reader`
  - `name` — `"your_reader_here"` for a custom dataset, or `"dstc2_datasetreader"` to use DSTC2 (for implementation see `deeppavlov.dataset_readers.dstc2_dataset_reader`)
  - `data_path` — a path to the dataset file; in the case of `"dstc2_datasetreader"` the dataset will be automatically downloaded from the internet and placed into the `data_path` directory
- `dataset` — it should always be set to `{"name": "dialog_dataset"}` (for implementation see `deeppavlov.datasets.dialog_dataset`)
See `deeppavlov/configs/go_bot/gobot_dstc2.json` for details.
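For illustration, these two sections could look like the following Python dict (a sketch only; the `data_path` value is a placeholder directory name, not an official path):

```python
# Sketch of the training-related config sections as a Python dict.
training_sections = {
    "dataset_reader": {
        "name": "dstc2_datasetreader",   # or "your_reader_here" for a custom dataset
        "data_path": "dstc2",            # placeholder; downloaded automatically for the DSTC2 reader
    },
    "dataset": {
        "name": "dialog_dataset",
    },
}
```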
The easiest way to run the training is by using the `deeppavlov/deep.py` script:

```bash
cd deeppavlov
python3 deep.py train path/to/config.json
```
The Hybrid Code Network model was trained and evaluated on a modification of a dataset from the Dialogue State Tracking Challenge 2 [2]. The modifications were as follows:

- new actions
  - bot dialog acts were concatenated into one action (example: `{"dialog_acts": ["ask", "request"]}` -> `{"dialog_acts": ["ask_request"]}`)
  - if a slot key was associated with the dialog act, the new act was a concatenation of the act and the slot key (example: `{"dialog_acts": ["ask"], "slot_vals": ["area"]}` -> `{"dialog_acts": ["ask_area"]}`)
- new train/dev/test split
  - the original DSTC2 consisted of three different MDP policies; the original train and dev datasets (covering two policies) were merged and randomly split into train/dev/test
- minor fixes
  - fixed several dialogs where actions were wrongly annotated
  - uppercased the first letter of bot responses
  - unified punctuation of bot responses
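The act concatenation above is a simple string join. A minimal sketch of how such a transform might be written is shown below; the function name and exact key handling are assumptions for illustration, not the project's actual preprocessing code.

```python
# Hypothetical helper illustrating the act-concatenation described above.
def merge_dialog_acts(turn: dict) -> dict:
    """Collapse several dialog acts (and an optional slot key) into one act string."""
    acts = turn.get("dialog_acts", [])
    slots = turn.get("slot_vals", [])
    if slots:
        # e.g. {"dialog_acts": ["ask"], "slot_vals": ["area"]} -> {"dialog_acts": ["ask_area"]}
        merged = "_".join(acts + slots[:1])
    else:
        # e.g. {"dialog_acts": ["ask", "request"]} -> {"dialog_acts": ["ask_request"]}
        merged = "_".join(acts)
    return {"dialog_acts": [merged]}
```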
If your model uses DSTC2 and relies on the `dstc2_datasetreader` `DatasetReader`, all needed files, if not present in the `dataset_reader.data_path` directory, will be downloaded from the internet.
If your model needs to be trained on different data, you have several ways of achieving that (sorted by increasing amount of code):

1. Use `"dialog_dataset"` in the dataset config section and `"dstc2_datasetreader"` in the dataset reader config section (the simplest, but not the best way):
   - set `dataset.data_path` to your data directory;
   - your data files should have the same format as expected by the `deeppavlov.dataset_readers.dstc2_dataset_reader:DSTC2DatasetReader.read()` function.

2. Use `"dialog_dataset"` in the dataset config section and `"your_dataset_reader"` in the dataset reader config section (recommended):
   - clone `deeppavlov.dataset_readers.dstc2_dataset_reader:DSTC2DatasetReader` to `YourDatasetReader`;
   - register it as `"your_dataset_reader"`;
   - rewrite it so that it implements the same interface as the original. In particular, `YourDatasetReader.read()` must have the same output as `DSTC2DatasetReader.read()` (a sketch of such a reader is given after this list):
     - `train` — training dialog turns consisting of tuples:
       - the first tuple element contains the first user's utterance info:
         - `text` — utterance string
         - `intents` — list of string intents associated with the user's utterance
         - `db_result` — a database response (optional)
         - `episode_done` — set to `true` if the current utterance is the start of a new dialog, and `false` (or skipped) otherwise (optional)
       - the second tuple element contains the second user's response info:
         - `text` — utterance string
         - `act` — an act associated with the user's utterance
     - `valid` — validation dialog turns in the same format
     - `test` — test dialog turns in the same format

   #TODO: change str `act` to a list of acts

3. Use your own dataset and dataset reader (if 2. doesn't work for you):
   - your `YourDataset.iter()` class method output should match the input format of `HybridCodeNetworkBot.train()`.
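Below is a minimal sketch of what a custom reader for option 2 might look like. The class name, file layout and helper methods are assumptions for illustration; only the shape of the returned dictionary (`train`/`valid`/`test` lists of (user turn, system turn) tuples) follows the format described above, and registration as `"your_dataset_reader"` still has to be done through the DeepPavlov registry.

```python
import json
from pathlib import Path

# Hypothetical custom reader; names and file layout are illustrative only.
class YourDatasetReader:

    @staticmethod
    def read(data_path: str) -> dict:
        data_path = Path(data_path)
        return {
            "train": YourDatasetReader._read_dialogs(data_path / "train.json"),
            "valid": YourDatasetReader._read_dialogs(data_path / "valid.json"),
            "test": YourDatasetReader._read_dialogs(data_path / "test.json"),
        }

    @staticmethod
    def _read_dialogs(path: Path) -> list:
        """Return a list of (user_turn, system_turn) tuples in the documented format."""
        turns = []
        dialogs = json.loads(path.read_text())   # assumed file format: a list of dialogs
        for dialog in dialogs:
            for i, (user, system) in enumerate(dialog):
                user_turn = {
                    "text": user["text"],
                    "intents": user.get("intents", []),
                    "episode_done": i == 0,       # True only for the first turn of a dialog
                }
                if "db_result" in user:
                    user_turn["db_result"] = user["db_result"]
                system_turn = {
                    "text": system["text"],
                    "act": system["act"],
                }
                turns.append((user_turn, system_turn))
        return turns
```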
Since our dataset is a modified version of the official DSTC2 dataset [2], the resulting metrics cannot be compared with evaluations on the original dataset.
However, comparisons of bot model modifications trained on our DSTC2 dataset are presented below:
| Model | Config | Test action accuracy | Test turn accuracy |
|---|---|---|---|
| basic bot | gobot_dstc2_minimal.json | 0.5271 | 0.4853 |
| bot with slot filler & fasttext embeddings | | 0.5305 | 0.5147 |
| bot with slot filler & intents | gobot_dstc2.json | 0.5436 | 0.5261 |
| bot with slot filler & intents & embeddings | gobot_dstc2_all.json | 0.5307 | 0.5145 |
#TODO: add dialog accuracies