PyTorch GPT and Textcat #12346
kbmmoran
started this conversation in
Help: Best practices
Replies: 1 comment
-
The basics of how to specify a different transformer model are covered in #10768.
For any other models, our best advice is to try it out. Not all Hugging Face models will work with spaCy.
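For reference, swapping in a different transformer usually comes down to changing the `name` value in the transformer component's model block of the training config. A minimal sketch, assuming the standard spacy-transformers setup (the `gpt2` value is illustrative; any Hugging Face model name, or a path to a locally saved model directory in Hugging Face format, can go there, though not every model works out of the box):

```ini
[components.transformer]
factory = "transformer"

[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
; Illustrative: a Hugging Face model name or a local path
; to a saved model directory in Hugging Face format.
name = "gpt2"
tokenizer_config = {"use_fast": true}
```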
-
Hi all, sanity check here.
I have a multi-label textcat model, trained using the tok2vec pipe. I would like to improve the model by integrating a custom-trained GPT PyTorch model to power the textcat component. Would this be as simple as registering the architecture and adding the GPT model to the config file?
I know BERT-style models usually perform better on text classification tasks, and RoBERTa is the default transformer model to use with spaCy. However, my understanding of the spaCy architecture is that a GPT model should, in theory, work similarly well to a BERT model in powering the textcat component, since the transformer only powers the component; we are not fine-tuning the transformer model itself to make the predictions.
I apologize if I have misunderstood any of the ideas listed here and appreciate any help/guidance :)
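For context, the wiring I have in mind follows the standard spacy-transformers listener pattern, where the textcat's tok2vec slot listens to the shared transformer component rather than fine-tuning its own copy. A hedged sketch of that config (the sub-block values here are the usual defaults, not something specific to my setup):

```ini
[components.textcat_multilabel]
factory = "textcat_multilabel"

[components.textcat_multilabel.model]
@architectures = "spacy.TextCatEnsemble.v2"
nO = null

[components.textcat_multilabel.model.linear_model]
@architectures = "spacy.TextCatBOW.v2"
exclusive_classes = false
ngram_size = 1
no_output_layer = false

[components.textcat_multilabel.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0

[components.textcat_multilabel.model.tok2vec.pooling]
@layers = "reduce_mean.v1"
```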