
Update per token reward #25

Merged: 11 commits from update/per-token-reward into main on Feb 20, 2024
Conversation

@ljvmiranda921 (Member) commented on Feb 14, 2024

Closes #22

Kinda big PR that adds two new scripts:

  • get_per_token_reward computes the per-substring reward and stores the results in a directory. The name of the directory is the hash of the input text.
  • draw_per_token_reward accepts a text hash and draws the visualization shown below. I combined everything into a single heatmap so that the spacing is easier to manage. Note that the colorbar spacing might look different as we add new models.

(heatmap visualization of per-token rewards attached to the PR)

How this works

python -m analysis.per_token_reward "hi there general kenobi" --model IDEA-CCNL/Ziya-LLaMA-7B-Reward

Say the hashed value of the text "hi there general kenobi" is 71b1349c9b. The script will then create a directory 71b1349c9b/ containing a file named after the combined hash of the model name and chat_template:

71b1349c9b/
    59761c2c8a.json
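
For reference, here is a minimal sketch of one way the short hashes could be produced; it assumes a truncated SHA-256 hex digest, and the actual script may use a different hash function or digest length:

# Hypothetical sketch (not copied from the script): short hashes as truncated
# SHA-256 hex digests of the text and of the model name + chat template.
import hashlib
from pathlib import Path


def short_hash(s: str, length: int = 10) -> str:
    """Stable short hex digest for a string."""
    return hashlib.sha256(s.encode("utf-8")).hexdigest()[:length]


text = "hi there general kenobi"
model = "IDEA-CCNL/Ziya-LLaMA-7B-Reward"
chat_template = "tulu"

text_hash = short_hash(text)                         # used as the directory name
model_chat_hash = short_hash(model + chat_template)  # used as the file name
print(Path(text_hash) / f"{model_chat_hash}.json")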

And then the JSON file looks like this:

{
    "text": "hi there general kenobi",
    "text_hash": "71b1349c9b",
    "model": "IDEA-CCNL/Ziya-LLaMA-7B-Reward",
    "chat_template": "tulu",
    "model_chat_hash": "59761c2c8a",
    "substrings": [
        "hi",
        "hi there",
        "hi there general",
        "hi there general k",
        "hi there general ken",
        "hi there general kenobi"
    ],
    "tokens": [
        "hi",
        "there",
        "general",
        "k",
        "en",
        "obi"
    ],
    "rewards": [
        -1.9443359375,
        -1.1064453125,
        0.26318359375,
        1.2578125,
        1.208984375,
        3.24609375
    ]
}
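
Conceptually, get_per_token_reward scores every cumulative prefix of the tokenized text with the reward model and saves the record as JSON. The sketch below is illustrative only: it uses a plain Hugging Face text-classification pipeline and its post-processed score as a stand-in for the reward, whereas the actual script presumably goes through the model/pipeline builders in the config and applies the chat template.

# Illustrative sketch only (not the script's actual code).
import json
from pathlib import Path

from transformers import pipeline

text = "hi there general kenobi"
reward_pipe = pipeline(
    "text-classification",
    model="OpenAssistant/reward-model-deberta-v3-base",  # stand-in reward model
)

# Tokenize once, then score each cumulative prefix of the token sequence.
token_ids = reward_pipe.tokenizer(text, add_special_tokens=False)["input_ids"]
tokens = reward_pipe.tokenizer.convert_ids_to_tokens(token_ids)
substrings = [
    reward_pipe.tokenizer.decode(token_ids[: i + 1]) for i in range(len(token_ids))
]
rewards = [output["score"] for output in reward_pipe(substrings)]

record = {"text": text, "substrings": substrings, "tokens": tokens, "rewards": rewards}
out_dir = Path("71b1349c9b")  # text hash from the example above
out_dir.mkdir(exist_ok=True)
(out_dir / "59761c2c8a.json").write_text(json.dumps(record, indent=4))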

Ideally we store these outputs in a shared location, so that we end up with a directory per text prompt (and the model rewards as JSON inside it). To create the visualization, we use the draw_per_token_reward script, passing the text hash and a path to the output PNG file:

python -m analysis.draw_per_token_reward 71b1349c9b test.png
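
Roughly, draw_per_token_reward reads every per-model JSON file under the text-hash directory and renders a single heatmap (models on the y-axis, tokens on the x-axis). The sketch below is a simplification that assumes matplotlib and that all records share the same token boundaries; the real script also has to deal with models whose tokenizers split the text differently.

# Simplified sketch of the drawing step, not the script's actual code.
import json
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np

hash_dir = Path("71b1349c9b")
records = [json.loads(p.read_text()) for p in sorted(hash_dir.glob("*.json"))]

tokens = records[0]["tokens"]
models = [r["model"] for r in records]
rewards = np.array([r["rewards"] for r in records])  # shape: (n_models, n_tokens)

fig, ax = plt.subplots(figsize=(1.2 * len(tokens), 0.6 * len(models) + 2))
im = ax.imshow(rewards, aspect="auto", cmap="viridis")
ax.set_xticks(range(len(tokens)))
ax.set_xticklabels(tokens, rotation=45, ha="right")
ax.set_yticks(range(len(models)))
ax.set_yticklabels(models)
fig.colorbar(im, ax=ax, label="reward")
fig.tight_layout()
fig.savefig("test.png", dpi=200)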

@ljvmiranda921 self-assigned this on Feb 14, 2024
@ljvmiranda921 force-pushed the update/per-token-reward branch from 0818e8f to 4fb0368 on February 19, 2024 19:30
@ljvmiranda921 force-pushed the update/per-token-reward branch from b9ce5b7 to 0ee01f4 on February 19, 2024 21:50
@ljvmiranda921 force-pushed the update/per-token-reward branch from 34e0bb3 to bd02e5f on February 20, 2024 01:49
@ljvmiranda921 changed the title from "WIP: Update per token reward" to "Update per token reward" on Feb 20, 2024
@ljvmiranda921 marked this pull request as ready for review on February 20, 2024 02:01
Comment on lines +47 to +59
"oasst": {
"model_builder": AutoModelForSequenceClassification.from_pretrained,
"pipeline_builder": pipeline,
"quantized": True,
"custom_dialogue": False,
"models": [
"OpenAssistant/oasst-rm-2-pythia-6.9b-epoch-1",
"OpenAssistant/oasst-rm-2.1-pythia-1.4b-epoch-2.5",
"OpenAssistant/reward-model-deberta-v3-base",
"OpenAssistant/reward-model-deberta-v3-large",
"OpenAssistant/reward-model-deberta-v3-large-v2",
"OpenAssistant/reward-model-electra-large-discriminator",
],
@ljvmiranda921 (Member Author) commented:
Not sure if this is the best approach. I manually listed the possible reward models per "group." I feel like the initial approach that looks for substrings might introduce some hidden errors. With this approach, the script only errors out if a model is not added to models.

A collaborator replied:
Yeah, tokenizers gonna tokenizer. If you're happy with it, it's okay for now.
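
For illustration, this is roughly how the explicit model list described above could be consumed: look up the group whose models list contains the requested model, and raise if nothing matches. REWARD_MODEL_CONFIG is a stand-in name for the dict shown in the diff, with only one group reproduced.

# Sketch of consuming the per-group config (names are illustrative).
from transformers import AutoModelForSequenceClassification, pipeline

REWARD_MODEL_CONFIG = {
    "oasst": {
        "model_builder": AutoModelForSequenceClassification.from_pretrained,
        "pipeline_builder": pipeline,
        "quantized": True,
        "custom_dialogue": False,
        "models": [
            "OpenAssistant/reward-model-deberta-v3-base",
            # ... remaining OpenAssistant reward models ...
        ],
    },
    # ... other groups ...
}


def get_model_config(model_name: str) -> dict:
    """Return the group config whose explicit model list contains model_name."""
    for config in REWARD_MODEL_CONFIG.values():
        if model_name in config["models"]:
            return config
    # The "only error out if it's not added to models" behavior described above.
    raise ValueError(f"{model_name} is not registered in any model group")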

@natolambert (Collaborator) left a comment:

Some general notes as I go through:

  1. I also think a line plot could be cool. We can iterate on that if anyone is interested in the topic further :)
  2. What happens with Per token multiple rms (#29), did y'all coordinate on this?
  3. Regarding the tokenizer, if it's a line plot we could just plot per character (with some tokens spanning multiple ticks on the x axis), which may make some of the code less painful.
  4. The model config stuff can maybe be a separate PR. We can just expand it after this is merged.
  5. I'm a little wary of the spacy-alignments package. It's not widely used, and I'm not convinced we need to go in that direction. Lmk what you think.

Anyways, approving if you feel strongly about it, but we'll prolly discuss more before merging.

@ljvmiranda921 (Member Author) replied:
Thanks for the review!

  1. Sure, I can try a line chart, but I might only be able to start working on it in the latter half of this week 👍
  2. Ah yeah, I used it as a reference for this PR. I think we can close Per token multiple rms (#29) once this PR is merged, any thoughts @khyathiraghavi?
  3. Sure! Same as Multiple styles of computing reward with DPO (#1).
  4. Hmm, right now it's tightly coupled, so it might be hard to separate them. We can merge this first, and then I can make another PR that separates them (or maybe that's what you meant in the first place?).
  5. I initially wrote a naive "if substring in token" check but encountered many edge cases. I looked around and found that spacy-alignments is the most robust option so far (a small alignment sketch is below). If you're still wary, we can put the dependency in requirements.txt since spacy-alignments is not used in the core training scripts.
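
For context on point 5, this is the kind of alignment the package provides, assuming its get_alignments helper; it is exactly the case a naive substring check struggles with when a word gets split into sub-word pieces. The example tokens are made up.

# Illustrative only: aligning whitespace tokens to model tokens.
from spacy_alignments import get_alignments

whitespace_tokens = ["hi", "there", "general", "kenobi"]
model_tokens = ["hi", "there", "general", "k", "en", "obi"]

ws2model, model2ws = get_alignments(whitespace_tokens, model_tokens)
print(ws2model)  # [[0], [1], [2], [3, 4, 5]]  ("kenobi" -> "k", "en", "obi")
print(model2ws)  # [[0], [1], [2], [3], [3], [3]]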

@khyathiraghavi commented:
Yes, we coordinated, and @ljvmiranda921 merged my PR https://github.com/allenai/herm/tree/per-token-multiple-rms into the current version; it looks good to me!
In the future we could also consider a good RM as a base, use its tokenizer in place of whitespace, and compare the rest against it -- the current version looks good to me for now.

@khyathiraghavi merged commit e1ed3c5 into main on Feb 20, 2024
3 checks passed
@ljvmiranda921 deleted the update/per-token-reward branch on February 20, 2024 23:08