
Implementing RWKV-LLM (#37) #209

Draft
wants to merge 64 commits into main

Conversation

@AvidEslami (Collaborator) commented Jun 1, 2023

This pull request implements RWKV's RNN model using Modal. (Issue #37) (Reference)

Summary: Created RWKV.py (it will be moved to openadapt/strategies/mixins and restructured to accept inputs).

Run it using the following commands (this will change once it is implemented as a mixin):

modal token new
modal run .\openadapt\RWKV\RWKV.py

Implementation:

  • Starts a Modal application; gpu must be set to a100 in order to run Raven-14B (the largest model)
  • Downloads the weights for the desired model from Hugging Face to the Modal server
  • Computes and returns the output

TODO:

  • Change structure to function as a mixin (take inputs such as prompts or task descriptions)
  • Allow parameters (Temperature, Top P, Presence Penalty, etc.) to be modified via the config file
  • Deploy the Modal application permanently to avoid startup time (on startup the weights must be downloaded; to avoid this we can deploy the Modal app externally and make requests to it instead)
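To make the config-file TODO concrete, here is a minimal sketch of how the three sampling parameters named above could be applied to a model's logits. The function name, token representation, and defaults are assumptions for illustration, not the PR's actual code; a real sampler would also draw randomly from the nucleus rather than picking greedily.

```python
# Hypothetical sketch of applying Temperature, Top P, and Presence Penalty
# to a logit distribution; names and defaults are assumptions, not the
# actual RWKV.py implementation.
import math
from collections import Counter

def sample_logits(logits, generated, temperature=1.0, top_p=0.9,
                  presence_penalty=0.2):
    """Pick the highest-probability token after applying the three knobs.

    logits: dict mapping token -> raw logit
    generated: list of tokens emitted so far (for the presence penalty)
    """
    counts = Counter(generated)
    # Presence penalty: subtract a flat penalty from any token already seen.
    adjusted = {tok: logit - (presence_penalty if counts[tok] else 0.0)
                for tok, logit in logits.items()}
    # Temperature: scale logits before the softmax (lower = sharper).
    scaled = {tok: logit / temperature for tok, logit in adjusted.items()}
    # Numerically stable softmax.
    m = max(scaled.values())
    exps = {tok: math.exp(v - m) for tok, v in scaled.items()}
    z = sum(exps.values())
    probs = {tok: e / z for tok, e in exps.items()}
    # Top P (nucleus): keep the smallest set of tokens whose mass >= top_p.
    kept, mass = [], 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept.append((tok, p))
        mass += p
        if mass >= top_p:
            break
    # Greedy pick within the nucleus (a real sampler would sample randomly).
    return max(kept, key=lambda kv: kv[1])[0]
```

Reading these three values from a config file would then be a matter of passing them through as keyword arguments.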

@abrichr (Member) commented Jul 4, 2023

As mentioned in Slack, I think it's time to start fine-tuning 🤓

AvidEslami added 24 commits July 7, 2023 13:55
… more epochs), prevent final newline character
…d datasets keeps prompts and outputs separate
@AvidEslami (Collaborator, Author) commented:
Signals Finetune Update:

  • The following sheet contains the outputs generated by the fine-tuned models: link
  • Any advice on the approach is appreciated; there are still several things to try:
    • Using a random number of signals, which could reduce the model's chance of memorizing results
    • Changing the prompt structure, which could improve understanding
    • Creating a more varied dataset (more signals / more tasks) to help the model catch relations
    • Fine-tuning Pile-14B for several epochs (will try); Pile-14B is the non-fine-tuned version of Raven-14B

As usual please let me know if you have any suggestions!
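The commits above mention restructuring the fine-tuning dataset so that prompts and outputs stay separate. A minimal sketch of what that might look like as a JSON Lines file is below; the field names ("prompt", "output") and the file layout are assumptions for illustration, not the PR's actual format.

```python
# Hypothetical prompt/output dataset layout as JSON Lines; field names
# are assumptions, not the PR's actual schema.
import json

def write_dataset(examples, path):
    """Write (prompt, output) pairs as JSON Lines, one record per line."""
    with open(path, "w") as f:
        for prompt, output in examples:
            f.write(json.dumps({"prompt": prompt, "output": output}) + "\n")

def read_dataset(path):
    """Read the pairs back, preserving the prompt/output separation."""
    with open(path) as f:
        return [(rec["prompt"], rec["output"])
                for rec in (json.loads(line) for line in f)]
```

Keeping the two fields separate makes it straightforward to mask the prompt tokens out of the loss during fine-tuning, so the model is only trained to produce the output.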

@AvidEslami AvidEslami marked this pull request as draft February 4, 2024 19:43