Implement aliasable mixin and alias activation ordering (python3.9 fix) #218

Merged
mgoin merged 6 commits into main from kylesayrs/actorder-aliases on Dec 2, 2024

Conversation

kylesayrs
Contributor

Purpose

  • Support "dynamic" and "static" as aliases for activation ordering values in a way that also works on Python 3.9, where the previous aliasing approach caused test failures

Changes

  • Implement an AliasableEnum mixin which allows enum members to be aliased
  • Add AliasableEnum to ActivationOrdering with the following alias map (a sketch of the mixin follows below):
```python
{
    "dynamic": "group",
    "static": "weight",
}
```
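
For context, here is a minimal sketch of what such a mixin could look like. This is illustrative only, not necessarily the merged implementation: the `aliases` classmethod name is an assumption, and the member names `GROUP` and `WEIGHT` are inferred from the alias map. The key idea is resolving aliases through Enum's `_missing_` hook, which behaves consistently on Python 3.9.

```python
from enum import Enum


class AliasableEnum(Enum):
    """Illustrative sketch: resolve alias strings to canonical members."""

    @classmethod
    def aliases(cls) -> dict:
        # Subclasses override this to declare their alias map.
        return {}

    @classmethod
    def _missing_(cls, value):
        # Called when lookup by value fails; retry with the canonical value.
        canonical = cls.aliases().get(value)
        return cls(canonical) if canonical is not None else None


class ActivationOrdering(str, AliasableEnum):
    GROUP = "group"
    WEIGHT = "weight"

    @classmethod
    def aliases(cls) -> dict:
        return {"dynamic": "group", "static": "weight"}


assert ActivationOrdering("dynamic") is ActivationOrdering.GROUP
assert ActivationOrdering("static") is ActivationOrdering.WEIGHT
```

Routing aliases through `_missing_` at lookup time, rather than relying on class-creation-time aliasing, is one plausible way to sidestep behavior that differs across Python versions.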

Testing

  • Added passing tests in tests/test_quantization/test_quant_args.py (a sketch of such a test follows this list)
  • Successfully quantized a model using dynamic actorder and tested end-to-end with vLLM
  • Repeated the above using Python 3.9
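
A test along these lines would assert that aliased strings resolve to canonical members. This is a hypothetical sketch, not the merged test: the member names `GROUP` and `WEIGHT` are assumed from the alias map, and the exact `QuantizationArgs` arguments mirror the script below.

```python
import pytest

from compressed_tensors.quantization import ActivationOrdering, QuantizationArgs


@pytest.mark.parametrize(
    "alias,canonical",
    [
        ("dynamic", ActivationOrdering.GROUP),
        ("static", ActivationOrdering.WEIGHT),
    ],
)
def test_actorder_alias_resolution(alias, canonical):
    # An aliased actorder string should resolve to the canonical
    # ActivationOrdering member when constructing QuantizationArgs.
    args = QuantizationArgs(
        num_bits=4,
        strategy="group",
        group_size=128,
        actorder=alias,
    )
    assert args.actorder == canonical
```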
llama3.py

```python
from accelerate import cpu_offload
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot

# Select model and load it.
# MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
MODEL_ID = "meta-llama/Llama-3.2-1B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="cuda:0",
    torch_dtype="auto",
)
# cpu_offload(model)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Select calibration dataset.
DATASET_ID = "HuggingFaceH4/ultrachat_200k"
DATASET_SPLIT = "train_sft"

# Select number of samples. 512 samples is a good place to start.
# Increasing the number of samples can improve accuracy.
NUM_CALIBRATION_SAMPLES = 285  # 2048
MAX_SEQUENCE_LENGTH = 2048

# Load dataset and preprocess.
ds = load_dataset(DATASET_ID, split=DATASET_SPLIT)
ds = ds.shuffle(seed=42).select(range(NUM_CALIBRATION_SAMPLES))


def preprocess(example):
    return {
        "text": tokenizer.apply_chat_template(
            example["messages"],
            tokenize=False,
        )
    }


ds = ds.map(preprocess)


# Tokenize inputs.
def tokenize(sample):
    return tokenizer(
        sample["text"],
        padding=False,
        max_length=MAX_SEQUENCE_LENGTH,
        truncation=True,
        add_special_tokens=False,
    )


ds = ds.map(tokenize, remove_columns=ds.column_names)

# Configure the quantization algorithm to run.
from compressed_tensors.quantization import (
    QuantizationArgs,
    QuantizationType,
    QuantizationStrategy,
    ActivationOrdering,
    QuantizationScheme,
)

recipe = GPTQModifier(
    targets="Linear",
    config_groups={
        "config_group": QuantizationScheme(
            targets=["Linear"],
            weights=QuantizationArgs(
                num_bits=4,
                type=QuantizationType.INT,
                strategy=QuantizationStrategy.GROUP,
                group_size=128,
                symmetric=True,
                dynamic=False,
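                # "dynamic" resolves to the canonical "group" ordering
                # via the new alias map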
                actorder="dynamic",
            ),
        ),
    },
    ignore=["lm_head"],
    dampening_frac=0.5,
)

# Apply algorithms.
oneshot(
    model=model,
    dataset=ds,
    recipe=recipe,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    num_calibration_samples=NUM_CALIBRATION_SAMPLES,
)

# Confirm generations of the quantized model look sane.
print("\n\n")
print("========== SAMPLE GENERATION ==============")
input_ids = tokenizer("Hello my name is", return_tensors="pt").input_ids.to("cuda")
output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0]))
print("==========================================\n\n")

# Save to disk compressed.
SAVE_DIR = MODEL_ID.split("/")[1] + "-W4A16-G128"
model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs changed the title from "Kylesayrs/actorder aliases" to "Implement aliasable mixin and alias activation ordering (python3.9 fix)" on Dec 2, 2024
@kylesayrs
Contributor Author

I've confirmed that the tests which were previously failing on Python 3.9 now pass.

@mgoin merged commit 2dcbc9d into main on Dec 2, 2024
1 check passed
@mgoin deleted the kylesayrs/actorder-aliases branch on Dec 2, 2024 at 19:46