Readme & test fixes #780

Merged · 4 commits · Jan 18, 2025
6 changes: 3 additions & 3 deletions README.md
@@ -32,7 +32,7 @@ A Unified Library for Parameter-Efficient and Modular Transfer Learning
<a href="https://arxiv.org/abs/2311.11077">Paper</a>
</h3>

![Tests](https://github.com/Adapter-Hub/adapters/workflows/Tests/badge.svg?branch=adapters)
![Tests](https://github.com/Adapter-Hub/adapters/workflows/Tests/badge.svg)
[![GitHub](https://img.shields.io/github/license/adapter-hub/adapters.svg?color=blue)](https://github.com/adapter-hub/adapters/blob/main/LICENSE)
[![PyPI](https://img.shields.io/pypi/v/adapters)](https://pypi.org/project/adapters/)

@@ -45,7 +45,7 @@ _Adapters_ provides a unified interface for efficient fine-tuning and modular tr

## Installation

`adapters` currently supports **Python 3.8+** and **PyTorch 1.10+**.
`adapters` currently supports **Python 3.9+** and **PyTorch 2.0+**.
After [installing PyTorch](https://pytorch.org/get-started/locally/), you can install `adapters` from PyPI ...

```
pip install -U adapters
```

@@ -147,7 +147,7 @@ Currently, adapters integrates all architectures and methods listed below:

| Method | Paper(s) | Quick Links |
| --- | --- | --- |
| Bottleneck adapters | [Houlsby et al. (2019)](https://arxiv.org/pdf/1902.00751.pdf)<br> [Bapna and Firat (2019)](https://arxiv.org/pdf/1909.08478.pdf) | [Quickstart](https://docs.adapterhub.ml/quickstart.html), [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/01_Adapter_Training.ipynb) |
| Bottleneck adapters | [Houlsby et al. (2019)](https://arxiv.org/pdf/1902.00751.pdf)<br> [Bapna and Firat (2019)](https://arxiv.org/pdf/1909.08478.pdf)<br> [Steitz and Roth (2024)](https://openaccess.thecvf.com/content/CVPR2024/papers/Steitz_Adapters_Strike_Back_CVPR_2024_paper.pdf) | [Quickstart](https://docs.adapterhub.ml/quickstart.html), [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/01_Adapter_Training.ipynb) |
| AdapterFusion | [Pfeiffer et al. (2021)](https://aclanthology.org/2021.eacl-main.39.pdf) | [Docs: Training](https://docs.adapterhub.ml/training.html#train-adapterfusion), [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/03_Adapter_Fusion.ipynb) |
| MAD-X,<br> Invertible adapters | [Pfeiffer et al. (2020)](https://aclanthology.org/2020.emnlp-main.617/) | [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/04_Cross_Lingual_Transfer.ipynb) |
| AdapterDrop | [Rücklé et al. (2021)](https://arxiv.org/pdf/2010.11918.pdf) | [Notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/05_Adapter_Drop_Training.ipynb) |
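
For readers skimming this hunk: the bottleneck-adapter row links to the quickstart, which in current `adapters` releases boils down to a few calls that also appear in the tests touched by this PR (`adapters.init`, `add_adapter`, `set_active_adapters`). A minimal sketch, where the model checkpoint, adapter name, and the `seq_bn` config string are illustrative choices rather than anything prescribed by this diff:

```python
# Minimal bottleneck-adapter sketch; names below are illustrative.
from transformers import AutoModel

import adapters

model = AutoModel.from_pretrained("bert-base-uncased")
adapters.init(model)  # attach adapter support to a plain Transformers model

model.add_adapter("my_adapter", config="seq_bn")  # Pfeiffer-style bottleneck adapter
model.set_active_adapters("my_adapter")           # route the forward pass through it
```
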
2 changes: 1 addition & 1 deletion docs/installation.md
@@ -1,7 +1,7 @@
# Installation

The `adapters` package is designed as an add-on for Hugging Face's Transformers library.
It currently supports Python 3.8+ and PyTorch 1.10+. You will have to [install PyTorch](https://pytorch.org/get-started/locally/) first.
It currently supports Python 3.9+ and PyTorch 2.0+. You will have to [install PyTorch](https://pytorch.org/get-started/locally/) first.

```{eval-rst}
.. important::
2 changes: 1 addition & 1 deletion setup.py
@@ -155,7 +155,7 @@ def deps_list(*pkgs):
packages=find_packages("src"),
zip_safe=False,
extras_require=extras,
python_requires=">=3.8.0",
python_requires=">=3.9.0",
install_requires=install_requires,
classifiers=[
"Development Status :: 5 - Production/Stable",
4 changes: 2 additions & 2 deletions src/adapters/loading.py
@@ -160,10 +160,10 @@ def load_weights(
else:
logger.info(f"No safetensors file found in {save_directory}. Falling back to torch.load...")
weights_file = join(save_directory, self.weights_name)
state_dict = torch.load(weights_file, map_location="cpu")
state_dict = torch.load(weights_file, map_location="cpu", weights_only=True)
else:
weights_file = join(save_directory, self.weights_name)
state_dict = torch.load(weights_file, map_location="cpu")
state_dict = torch.load(weights_file, map_location="cpu", weights_only=True)
except Exception:
raise OSError("Unable to load weights from pytorch checkpoint file. ")
logger.info("Loading module weights from {}".format(weights_file))
2 changes: 1 addition & 1 deletion src/adapters/model_mixin.py
@@ -257,7 +257,7 @@ def load_embeddings(self, path: str, name: str):
embedding_path = os.path.join(path, EMBEDDING_FILE)
if not os.path.isfile(embedding_path):
raise FileNotFoundError("No embeddings found at {}".format(embedding_path))
weights = torch.load(embedding_path)
weights = torch.load(embedding_path, weights_only=True)

self.loaded_embeddings[name] = nn.Embedding.from_pretrained(weights)
self.set_active_embeddings(name)
28 changes: 20 additions & 8 deletions src/adapters/trainer.py
@@ -4,21 +4,28 @@

import torch
from torch import nn
from torch.utils.data.dataset import Dataset
from torch.utils.data.dataset import Dataset, IterableDataset

from transformers import PreTrainedModel, Seq2SeqTrainer, Trainer, __version__
from transformers.configuration_utils import PretrainedConfig
from transformers.data.data_collator import DataCollator
from transformers.feature_extraction_utils import FeatureExtractionMixin
from transformers.image_processing_utils import BaseImageProcessor
from transformers.modeling_utils import unwrap_model
from transformers.processing_utils import ProcessorMixin
from transformers.tokenization_utils_base import PreTrainedTokenizerBase
from transformers.trainer_callback import TrainerCallback, TrainerControl, TrainerState
from transformers.trainer_utils import EvalPrediction
from transformers.training_args import TrainingArguments
from transformers.utils import CONFIG_NAME, WEIGHTS_NAME, is_sagemaker_mp_enabled, logging
from transformers.utils import CONFIG_NAME, WEIGHTS_NAME, is_datasets_available, is_sagemaker_mp_enabled, logging

from .composition import AdapterCompositionBlock, Fuse


if is_datasets_available():
import datasets


if is_sagemaker_mp_enabled():
import smdistributed.modelparallel.torch as smp

@@ -32,15 +39,19 @@ def __init__(
model: Union[PreTrainedModel, nn.Module] = None,
args: TrainingArguments = None,
data_collator: Optional[DataCollator] = None,
train_dataset: Optional[Dataset] = None,
eval_dataset: Optional[Dataset] = None,
train_dataset: Optional[Union[Dataset, IterableDataset, "datasets.Dataset"]] = None,
eval_dataset: Optional[Union[Dataset, Dict[str, Dataset], "datasets.Dataset"]] = None,
tokenizer: Optional[PreTrainedTokenizerBase] = None,
model_init: Callable[[], PreTrainedModel] = None,
processing_class: Optional[
Union[PreTrainedTokenizerBase, BaseImageProcessor, FeatureExtractionMixin, ProcessorMixin]
] = None,
model_init: Optional[Callable[[], PreTrainedModel]] = None,
compute_metrics: Optional[Callable[[EvalPrediction], Dict]] = None,
callbacks: Optional[List[TrainerCallback]] = None,
optimizers: Tuple[Optional[torch.optim.Optimizer], Optional[torch.optim.lr_scheduler.LambdaLR]] = (None, None),
preprocess_logits_for_metrics: Optional[Callable[[torch.Tensor, torch.Tensor], torch.Tensor]] = None,
adapter_names: Optional[List[List[str]]] = None,
optimizers: Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LambdaLR] = (None, None),
preprocess_logits_for_metrics: Callable[[torch.Tensor, torch.Tensor], torch.Tensor] = None,
**kwargs,
):
if model is not None:
model_quantized = getattr(model, "is_quantized", False)
@@ -51,12 +62,13 @@
data_collator,
train_dataset,
eval_dataset,
tokenizer=tokenizer,
processing_class=processing_class or tokenizer,
model_init=model_init,
compute_metrics=compute_metrics,
callbacks=[AdapterTrainerCallback(self)] + callbacks if callbacks else [AdapterTrainerCallback(self)],
optimizers=optimizers,
preprocess_logits_for_metrics=preprocess_logits_for_metrics,
**kwargs,
)
if model is not None:
model.is_quantized = model_quantized
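
These constructor changes track newer `transformers` releases, which renamed the `tokenizer` argument to `processing_class` (also accepting image processors, feature extractors, and processors) and widened the dataset types to include `datasets.Dataset` and iterable datasets; the old `tokenizer` keyword is still honored via `processing_class or tokenizer`. A hedged usage sketch with a toy dataset, where the model checkpoint, adapter/head names, and data are all illustrative:

```python
# Sketch: training a single adapter with the updated constructor.
import torch
from torch.utils.data import Dataset
from transformers import AutoTokenizer, TrainingArguments

from adapters import AdapterTrainer, AutoAdapterModel


class ToyDataset(Dataset):
    """Tiny stand-in dataset so the example is self-contained."""

    def __init__(self, tokenizer, n=8):
        self.enc = tokenizer(["hello world"] * n, padding="max_length", max_length=16, return_tensors="pt")
        self.labels = torch.zeros(n, dtype=torch.long)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_classification_head("task", num_labels=2)
model.add_adapter("task")
model.train_adapter("task")  # freeze the base model, train only the adapter weights

args = TrainingArguments(output_dir="./out", max_steps=5, use_cpu=True, remove_unused_columns=False)
trainer = AdapterTrainer(
    model=model,
    args=args,
    train_dataset=ToyDataset(tokenizer),
    processing_class=tokenizer,  # replaces the deprecated `tokenizer=` keyword
)
trainer.train()
```
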
2 changes: 1 addition & 1 deletion tests/composition/test_parallel.py
@@ -214,7 +214,7 @@ def run_parallel_training_test(self, adapter_config, filter_key):
do_train=True,
learning_rate=1.0,
max_steps=20,
no_cuda=True,
use_cpu=True,
remove_unused_columns=False,
)
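
The `no_cuda=True` → `use_cpu=True` replacements here and in the other test files follow the corresponding rename in `transformers.TrainingArguments`, where `no_cuda` is deprecated. A minimal before/after sketch (the output directory is a placeholder):

```python
from transformers import TrainingArguments

# Deprecated spelling:
#   TrainingArguments(output_dir="./out", no_cuda=True)
# Current spelling, forcing CPU even when an accelerator is visible:
args = TrainingArguments(output_dir="./out", use_cpu=True)
```
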

2 changes: 1 addition & 1 deletion tests/extended/test_adapter_trainer_ext.py
@@ -300,7 +300,7 @@ def run_trainer(
--per_device_eval_batch_size 4
--max_eval_samples 8
--val_max_target_length {max_len}
--evaluation_strategy steps
--eval_strategy steps
--eval_steps {str(eval_steps)}
--train_adapter
""".split()
8 changes: 4 additions & 4 deletions tests/methods/base.py
@@ -192,11 +192,11 @@ def run_load_test(self, adapter_config):
name = "dummy_adapter"
model1.add_adapter(name, config=adapter_config)
model1.set_active_adapters(name)
with tempfile.TemporaryDirectory() as temp_dir:
with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as temp_dir:
model1.save_adapter(temp_dir, name)

# Check that there are actually weights saved
weights = torch.load(os.path.join(temp_dir, WEIGHTS_NAME), map_location="cpu")
weights = torch.load(os.path.join(temp_dir, WEIGHTS_NAME), map_location="cpu", weights_only=True)
self.assertTrue(len(weights) > 0)

# also tests that set_active works
@@ -225,7 +225,7 @@ def run_full_model_load_test(self, adapter_config):

name = "dummy"
model1.add_adapter(name, config=adapter_config)
with tempfile.TemporaryDirectory() as temp_dir:
with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as temp_dir:
model1.save_pretrained(temp_dir)

model2, loading_info = load_model(temp_dir, self.model_class, output_loading_info=True)
@@ -256,7 +256,7 @@ def trainings_run(self, model, lr=1.0, steps=8):
do_train=True,
learning_rate=lr,
max_steps=steps,
no_cuda=True,
use_cpu=True,
per_device_train_batch_size=2,
remove_unused_columns=False,
)
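
`ignore_cleanup_errors=True` (a `TemporaryDirectory` option since Python 3.10) keeps cleanup from raising when a file inside the directory is still held open, which mostly matters on Windows CI. A self-contained sketch of the save/round-trip pattern these tests use, with generic file names standing in for the library's own:

```python
import os
import tempfile

import torch

# ignore_cleanup_errors=True turns a failed temp-dir cleanup (e.g. a file still
# locked on Windows) into a silent no-op instead of a spurious test error.
with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as temp_dir:
    path = os.path.join(temp_dir, "state.pt")
    torch.save({"weight": torch.zeros(2, 2)}, path)
    state = torch.load(path, map_location="cpu", weights_only=True)
    assert "weight" in state
```
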
8 changes: 4 additions & 4 deletions tests/test_adapter_conversion.py
@@ -37,7 +37,7 @@ def run_test(self, static_model, input_shape=None, label_dict=None):
):
self.skipTest("Skipping as base model classes are different.")

with tempfile.TemporaryDirectory() as temp_dir:
with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as temp_dir:
static_model.save_head(temp_dir)

loading_info = {}
@@ -193,7 +193,7 @@ def test_equivalent_language_generation(self):
static_model.eval()
flex_model.eval()

with tempfile.TemporaryDirectory() as temp_dir:
with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as temp_dir:
static_model.save_adapter(temp_dir, "dummy")

loading_info = {}
@@ -209,7 +209,7 @@ def test_equivalent_language_generation(self):
model_gen = static_model.generate(**input_samples)
flex_model_gen = flex_model.generate(**input_samples)

self.assertEquals(model_gen.shape, flex_model_gen.shape)
self.assertEqual(model_gen.shape, flex_model_gen.shape)
self.assertTrue(torch.equal(model_gen, flex_model_gen))

def test_full_model_conversion(self):
@@ -220,7 +220,7 @@ def test_full_model_conversion(self):
adapters.init(static_head_model)
static_head_model.eval()

with tempfile.TemporaryDirectory() as temp_dir:
with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as temp_dir:
static_head_model.save_pretrained(temp_dir)

flex_head_model, loading_info = AutoAdapterModel.from_pretrained(temp_dir, output_loading_info=True)
2 changes: 1 addition & 1 deletion tests/test_adapter_embeddings.py
@@ -105,7 +105,7 @@ def test_training_embedding(self):
do_train=True,
learning_rate=0.4,
max_steps=15,
no_cuda=True,
use_cpu=True,
per_device_train_batch_size=2,
label_names=["labels"],
)
2 changes: 1 addition & 1 deletion tests/test_adapter_fusion_common.py
@@ -126,7 +126,7 @@ def test_load_full_model_fusion(self):
model1.add_adapter(name2)
model1.add_adapter_fusion([name1, name2])
# save & reload model
with tempfile.TemporaryDirectory() as temp_dir:
with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as temp_dir:
model1.save_pretrained(temp_dir)

model2 = load_model(temp_dir, self.model_class)
2 changes: 1 addition & 1 deletion tests/test_adapter_heads.py
@@ -315,7 +315,7 @@ def test_load_full_model(self):
self.add_head(model, "dummy", layers=1)

true_config = model.get_prediction_heads_config()
with tempfile.TemporaryDirectory() as temp_dir:
with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as temp_dir:
# save
model.save_pretrained(temp_dir)
# reload
2 changes: 1 addition & 1 deletion tests/test_adapter_hub.py
@@ -76,7 +76,7 @@ def test_load_task_adapter_from_hub(self):
overwrite_cache=True,
)
eval_dataset = GlueDataset(data_args, tokenizer=tokenizer, mode="dev")
training_args = TrainingArguments(output_dir="./examples", no_cuda=True)
training_args = TrainingArguments(output_dir="./examples", use_cpu=True)

# evaluate
trainer = Trainer(
8 changes: 4 additions & 4 deletions tests/test_adapter_trainer.py
@@ -237,7 +237,7 @@ def test_training_load_best_model_at_end_full_model(self):
save_steps=1,
remove_unused_columns=False,
load_best_model_at_end=True,
evaluation_strategy="epoch",
eval_strategy="epoch",
save_strategy="epoch",
num_train_epochs=2,
)
@@ -273,7 +273,7 @@ def test_training_load_best_model_at_end_adapter(self):
save_steps=1,
remove_unused_columns=False,
load_best_model_at_end=True,
evaluation_strategy="epoch",
eval_strategy="epoch",
save_strategy="epoch",
num_train_epochs=2,
)
@@ -309,7 +309,7 @@ def test_training_load_best_model_at_end_fusion(self):
save_steps=1,
remove_unused_columns=False,
load_best_model_at_end=True,
evaluation_strategy="epoch",
eval_strategy="epoch",
save_strategy="epoch",
num_train_epochs=2,
)
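
Likewise, `evaluation_strategy` has been renamed to `eval_strategy` in recent `transformers` releases. A minimal sketch of the best-checkpoint setup these tests exercise (the output directory is a placeholder):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./out",
    eval_strategy="epoch",        # replaces the deprecated `evaluation_strategy`
    save_strategy="epoch",        # must align with eval_strategy for best-model tracking
    load_best_model_at_end=True,  # reload the best checkpoint when training finishes
    num_train_epochs=2,
    remove_unused_columns=False,
)
```
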
@@ -600,7 +600,7 @@ def forward(self, x):
output_dir=tempdir,
per_device_train_batch_size=1,
per_device_eval_batch_size=1,
evaluation_strategy="steps",
eval_strategy="steps",
logging_steps=10,
max_steps=5,
lr_scheduler_type="constant",