Allow saving, loading and pushing adapter compositions together (#771)
Closes #441; closes #747.

This PR introduces a set of new methods for saving, loading and pushing
entire adapter compositions with one command:
- `save_adapter_setup()`
- `load_adapter_setup()`
- `push_adapter_setup_to_hub()`

They take two main parameters:
- `adapter_setup`: the adapter composition to be saved. Accepts the same
values as `active_adapters`.
- `head_setup`: for models with heads, the head setup to save along with
the adapters. Accepts the same values as `active_head`.

Docs are available
[here](https://github.com/adapter-hub/adapters/blob/04e69957a2bfc8093e2593186f7ebb2e71f88ec9/docs/loading.md#saving-and-loading-adapter-compositions).

### Example

```python
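from adapters import AutoAdapterModel, SeqBnConfig
from adapters.composition import BatchSplit, Fuse, Stack

model = AutoAdapterModel.from_pretrained("roberta-base")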

# create a complex setup
model.add_adapter("a", config=SeqBnConfig())
model.add_adapter("b", config=SeqBnConfig())
model.add_adapter("c", config=SeqBnConfig())
model.add_adapter_fusion(["a", "b"])
model.add_classification_head("head_a")
model.add_classification_head("head_b")
adapter_setup = Stack(Fuse("a", "b"), "c")
head_setup = BatchSplit("head_a", "head_b", batch_sizes=[1, 1])
model.set_active_adapters(adapter_setup)
model.active_head = head_setup

# save
model.save_adapter_setup("checkpoint", adapter_setup, head_setup=head_setup)

# push
model.push_adapter_setup_to_hub("calpt/random_adapter_setup_test", adapter_setup, head_setup=head_setup)

# re-load
# model2 = AutoAdapterModel.from_pretrained("roberta-base")
# model2.load_adapter_setup("checkpoint", set_active=True)
```

---------

Co-authored-by: Timo Imhof <timo.imhof.uni@gmail.com>
calpt and TimoImhof authored Jan 8, 2025
1 parent 7c2357f commit 9edc20d
Showing 9 changed files with 497 additions and 8 deletions.
2 changes: 2 additions & 0 deletions docs/adapter_composition.md
@@ -125,6 +125,8 @@ model.active_adapters = ac.Fuse("d", "e", "f")

To learn how training an _AdapterFusion_ layer works, check out [this Colab notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/03_Adapter_Fusion.ipynb) from the `adapters` repo.

To save and upload the full composition setup with adapters and fusion layer in one line of code, check out the docs on [saving and loading adapter compositions](loading.md#saving-and-loading-adapter-compositions).

### Retrieving AdapterFusion attentions

Finally, it is possible to retrieve the attention scores computed by each fusion layer in a forward pass of the model.
36 changes: 36 additions & 0 deletions docs/loading.md
@@ -94,3 +94,39 @@ We will go through the different arguments and their meaning one by one:
To load the adapter using a custom name, we can use the `load_as` parameter.

- Finally, `set_active` will directly activate the loaded adapter for usage in each model forward pass. Otherwise, you have to manually activate the adapter via `set_active_adapters()`.

## Saving and loading adapter compositions

In addition to saving and loading individual adapters, you can also save, load and share entire [compositions of adapters](adapter_composition.md) with a single line of code.
_Adapters_ provides three methods for this purpose that work very similarly to those for single adapters:

- [`save_adapter_setup()`](adapters.ModelWithHeadsAdaptersMixin.save_adapter_setup) to save an adapter composition along with prediction heads to the local file system.
- [`load_adapter_setup()`](adapters.ModelWithHeadsAdaptersMixin.load_adapter_setup) to load a saved adapter composition from the local file system or the Model Hub.
- [`push_adapter_setup_to_hub()`](adapters.hub_mixin.PushAdapterToHubMixin.push_adapter_setup_to_hub) to upload an adapter setup along with prediction heads to the Model Hub. See our [Hugging Face Model Hub guide](huggingface_hub.md) for more.

As an example, this is how you would save and load an AdapterFusion setup of three adapters with a prediction head:

```python
# Create an AdapterFusion
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.load_adapter("sentiment/sst-2@ukp", config=SeqBnConfig(), with_head=False)
model.load_adapter("nli/multinli@ukp", config=SeqBnConfig(), with_head=False)
model.load_adapter("sts/qqp@ukp", config=SeqBnConfig(), with_head=False)
model.add_adapter_fusion(["sst-2", "mnli", "qqp"])
model.add_classification_head("clf_head")
adapter_setup = Fuse("sst-2", "mnli", "qqp")
head_setup = "clf_head"
model.set_active_adapters(adapter_setup)
model.active_head = head_setup

# Train AdapterFusion ...

# Save
model.save_adapter_setup("checkpoint", adapter_setup, head_setup=head_setup)

# Push to Hub
model.push_adapter_setup_to_hub("<user>/fusion_setup", adapter_setup, head_setup=head_setup)

# Re-load
# model.load_adapter_setup("checkpoint", set_active=True)
```
2 changes: 1 addition & 1 deletion docs/quickstart.md
@@ -105,7 +105,7 @@ model = AutoAdapterModel.from_pretrained(example_path)
model.load_adapter(example_path)
```

Similar to how the weights of the full model are saved, the `save_adapter()` will create a file for saving the adapter weights and a file for saving the adapter configuration in the specified directory.
Similar to how the weights of the full model are saved, [`save_adapter()`](adapters.ModelWithHeadsAdaptersMixin.save_adapter) will create a file for saving the adapter weights and a file for saving the adapter configuration in the specified directory.

Finally, if we have finished working with adapters, we can restore the base Transformer to its original form by deactivating and deleting the adapter:

35 changes: 35 additions & 0 deletions src/adapters/composition.py
@@ -1,4 +1,5 @@
import itertools
import sys
import warnings
from collections.abc import Sequence
from typing import List, Optional, Set, Tuple, Union
@@ -45,6 +46,31 @@ def parallel_channels(self):
    def flatten(self) -> Set[str]:
        return set(itertools.chain(*[[b] if isinstance(b, str) else b.flatten() for b in self.children]))

    def _get_save_kwargs(self):
        return None

    def to_dict(self):
        save_dict = {
            "type": self.__class__.__name__,
            "children": [
                c.to_dict() if isinstance(c, AdapterCompositionBlock) else {"type": "single", "children": [c]}
                for c in self.children
            ],
        }
        if kwargs := self._get_save_kwargs():
            save_dict["kwargs"] = kwargs
        return save_dict

    @classmethod
    def from_dict(cls, data):
        children = []
        for child in data["children"]:
            if child["type"] == "single":
                children.append(child["children"][0])
            else:
                children.append(cls.from_dict(child))
        return getattr(sys.modules[__name__], data["type"])(*children, **data.get("kwargs", {}))


class Parallel(AdapterCompositionBlock):
    def __init__(self, *parallel_adapters: List[str]):
@@ -80,12 +106,18 @@ def __init__(self, *split_adapters: List[Union[AdapterCompositionBlock, str]], s
        super().__init__(*split_adapters)
        self.splits = splits if isinstance(splits, list) else [splits] * len(split_adapters)

    def _get_save_kwargs(self):
        return {"splits": self.splits}


class BatchSplit(AdapterCompositionBlock):
    def __init__(self, *split_adapters: List[Union[AdapterCompositionBlock, str]], batch_sizes: Union[List[int], int]):
        super().__init__(*split_adapters)
        self.batch_sizes = batch_sizes if isinstance(batch_sizes, list) else [batch_sizes] * len(split_adapters)

    def _get_save_kwargs(self):
        return {"batch_sizes": self.batch_sizes}


class Average(AdapterCompositionBlock):
    def __init__(
@@ -105,6 +137,9 @@ def __init__(
        else:
            self.weights = [1 / len(average_adapters)] * len(average_adapters)

    def _get_save_kwargs(self):
        return {"weights": self.weights}


# Mapping each composition block type to the allowed nested types
ALLOWED_NESTINGS = {
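The `to_dict()` / `from_dict()` hooks added in this file can be tried out in isolation. The following is a minimal sketch rather than the library's code: the classes are simplified stand-ins that keep only `children` and the save kwargs, and the `_registry` lookup is an assumption that replaces the `sys.modules[__name__]` lookup from the diff so that the snippet runs outside the `adapters` package.

```python
import json


class AdapterCompositionBlock:
    """Simplified stand-in that mirrors the (de)serialization logic of the diff."""

    # Assumption: a name -> class registry instead of the module lookup in the diff.
    _registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        AdapterCompositionBlock._registry[cls.__name__] = cls

    def __init__(self, *children):
        self.children = list(children)

    def _get_save_kwargs(self):
        # Blocks without extra constructor arguments save no kwargs.
        return None

    def to_dict(self):
        save_dict = {
            "type": self.__class__.__name__,
            "children": [
                c.to_dict() if isinstance(c, AdapterCompositionBlock) else {"type": "single", "children": [c]}
                for c in self.children
            ],
        }
        if kwargs := self._get_save_kwargs():
            save_dict["kwargs"] = kwargs
        return save_dict

    @classmethod
    def from_dict(cls, data):
        children = []
        for child in data["children"]:
            if child["type"] == "single":
                # Leaf entry: a plain adapter (or head) name.
                children.append(child["children"][0])
            else:
                children.append(cls.from_dict(child))
        block_cls = AdapterCompositionBlock._registry[data["type"]]
        return block_cls(*children, **data.get("kwargs", {}))


class Stack(AdapterCompositionBlock):
    pass


class Fuse(AdapterCompositionBlock):
    pass


class BatchSplit(AdapterCompositionBlock):
    def __init__(self, *split_adapters, batch_sizes):
        super().__init__(*split_adapters)
        self.batch_sizes = batch_sizes if isinstance(batch_sizes, list) else [batch_sizes] * len(split_adapters)

    def _get_save_kwargs(self):
        return {"batch_sizes": self.batch_sizes}


# Round trip for the setups used in the commit message example.
adapter_setup = Stack(Fuse("a", "b"), "c")
adapter_dict = json.loads(json.dumps(adapter_setup.to_dict()))
restored_setup = AdapterCompositionBlock.from_dict(adapter_dict)

head_setup = BatchSplit("head_a", "head_b", batch_sizes=[1, 1])
head_dict = head_setup.to_dict()
restored_head = AdapterCompositionBlock.from_dict(head_dict)
```

Plain adapter names are wrapped as `{"type": "single", ...}` leaves, so the resulting dictionary contains only strings, lists and dictionaries and survives a JSON round trip unchanged.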
95 changes: 92 additions & 3 deletions src/adapters/hub_mixin.py
@@ -4,6 +4,8 @@

from transformers.utils.generic import working_or_temp_dir

from .composition import AdapterCompositionBlock


logger = logging.getLogger(__name__)

@@ -35,7 +37,7 @@
from adapters import AutoAdapterModel
model = AutoAdapterModel.from_pretrained("{model_name}")
adapter_name = model.load_adapter("{adapter_repo_name}", set_active=True)
adapter_name = model.{load_fn}("{adapter_repo_name}", set_active=True)
```
## Architecture & Training
@@ -66,6 +68,7 @@ def _save_adapter_card(
        language: Optional[str] = None,
        license: Optional[str] = None,
        metrics: Optional[List[str]] = None,
        load_fn: str = "load_adapter",
        **kwargs,
    ):
        # Key remains "adapter-transformers", see: https://github.com/huggingface/huggingface.js/pull/459
@@ -103,6 +106,7 @@ def _save_adapter_card(
            model_name=self.model_name,
            dataset_name=dataset_name,
            head_info=head_info,
            load_fn=load_fn,
            adapter_repo_name=adapter_repo_name,
            architecture_training=kwargs.pop("architecture_training", DEFAULT_TEXT),
            results=kwargs.pop("results", DEFAULT_TEXT),
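The new `load_fn` argument only changes which loading method the generated adapter card recommends. A minimal sketch of that substitution follows; `CARD_USAGE_TEMPLATE` and `render_usage` are hypothetical names introduced for illustration, and the template is a shortened stand-in for the usage section of the real card template.

```python
# Hypothetical, shortened stand-in for the usage section of the adapter card template.
CARD_USAGE_TEMPLATE = """from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("{model_name}")
adapter_name = model.{load_fn}("{adapter_repo_name}", set_active=True)
"""


def render_usage(model_name, adapter_repo_name, load_fn="load_adapter"):
    """Fill the usage snippet; `load_fn` selects the method name shown to the reader."""
    return CARD_USAGE_TEMPLATE.format(
        model_name=model_name,
        adapter_repo_name=adapter_repo_name,
        load_fn=load_fn,
    )


# Single adapters keep the default, adapter setups pass the new method name.
single_usage = render_usage("roberta-base", "<user>/adapter")
setup_usage = render_usage("roberta-base", "<user>/fusion_setup", load_fn="load_adapter_setup")
```

Keeping `"load_adapter"` as the default means existing callers of the card generation produce the same text as before.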
@@ -133,8 +137,6 @@ def push_adapter_to_hub(
        Args:
            repo_id (str): The name of the repository on the model hub to upload to.
            adapter_name (str): The name of the adapter to be uploaded.
            organization (str, optional): Organization in which to push the adapter
                (you must be a member of this organization). Defaults to None.
            datasets_tag (str, optional): Dataset identifier from https://huggingface.co/datasets. Defaults to
                None.
            local_path (str, optional): Local path used as clone directory of the adapter repository.
@@ -156,6 +158,8 @@ def push_adapter_to_hub(
                Branch to push the uploaded files to.
            commit_description (`str`, *optional*):
                The description of the commit that will be created
            adapter_card_kwargs (Optional[dict], optional): Additional arguments to pass to the adapter card text generation.
                Currently includes: tags, language, license, metrics, architecture_training, results, citation.
        Returns:
            str: The url of the adapter repository on the model hub.
@@ -190,3 +194,88 @@ def push_adapter_to_hub(
                revision=revision,
                commit_description=commit_description,
            )

    def push_adapter_setup_to_hub(
        self,
        repo_id: str,
        adapter_setup: Union[str, list, AdapterCompositionBlock],
        head_setup: Optional[Union[bool, str, list, AdapterCompositionBlock]] = None,
        datasets_tag: Optional[str] = None,
        local_path: Optional[str] = None,
        commit_message: Optional[str] = None,
        private: Optional[bool] = None,
        token: Optional[Union[bool, str]] = None,
        overwrite_adapter_card: bool = False,
        create_pr: bool = False,
        revision: str = None,
        commit_description: str = None,
        adapter_card_kwargs: Optional[dict] = None,
    ):
        """Upload an adapter setup to HuggingFace's Model Hub.
        Args:
            repo_id (str): The name of the repository on the model hub to upload to.
            adapter_setup (Union[str, list, AdapterCompositionBlock]): The adapter setup to be uploaded. Usually an adapter composition block.
            head_setup (Optional[Union[bool, str, list, AdapterCompositionBlock]], optional): The head setup to be uploaded.
            datasets_tag (str, optional): Dataset identifier from https://huggingface.co/datasets. Defaults to
                None.
            local_path (str, optional): Local path used as clone directory of the adapter repository.
                If not specified, will create a temporary directory. Defaults to None.
            commit_message (:obj:`str`, `optional`):
                Message to commit while pushing. Will default to :obj:`"add config"`, :obj:`"add tokenizer"` or
                :obj:`"add model"` depending on the type of the class.
            private (:obj:`bool`, `optional`):
                Whether or not the repository created should be private (requires a paying subscription).
            token (`bool` or `str`, *optional*):
                The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
                when running `huggingface-cli login` (stored in `~/.huggingface`). Will default to `True` if `repo_url`
                is not specified.
            overwrite_adapter_card (bool, optional): Overwrite an existing adapter card with a newly generated one.
                If set to `False`, will only generate an adapter card, if none exists. Defaults to False.
            create_pr (bool, optional):
                Whether or not to create a PR with the uploaded files or directly commit.
            revision (`str`, *optional*):
                Branch to push the uploaded files to.
            commit_description (`str`, *optional*):
                The description of the commit that will be created
            adapter_card_kwargs (Optional[dict], optional): Additional arguments to pass to the adapter card text generation.
                Currently includes: tags, language, license, metrics, architecture_training, results, citation.
        Returns:
            str: The url of the adapter repository on the model hub.
        """
        use_temp_dir = not os.path.isdir(local_path) if local_path else True

        # Create repo or retrieve an existing repo
        repo_id = self._create_repo(repo_id, private=private, token=token)

        # Commit and push
        logger.info('Pushing adapter setup "%s" to model hub at %s ...', adapter_setup, repo_id)
        with working_or_temp_dir(working_dir=local_path, use_temp_dir=use_temp_dir) as work_dir:
            files_timestamps = self._get_files_timestamps(work_dir)
            # Save adapter and optionally create model card
            if head_setup is not None:
                save_kwargs = {"head_setup": head_setup}
            else:
                save_kwargs = {}
            self.save_adapter_setup(work_dir, adapter_setup, **save_kwargs)
            if overwrite_adapter_card or not os.path.exists(os.path.join(work_dir, "README.md")):
                adapter_card_kwargs = adapter_card_kwargs or {}
                self._save_adapter_card(
                    work_dir,
                    str(adapter_setup),
                    repo_id,
                    datasets_tag=datasets_tag,
                    load_fn="load_adapter_setup",
                    **adapter_card_kwargs,
                )
            return self._upload_modified_files(
                work_dir,
                repo_id,
                files_timestamps,
                commit_message=commit_message,
                token=token,
                create_pr=create_pr,
                revision=revision,
                commit_description=commit_description,
            )
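The local decisions that `push_adapter_setup_to_hub()` makes before uploading can be checked without touching the Hub. The sketch below isolates three of them under stated assumptions: the helper names `resolve_use_temp_dir`, `build_save_kwargs` and `needs_adapter_card` are hypothetical and do not exist in the library; only the expressions inside them are taken from the method body above.

```python
import os
import tempfile


def resolve_use_temp_dir(local_path=None):
    """A temporary directory is used unless `local_path` is an existing directory."""
    return not os.path.isdir(local_path) if local_path else True


def build_save_kwargs(head_setup=None):
    """`head_setup` is only forwarded to the save call when it was given."""
    return {"head_setup": head_setup} if head_setup is not None else {}


def needs_adapter_card(work_dir, overwrite_adapter_card=False):
    """A card is written if requested explicitly or if no README.md exists yet."""
    return overwrite_adapter_card or not os.path.exists(os.path.join(work_dir, "README.md"))


with tempfile.TemporaryDirectory() as work_dir:
    # An existing directory is used as the working directory directly.
    existing_dir_uses_temp = resolve_use_temp_dir(work_dir)

    # Empty directory: a card would be generated.
    card_needed_before = needs_adapter_card(work_dir)

    with open(os.path.join(work_dir, "README.md"), "w") as f:
        f.write("# existing card\n")

    # An existing card is kept unless overwriting is requested.
    card_needed_after = needs_adapter_card(work_dir)
    card_needed_overwrite = needs_adapter_card(work_dir, overwrite_adapter_card=True)
```

In particular, pushing into a `local_path` that already holds a `README.md` leaves that file untouched unless `overwrite_adapter_card=True` is passed.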