O(n^m) to O(n) for finding no target names #2372

AllenHW · 2025-02-10T09:32:59Z

The code snippet finds modules that are not targeted by the LoRA adaptor.

The previous implementation is a double for-loop over the modules in the model and the LoRA targets, and has O(n*m) runtime, where n can be up to 1000 and m can be up to 500 depending on the LoRA. The logic is meant to find model modules whose names do not end in any LoRA target, where a match must begin either right after a '.' or at the start of the name.

Instead of a double for-loop, we could split each module name by '.' to generate all of its potential suffixes and check whether any of them is in the LoRA targets, which have been turned into a lookup table. Each module name splits into fewer than 10 suffixes, so this is effectively an O(n) operation.
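For readers following along, the two approaches described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the function names `names_no_target_nested`, `names_no_target_lookup`, and `suffixes`, as well as the sample module names, are hypothetical.

```python
def names_no_target_nested(module_names, targets):
    """Nested-loop version: compare every module name against every target."""
    return [
        name
        for name in module_names
        if not any(name == t or name.endswith("." + t) for t in targets)
    ]


def suffixes(name):
    """All dot-separated suffixes of a name, e.g. 'a.b.c' -> ['a.b.c', 'b.c', 'c']."""
    parts = name.split(".")
    return [".".join(parts[i:]) for i in range(len(parts))]


def names_no_target_lookup(module_names, targets):
    """Lookup version: a handful of set membership checks per module name."""
    target_set = set(targets)  # hash-based lookup table of LoRA targets
    return [
        name
        for name in module_names
        if not any(s in target_set for s in suffixes(name))
    ]


# Hypothetical module names and LoRA targets, for illustration only
module_names = [
    "model.layers.0.attn.q_proj",
    "model.layers.0.attn.k_proj",
    "model.layers.0.mlp.fc1",
    "model.norm",
]
targets = ["q_proj", "attn.k_proj"]

print(names_no_target_nested(module_names, targets))
print(names_no_target_lookup(module_names, targets))
```

Both functions should return the same list. The second does work proportional to the number of dot-separated parts of each name rather than to the number of targets.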

This change reduces the latency of load_lora_weights() by around 0.6 seconds on an Azure A100 machine for a 300MB Flux adaptor (kishlaykumar1995/blinky-flux-lora-32). When the LoRA state_dict is already loaded on the GPU, load_lora_weights() used to take around 1.1 seconds, so this is roughly a 50% reduction in the latency of applying the LoRA.

BenjaminBossan

Thanks for suggesting this optimization. Could you please share the snippet you used to measure the time improvement?

While reviewing your code, I noticed that it's very similar to the code that is already in _find_minimal_target_modules:

peft/src/peft/tuners/tuners_utils.py

Lines 916 to 921 in 40fe166

    
    def generate_suffixes(s):
        parts = s.split(".")
        return [".".join(parts[i:]) for i in range(len(parts))][::-1]

    # Create a reverse lookup for other_module_names to quickly check suffix matches
    other_module_suffixes = {suffix for item in other_module_names for suffix in generate_suffixes(item)}
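To make the quoted helper concrete, here is a small standalone sketch of what it produces. The `generate_suffixes` body and the set comprehension are copied from the excerpt above; the module names are hypothetical and chosen only for illustration.

```python
def generate_suffixes(s):
    # Same logic as the quoted excerpt: shortest suffix first
    parts = s.split(".")
    return [".".join(parts[i:]) for i in range(len(parts))][::-1]


# Hypothetical non-target module names, for illustration only
other_module_names = ["model.layers.0.mlp.fc1", "model.norm"]

# Reverse lookup: every dot-separated suffix of every non-target module
other_module_suffixes = {
    suffix for item in other_module_names for suffix in generate_suffixes(item)
}

print(generate_suffixes("model.layers.0.mlp.fc1"))
# ['fc1', 'mlp.fc1', '0.mlp.fc1', 'layers.0.mlp.fc1', 'model.layers.0.mlp.fc1']
print("norm" in other_module_suffixes)  # True
print("q_proj" in other_module_suffixes)  # False
```

Because the suffixes are stored in a set, checking whether a candidate suffix collides with a non-target module is a single membership test.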

Therefore, I wonder if we could eliminate the generation of names_no_target altogether, since what we really need are the suffixes, not the full module names. For this, we would need to pass key_list to _find_minimal_target_modules instead of names_no_target, which would no longer be required, and then derive the suffixes directly.

In principle, changing the signature like this is fine, since it's a fully private function, but it would mean rewriting the unit tests. I think the rewrite would not be hard; it would come down to:

    def test_find_minimal_target_modules(self, target_modules, other_module_names, expected):
        # check all possible combinations of list and set
-       result = find_minimal_target_modules(target_modules, other_module_names)
+       all_module_names = target_modules + other_module_names
+       result = find_minimal_target_modules(target_modules, all_module_names)
        assert result == expected

(and then also adjusting the set vs list tests)

What do you think about this potential further optimization?

And also, please run make style on your changes to satisfy the linter.

AllenHW added 2 commits February 10, 2025 09:22

O(n^m) to O(n) for finding no target names

ed79f06

naming improvement

5dc9fdf

BenjaminBossan reviewed Feb 10, 2025
