Does multimodal-maestro support Distributed Data Parallel (DDP) training? #42

David-19940718 · 2024-09-14T02:30:27Z

Search before asking

I have searched the Multimodal Maestro issues and found no similar feature requests.

Question

Description

I'm interested in using maestro for a project that requires distributed training across multiple GPUs. I'd like to know if the project currently supports Distributed Data Parallel (DDP) training, which is a common approach for scaling deep learning models.

Questions

Does maestro currently support DDP training?
If yes, is there any documentation or examples showing how to set up and use DDP with this project?
If no, are there any plans to implement DDP support in the future?

Additional Context

DDP training can significantly speed up the training process for large models or datasets by utilizing multiple GPUs efficiently. It would be a valuable feature for users working with resource-intensive multimodal models.

Environment

Operating System: Ubuntu 20.04
Python version: 3.10
CUDA version (if applicable): 12.1
GPU model: NVIDIA RTX 3090

Thank you for your time and consideration!

Additional

No response

David-19940718 · 2024-09-14T03:32:13Z

I’ve encountered another question. :(

When training Florence2 with the default settings, the process appears to proceed without any issues. However, upon completing the training phase, the output generated_text is unexpected like that: '</s><s>9 of spades</s>'.

Could this be a bug in the framework, or is it possible that the model did not converge properly?

import os
import supervision as sv

from maestro.trainer.common.data_loaders.datasets import JSONLDataset
from maestro.trainer.models.florence_2.checkpoints import load_model

data_location = "datasets/poker cards.v4i.florence2-od"
processor, model = load_model(model_id_or_path="training/florence-2/1/checkpoints/best")

save_location = "training/florence-2/1/results"
os.makedirs(save_location, exist_ok=True)

ds = JSONLDataset(
    jsonl_file_path = f"{data_location}/valid/annotations.jsonl",
    image_directory_path = f"{data_location}/valid/"
)

image, _ = ds[2]
text = ""
task = ""

inputs = processor(text=text, images=image, return_tensors="pt").to("cuda")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    num_beams=3
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(f'generated_text: {generated_text}')
response = processor.post_process_generation(generated_text, task=task, image_size=image.size)

detections = sv.Detections.from_lmm(sv.LMM.FLORENCE_2, response, resolution_wh=image.size)

box_annotator = sv.BoxAnnotator(color_lookup=sv.ColorLookup.INDEX)
label_annotator = sv.LabelAnnotator(color_lookup=sv.ColorLookup.INDEX)

image = box_annotator.annotate(image, detections)
image = label_annotator.annotate(image, detections)
image.thumbnail((600, 600))

# Save the annotated image
save_path = os.path.join(save_location, "annotated_image.png")
image.save(save_path)
print(f"Annotated image saved to: {save_path}")

David-19940718 · 2024-09-14T03:38:18Z

SkalskiP · 2024-09-16T10:14:28Z

Hi @David-19940718 👋🏻

W chwili obecnej maestro nie wspiera jeszcze DDP ale jak najbardziej mamy plany dodać takie wsparcie i to w niedalekiej przyszłości.

As for the </s><s>9 of spades</s> output format. This is expected. The Florence-2 model adds this on its own. You need to remove those extra tags in postporcessing.

David-19940718 · 2024-09-18T02:42:50Z

Thank you for your helpful response, @SkalskiP.

Regarding DDP support, I appreciate the update that there are plans to add this capability in the near future. That's great news and will be very useful.

As for the </s><s> tags in the output format, I've actually found that we need to explicitly set certain fields to get the expected results without those extra tags. Specifically, setting the following seems to resolve the issue:

text = "<OD>"
task = "<OD>"

This approach appears to prevent the Florence-2 model from adding those tags on its own. However, I'd be interested to hear if you have any insights on why this works or if there's a more ideal way to handle it.

Thanks again for looking into this and providing such a detailed explanation. Your help is much appreciated!

David-19940718 added the question Further information is requested label Sep 14, 2024

David-19940718 closed this as completed Sep 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Does multimodal-maestro support Distributed Data Parallel (DDP) training? #42

Does multimodal-maestro support Distributed Data Parallel (DDP) training? #42

David-19940718 commented Sep 14, 2024

David-19940718 commented Sep 14, 2024 •

edited

Loading

David-19940718 commented Sep 14, 2024

SkalskiP commented Sep 16, 2024

David-19940718 commented Sep 18, 2024 •

edited

Loading

Does multimodal-maestro support Distributed Data Parallel (DDP) training? #42

Does multimodal-maestro support Distributed Data Parallel (DDP) training? #42

Comments

David-19940718 commented Sep 14, 2024

Search before asking

Question

Description

Questions

Additional Context

Environment

Additional

David-19940718 commented Sep 14, 2024 • edited Loading

David-19940718 commented Sep 14, 2024

SkalskiP commented Sep 16, 2024

David-19940718 commented Sep 18, 2024 • edited Loading

David-19940718 commented Sep 14, 2024 •

edited

Loading

David-19940718 commented Sep 18, 2024 •

edited

Loading