Add Background Compilation for OWLv2 Vision Model #1026

lrosemberg · 2025-02-13T20:12:34Z

Description

Added a singleton manager that compiles OWLv2's vision model in the background, avoiding the initial 2-minute blocking compilation while keeping the performance benefits of compiled models.

Changes

Added OWLv2ModelManager class that manages vision model compilation per huggingface_id
Compilation happens in a daemon thread, allowing immediate model usage
Uses thread-safe singleton pattern to ensure one compiled model instance per huggingface_id
Integrated with existing Owlv2Singleton class
No changes to the model's API or behavior
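
A minimal sketch of the pattern described in the list above, not the PR's actual code. The class name `BackgroundCompileManager`, the `get` classmethod, and the `eager_fn` / `compile_fn` parameters are all hypothetical, and plain callables stand in for the vision model and the compilation step. It keeps one manager per key, starts compilation in a daemon thread, and serves calls from the uncompiled callable until the compiled one is ready.

```python
import threading
from typing import Any, Callable, Dict, Optional

ModelFn = Callable[..., Any]


class BackgroundCompileManager:
    """Hypothetical stand-in for OWLv2ModelManager (illustration only)."""

    # One manager per model key (e.g. a huggingface_id), guarded by a class-level lock.
    _instances: Dict[str, "BackgroundCompileManager"] = {}
    _instances_lock = threading.Lock()

    def __init__(self, eager_fn: ModelFn, compile_fn: Callable[[ModelFn], ModelFn]) -> None:
        self._eager_fn = eager_fn
        self._compile_fn = compile_fn
        self._compiled_fn: Optional[ModelFn] = None
        self._swap_lock = threading.Lock()
        self.compiled = threading.Event()
        # Daemon thread: it never blocks interpreter shutdown, so an unfinished
        # compilation is simply abandoned when the program exits.
        threading.Thread(target=self._compile_in_background, daemon=True).start()

    @classmethod
    def get(
        cls, model_key: str, eager_fn: ModelFn, compile_fn: Callable[[ModelFn], ModelFn]
    ) -> "BackgroundCompileManager":
        # Create the manager for this key at most once, even under concurrent calls.
        with cls._instances_lock:
            manager = cls._instances.get(model_key)
            if manager is None:
                manager = cls(eager_fn, compile_fn)
                cls._instances[model_key] = manager
            return manager

    def _compile_in_background(self) -> None:
        compiled_fn = self._compile_fn(self._eager_fn)  # the slow step
        with self._swap_lock:
            self._compiled_fn = compiled_fn
        self.compiled.set()

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Use the compiled callable once it exists, otherwise fall back to eager.
        with self._swap_lock:
            fn = self._compiled_fn if self._compiled_fn is not None else self._eager_fn
        return fn(*args, **kwargs)
```

In this sketch the registry holds strong references, so managers live for the life of the process. What `compile_fn` would be in practice is not stated here; something like `torch.compile` is a plausible candidate, but that is an assumption.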

Example Usage

model = OwlV2(model_id=f"owlv2/{OWLV2_VERSION_ID}")
# Model is immediately usable, compilation happens in background
response = model.infer_from_request(request)

Benefits

Eliminates 2-minute blocking during model initialization
Maintains compiled model performance benefits
Thread-safe
Transparent to existing code using the model

Potential Risks

Inferences that run before compilation finishes might be slightly slower (they use the uncompiled model)
Daemon threads might be killed during program termination (acceptable as compilation is non-critical)

Type of change

New feature (non-breaking change which adds functionality)

How has this change been tested, please provide a testcase or example of how you tested the change?

Added tests to verify:

Singleton behavior per huggingface_id
Proper garbage collection of manager instances
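
The two checks above could look roughly like the following sketch. It does not reproduce the PR's tests: `PerKeyManager` is a hypothetical stand-in for the manager class, the keys are placeholder strings, and using a `weakref.WeakValueDictionary` as the registry is an assumption about how garbage collection of manager instances is made possible.

```python
import gc
import threading
import weakref


class PerKeyManager:
    """Hypothetical per-key singleton whose registry holds only weak references."""

    _instances = weakref.WeakValueDictionary()
    _lock = threading.Lock()

    def __init__(self, key: str) -> None:
        self.key = key

    @classmethod
    def get(cls, key: str) -> "PerKeyManager":
        with cls._lock:
            instance = cls._instances.get(key)
            if instance is None:
                instance = cls(key)
                cls._instances[key] = instance
            return instance


def test_singleton_per_key() -> None:
    a = PerKeyManager.get("model-a")
    b = PerKeyManager.get("model-a")
    c = PerKeyManager.get("model-b")
    assert a is b        # same key yields the same instance
    assert a is not c    # different keys yield different instances


def test_manager_is_garbage_collected() -> None:
    manager = PerKeyManager.get("temporary")
    ref = weakref.ref(manager)
    del manager          # drop the only strong reference
    gc.collect()
    assert ref() is None
    assert "temporary" not in PerKeyManager._instances
```

Both functions follow the pytest naming convention, so they can be collected by pytest or called directly.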

Any specific deployment considerations

Deploy inference internal

CLAassistant · 2025-02-13T20:12:40Z

All committers have signed the CLA.


lrosemberg · 2025-02-20T17:34:53Z

Hello @PawelPeczek-Roboflow, in addition to local testing, this was also deployed to inference-internal on staging and production. It solved the problems we had with long-running inference for instant models, which previously failed because the nginx timeout on the inference-internal side was 30s. Now it runs in around 3s.

This is a smarter alternative to the previous COMPILE_OWLV2_MODEL approach. Instead of never compiling and always having slower inference, the first request returns the non-compiled instance and starts compilation in the background. Once compilation finishes, we replace the instance, which makes subsequent inferences faster.

Slack threads for context: results, discussion thread

lrosemberg added 2 commits February 13, 2025 17:08

creates OWLv2ModelManager

8dc57c7

tests for OWLv2ModelManager

9357fba

lrosemberg self-assigned this Feb 13, 2025

lrosemberg requested review from PawelPeczek-Roboflow, grzegorz-roboflow, yeldarby, probicheaux, hansent and EmilyGavrilenko as code owners February 13, 2025 20:12

lrosemberg added 5 commits February 13, 2025 17:16

format

df5d71f

format 2

afcd2ce

isort

9153532

typo

bf7c1e1

improvements

b7d9b07

grzegorz-roboflow requested changes Feb 14, 2025

inference/models/owlv2/owlv2.py (resolved)

inference/models/owlv2/owlv2.py (resolved)

lrosemberg requested a review from grzegorz-roboflow February 14, 2025 18:18

probicheaux added 2 commits February 17, 2025 23:31

Add logs

f2fee36

Merge branch 'main' into lean/singleton-owlv2-model-compile

d58223b

grzegorz-roboflow approved these changes Feb 18, 2025

formatting

968feef

grzegorz-roboflow previously approved these changes Feb 18, 2025

grzegorz-roboflow dismissed their stale review via 968feef February 18, 2025 09:25

grzegorz-roboflow previously approved these changes Feb 18, 2025

grzegorz-roboflow and others added 4 commits February 18, 2025 10:25

Merge branch 'main' into lean/singleton-owlv2-model-compile

e404965

Merge branch 'main' into lean/singleton-owlv2-model-compile

780607b

Merge branch 'lean/singleton-owlv2-model-compile' of github.com:roboflow/inference into lean/singleton-owlv2-model-compile

b612635

fix log

1197c7c

lrosemberg dismissed grzegorz-roboflow’s stale review via 1197c7c February 18, 2025 18:48

Merge branch 'main' into lean/singleton-owlv2-model-compile

76564ca

lrosemberg requested a review from grzegorz-roboflow February 20, 2025 18:02

PawelPeczek-Roboflow approved these changes Feb 21, 2025

Merge branch 'main' into lean/singleton-owlv2-model-compile

a2019a6

grzegorz-roboflow merged commit 31fc934 into main Feb 21, 2025
31 checks passed

grzegorz-roboflow deleted the lean/singleton-owlv2-model-compile branch February 21, 2025 10:32

PawelPeczek-Roboflow mentioned this pull request Feb 21, 2025

Revert "Add Background Compilation for OWLv2 Vision Model" #1046

Merged
