Could you please provide more examples to do inference on the different tasks in the paper? #234

Open
buaalyx opened this issue Jan 24, 2025 · 3 comments

buaalyx commented Jan 24, 2025

For example, temporal grounding on QVHighlights and Charades-STA.

dongdk commented Feb 5, 2025

+1

@arushirai1

Same question here. I have tried getting an output in frames or seconds and both seem to perform poorly.

@Shuaicong97

Any update? @shepnerd @buaalyx I tried using InternVideo2/demo; since it uses the pretrained BERT, the feature dimension is 512.

4 participants