SERUM @ WACV 2023

SEmantic Data Engineering for Robustness Under Multimodal Settings

Website for SERUM Tutorial at WACV 2023, January 7, 2PM to 5PM

Hosted by Tejas Gokhale and Yezhou Yang (Arizona State University)

Agenda

In the past decade, we have witnessed a paradigm shift in computer vision -- the connection between vision and language (V+L) is now an integral part of AI. V+L comprises human-interactive tasks such as visual question answering, image captioning, visual dialog, visual entailment and grounding, V+L navigation, and text-to-image generation. This field has already had an impact on other research communities such as NLP, robotics, and graphics, and has direct industrial implications for software, arts, media, and journalism. As V+L models become widely adopted, new types of challenges and failure modes are emerging that have not been studied by previous work on robustness. Multi-modal tasks involving both vision and language inputs open up intriguing domain discrepancies that can affect model performance at test time.

In this tutorial, we will show how semantic data transformation -- i.e., data transformation guided by knowledge of the logical and semantic features of natural language -- can

help improve the robustness of V+L models,
enable weakly supervised learning in cases with limited or no human-annotated datasets,
enhance the quality of outputs in generative settings such as captioning, and
guide multi-modal knowledge retrieval for knowledge-based visual question answering.

Tentative Schedule

Time (UTC-10)	Topic	Presenter
1400--1415	Welcome and Introduction	Yezhou Yang (Associate Professor, ASU)
1415--1515	Plenary Talk: Towards Building Multimodal Foundation Models	Zhe Gan (Staff Research Scientist, Apple)
1515--1600	Robust Semantic Vision with Knowledge-Guided Data Transformation	Tejas Gokhale (Ph.D. Candidate, ASU)
1600--1620	Enhancing Video Captioning with Commonsense Descriptions	Yezhou Yang (Associate Professor, ASU)
1620--1645	Visual-Retriever-Reader for Knowledge-based Question Answering	Man Luo (Ph.D. Candidate, ASU)
1645--1700	Concluding Remarks	Tejas Gokhale (Ph.D. Candidate, ASU)

This website will be updated closer to the event date.

We acknowledge support from NSF Robust Intelligence grant #2132724.
