Transformer with Controlled Attention for Synchronous Motion Captioning
computer-vision transformer semantic-segmentation action-recognition phrase-grounding action-localization 3d-human-motion temporal-grounding motion-to-text human-motion-composition synchronized-captioning
Updated Dec 14, 2024