Egocentric Video-Language Pretraining @ Ego4D Challenge 2022

DOI: 10.48550/arxiv.2207.01622 Publication Date: 2022-01-01
ABSTRACT
In this report, we propose a video-language pretraining (VLP) based solution \cite{kevin2022egovlp} for four Ego4D challenge tasks: Natural Language Query (NLQ), Moment Query (MQ), Object State Change Classification (OSCC), and PNR Localization (PNR). In particular, we exploit the recently released Ego4D dataset \cite{grauman2021ego4d} to pioneer Egocentric VLP in terms of pretraining dataset, pretraining objective, and development set. Based on these three designs, we develop a pretrained video-language model that can transfer its egocentric video-text representation or video-only representation to several downstream video tasks. Our Egocentric VLP achieves 10.46 R@1 & IoU@0.3 on NLQ, 10.33 mAP on MQ, 74% accuracy on OSCC, and 0.67 sec error on PNR. The code is available at https://github.com/showlab/EgoVLP.
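The pretraining objective behind such a video-language model is a contrastive loss that pulls paired video and text embeddings together in a shared space (the report's actual objective, EgoNCE, is an egocentric-aware variant of this idea). Below is a minimal NumPy sketch of the plain symmetric InfoNCE loss over a batch of paired embeddings; the function name, array shapes, and temperature value are illustrative assumptions, not taken from the report:

```python
import numpy as np

def info_nce(video_emb, text_emb, temperature=0.05):
    """Symmetric InfoNCE over a batch of paired video/text embeddings.

    video_emb, text_emb: (B, D) arrays; row i of each forms a positive pair,
    all other rows in the batch serve as negatives. Temperature is an
    illustrative choice, not the paper's setting.
    """
    # L2-normalize so dot products are cosine similarities
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = v @ t.T / temperature      # (B, B) similarity matrix
    idx = np.arange(len(logits))        # diagonal entries are the positives

    def xent(l):
        # cross-entropy with the diagonal as target, numerically stabilized
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()

    # average the video-to-text and text-to-video directions
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
B, D = 4, 16
loss = info_nce(rng.normal(size=(B, D)), rng.normal(size=(B, D)))
print(float(loss))
```

In the real model the embeddings come from a video encoder and a text encoder trained jointly; here random arrays simply exercise the loss. Matched pairs (identical embeddings) yield a much lower loss than random pairings, which is what drives the representations together during pretraining.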