- Video Analysis and Summarization
- Multimodal Machine Learning Applications
- Human Motion and Animation
- Advanced Vision and Imaging
- Humor Studies and Applications
- Human Pose and Action Recognition
- Music and Audio Processing
- Subtitles and Audiovisual Media
- Brain Tumor Detection and Classification
- Reinforcement Learning in Robotics
- Advanced Optical Imaging Technologies
- 3D Shape Modeling and Analysis
- Augmented Reality Applications
- Computer Graphics and Visualization Techniques
- Cell Image Analysis Techniques
- Robotic Path Planning Algorithms
- Video Coding and Compression Technologies
- COVID-19 Diagnosis Using AI
- Evacuation and Crowd Dynamics
- Advanced Image Processing Techniques
- Hand Gesture Recognition Systems
- Advanced Image and Video Retrieval Techniques
- Robot Manipulation and Learning
École Polytechnique
2021-2024
Laboratoire d'Informatique de l'École Polytechnique
2021-2024
Université de Rennes
2022-2023
Institut de Recherche en Informatique et Systèmes Aléatoires
2022-2023
Centre National de la Recherche Scientifique
2021-2023
Institut national de recherche en informatique et en automatique
2022-2023
This paper presents JAWS, an optimization-driven approach that achieves the robust transfer of visual cinematic features from a reference in-the-wild video clip to a newly generated clip. To this end, we rely on an implicit-neural-representation (INR) in a way to compute a clip that shares the same cinematic features as the reference. We propose a general formulation of the camera optimization problem in an INR that computes extrinsic and intrinsic camera parameters as well as timing. By leveraging the differentiability of neural representations, we can back-propagate our designed losses...
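The abstract is cut off above, but the key mechanism it names, back-propagating losses to extrinsic and intrinsic camera parameters through a differentiable representation, can be illustrated independently of JAWS. The sketch below is a minimal stand-in under simplified assumptions (a fixed 3D point cloud instead of a trained INR, translation-only extrinsics, a single focal-length intrinsic); names such as `points_3d` and `ref_2d` are invented for the example.

```python
import torch

# Toy "scene": fixed 3D points standing in for a trained implicit representation.
torch.manual_seed(0)
points_3d = torch.randn(100, 3) + torch.tensor([0.0, 0.0, 5.0])

# Reference 2D projections produced by a hidden ground-truth camera.
true_t = torch.tensor([0.4, -0.2, 0.0])
true_f = 2.0
cam_ref = points_3d + true_t
ref_2d = true_f * cam_ref[:, :2] / cam_ref[:, 2:3]

# Parameters to recover: extrinsic translation t and intrinsic focal length f.
t = torch.zeros(3, requires_grad=True)
f = torch.tensor(1.0, requires_grad=True)
opt = torch.optim.Adam([t, f], lr=1e-2)

for _ in range(2000):
    cam = points_3d + t                   # extrinsics (translation only, for brevity)
    proj = f * cam[:, :2] / cam[:, 2:3]   # differentiable pinhole projection (intrinsics)
    loss = ((proj - ref_2d) ** 2).mean()  # stand-in for the paper's designed losses
    opt.zero_grad()
    loss.backward()                       # gradients reach both t and f
    opt.step()

# Projections should now match the reference; t and f may trade off slightly
# due to the usual focal-length / depth ambiguity.
print(t.detach(), f.item())
```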
Automatically understanding funny moments (i.e., the moments that make people laugh) when watching comedy is challenging, as they relate to various features, such as body language, dialogues and culture. In this paper, we propose FunnyNet-W, a model that relies on cross- and self-attention for visual, audio and text data to predict funny moments in videos. Unlike most methods that rely on ground truth data in the form of subtitles, in this work we exploit modalities that come naturally with videos: (a) video frames, as they contain visual information indispensable for scene...
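The fusion pattern named in the abstract (cross- and self-attention over visual, audio and text streams) can be sketched generically. This is not the FunnyNet-W architecture; dimensions, module names and the pooling head are all assumptions made for a runnable toy example.

```python
import torch
import torch.nn as nn

class CrossSelfFusion(nn.Module):
    """Toy tri-modal fusion: audio/text tokens cross-attend to visual tokens,
    then one self-attention pass over the fused sequence."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, 1)  # clip-level "funny / not funny" logit

    def forward(self, vis, aud, txt):
        ctx = torch.cat([aud, txt], dim=1)    # queries from audio + text tokens
        fused, _ = self.cross(ctx, vis, vis)  # cross-attention onto visual keys/values
        fused, _ = self.self_attn(fused, fused, fused)
        return self.head(fused.mean(dim=1))   # mean-pool, then predict

vis = torch.randn(2, 16, 256)  # 2 clips, 16 visual tokens each
aud = torch.randn(2, 8, 256)   # 8 audio tokens
txt = torch.randn(2, 12, 256)  # 12 text tokens
print(CrossSelfFusion()(vis, aud, txt).shape)  # torch.Size([2, 1])
```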
Stories and emotions in movies emerge through the effect of well-thought-out directing decisions, in particular camera placement and movement over time. Crafting compelling camera trajectories remains a complex iterative process, even for skilful artists. To tackle this, in this paper we propose a dataset called the Exceptional Trajectories (E.T.) with camera trajectories along with character information and textual captions encompassing descriptions of both camera and character. To our knowledge, this is the first dataset of its kind. To show the potential applications of the E.T. dataset,...
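The abstract stops before describing the data format, so the record below is purely speculative: a guess at what one trajectory sample might bundle (camera poses, character positions, paired captions). Every field name is hypothetical, not taken from the E.T. release.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class TrajectorySample:
    """Hypothetical E.T.-style record: camera trajectory + character info + captions."""
    camera_poses: np.ndarray         # (T, 4, 4) camera-to-world matrices over T frames
    character_positions: np.ndarray  # (T, 3) character root positions
    camera_caption: str              # text describing the camera motion
    character_caption: str           # text describing the character motion

sample = TrajectorySample(
    camera_poses=np.tile(np.eye(4), (120, 1, 1)),
    character_positions=np.zeros((120, 3)),
    camera_caption="The camera pushes in slowly toward the character.",
    character_caption="The character walks forward at a steady pace.",
)
print(sample.camera_poses.shape)  # (120, 4, 4)
```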
Recent advances in text-conditioned video diffusion have greatly improved video quality. However, these methods offer limited or sometimes no control to users on camera aspects, including dynamic camera motion, zoom, distorted lens and focus shifts. These motion and optical aspects are crucial for adding controllability of cinematic elements to generation frameworks, ultimately resulting in visual content that draws focus, enhances mood, and guides emotions according to filmmakers' controls. In this paper, we aim to close the...
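The camera aspects listed (zoom, lens distortion) are simple functions of the intrinsics, which is what makes them natural control knobs for a generator. As a generic camera-model sketch, not the paper's method, the snippet below applies a focal-length scale and a first-order Brown-Conrady radial distortion to normalized pixel coordinates; the function name and default coefficients are invented.

```python
import numpy as np

def apply_zoom_and_distortion(xy, zoom=1.5, k1=-0.2):
    """Map normalized image coordinates through a zoom and a radial distortion.

    xy:   (N, 2) points, origin at the principal point.
    zoom: focal-length multiplier (>1 narrows the field of view).
    k1:   first radial distortion coefficient (negative = barrel distortion).
    """
    r2 = np.sum(xy ** 2, axis=1, keepdims=True)
    distorted = xy * (1.0 + k1 * r2)  # Brown-Conrady radial term, first order
    return zoom * distorted

# Warp a small grid of image points.
grid = np.stack(np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5)), -1).reshape(-1, 2)
print(apply_zoom_and_distortion(grid)[:3])
```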
The artistic crafting of 3D animations by designers is a complex and iterative process. While classical animation tools have brought significant improvements in creating and manipulating shapes over time, most approaches rely on 2D input devices to create contents. With the advent of virtual reality technologies and their ability to immerse users in 3D worlds and to precisely track input devices in 6 dimensions (position and orientation), a number of VR creative tools have emerged, such as Quill, AnimVR, Tvori, Tiltbrush or MasterPieceVR. While these provide...
Transformers were initially introduced for natural language processing (NLP) tasks, but they were quickly adopted by most deep learning fields, including computer vision. They measure the relationships between pairs of input tokens (words in the case of text strings, parts of images for visual Transformers), termed attention. The cost is quadratic in the number of tokens. For image classification, the most common Transformer architecture uses only the Transformer encoder in order to transform the various input tokens. However, there are also numerous other...
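To make the cost statement concrete: scaled dot-product attention materializes an n x n score matrix between all token pairs, so doubling the token count quadruples the work and memory. A minimal generic implementation in NumPy (textbook attention, not any specific paper's variant):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over n tokens of dimension d."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (n, n): the quadratic-cost pairwise term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

n, d = 8, 16
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (8, 16); the score matrix is (8, 8)
```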
Neural Radiance Fields (NeRFs) have revolutionized novel view synthesis, offering visually realistic, precise, and robust implicit scene reconstructions. While recent approaches enable NeRF editing, such as object removal, 3D shape modification, or material property manipulation, the manual annotation required prior to such edits makes the process tedious. Additionally, traditional 2D interaction tools lack an accurate sense of space, preventing precise manipulation and editing of scenes. In this paper, we introduce...
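For readers unfamiliar with the implicit reconstructions mentioned above: at its core, a NeRF is an MLP mapping a positionally encoded 3D point to color and density (view-direction input and volume rendering are omitted here). The sketch below is a stripped-down illustration with made-up layer sizes, not any specific paper's model.

```python
import torch
import torch.nn as nn

def positional_encoding(x, n_freqs=6):
    """NeRF-style sin/cos encoding of coordinates."""
    freqs = 2.0 ** torch.arange(n_freqs) * torch.pi
    enc = [fn(x[..., None] * freqs) for fn in (torch.sin, torch.cos)]
    return torch.cat([e.flatten(-2) for e in enc], dim=-1)

class TinyField(nn.Module):
    """Minimal implicit field: encoded 3D point -> (RGB, density)."""
    def __init__(self, n_freqs=6, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * 2 * n_freqs, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 3 color channels + 1 density
        )

    def forward(self, pts):
        out = self.mlp(positional_encoding(pts))
        return torch.sigmoid(out[..., :3]), torch.relu(out[..., 3])

rgb, sigma = TinyField()(torch.rand(1024, 3))
print(rgb.shape, sigma.shape)  # torch.Size([1024, 3]) torch.Size([1024])
```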