Jiajun Deng

ORCID: 0000-0001-9624-7451
Research Areas
  • Advanced Neural Network Applications
  • Advanced Image and Video Retrieval Techniques
  • Robotics and Sensor-Based Localization
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • 2D Materials and Applications
  • Human Pose and Action Recognition
  • Advanced Vision and Imaging
  • Handwritten Text Recognition Techniques
  • Visual Attention and Saliency Detection
  • Graphene research and applications
  • 3D Surveying and Cultural Heritage
  • Remote-Sensing Image Classification
  • Rock Mechanics and Modeling
  • MXene and MAX Phase Materials
  • Optical Wireless Communication Technologies
  • Advanced Image Processing Techniques
  • Image Processing Techniques and Applications
  • Digital Media Forensic Detection
  • Industrial Vision Systems and Defect Detection
  • Ga2O3 and related materials
  • Advanced Photocatalysis Techniques
  • Medical Image Segmentation Techniques
  • Advanced Optical Sensing Technologies
  • Grouting, Rheology, and Soil Mechanics

Sun Yat-sen University
2025

University of Science and Technology of China
2018-2024

Tongji University
2022-2024

Australian Centre for Robotic Vision
2023-2024

The University of Adelaide
2023-2024

National University of Defense Technology
2024

North China Electric Power University
2018-2024

The University of Sydney
2022-2024

Guizhou University
2024

Shanghai University
2024

Recent advances on 3D object detection heavily rely on how the data are represented, i.e., voxel-based or point-based representation. Many existing high-performance detectors are point-based because this structure can better retain precise point positions. Nevertheless, point-level features lead to computation overheads due to their unordered storage. In contrast, the voxel-based structure is better suited for feature extraction but often yields lower accuracy as the input is divided into grids. In this paper, we take a slightly different viewpoint --- we find that...

10.1609/aaai.v35i2.16207 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18
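
The abstract above contrasts point-based and voxel-based representations; below is a minimal, hedged sketch of the voxel-based side, assuming a simple uniform grid with mean-pooled point positions as voxel features (the detector's actual voxelization is not specified here).

# Minimal voxelization sketch (assumption: uniform grid, mean-pooled points per voxel).
import numpy as np

def voxelize(points, voxel_size=0.2, max_voxels=20000):
    # points: (N, 3) array of x, y, z coordinates
    coords = np.floor(points / voxel_size).astype(np.int64)          # grid index per point
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)   # one row per occupied voxel
    inverse = inverse.reshape(-1)
    feats = np.zeros((len(uniq), 3), dtype=np.float32)
    counts = np.zeros(len(uniq), dtype=np.float32)
    np.add.at(feats, inverse, points)                                 # sum point coords per voxel
    np.add.at(counts, inverse, 1.0)
    feats /= counts[:, None]                                          # mean position as voxel feature
    return uniq[:max_voxels], feats[:max_voxels]

coords, feats = voxelize(np.random.rand(10000, 3) * 50.0)
print(coords.shape, feats.shape)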

3D object detection is receiving increasing attention from both industry and academia thanks to its wide applications in various fields. In this paper, we propose Point-Voxel Region-based Convolution Neural Networks (PV-RCNNs) for 3D object detection on point clouds. First, we propose a novel detector, PV-RCNN, which boosts the performance by deeply integrating the feature learning of point-based set abstraction and voxel-based sparse convolution through two steps, i.e., voxel-to-keypoint scene encoding and keypoint-to-grid...

10.1007/s11263-022-01710-9 article EN cc-by International Journal of Computer Vision 2022-11-24
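
A rough sketch of the voxel-to-keypoint idea mentioned above: sampled keypoints aggregate features from nearby voxel centers. The radius query, mean pooling, and random keypoint choice below are illustrative assumptions, not the paper's exact set-abstraction layers.

# Illustrative voxel-to-keypoint aggregation (assumed radius query + mean pooling).
import numpy as np

def voxel_to_keypoint(voxel_centers, voxel_feats, keypoints, radius=1.0):
    # voxel_centers: (V, 3), voxel_feats: (V, C), keypoints: (K, 3)
    agg = np.zeros((len(keypoints), voxel_feats.shape[1]), dtype=np.float32)
    for i, kp in enumerate(keypoints):
        mask = np.linalg.norm(voxel_centers - kp, axis=1) < radius    # voxels inside the ball
        if mask.any():
            agg[i] = voxel_feats[mask].mean(axis=0)                   # pool neighboring voxel features
    return agg

centers = np.random.rand(500, 3) * 20
feats = np.random.rand(500, 32).astype(np.float32)
kps = centers[np.random.choice(500, 16, replace=False)]               # stand-in for sampled keypoints
print(voxel_to_keypoint(centers, feats, kps).shape)                   # (16, 32)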

In this paper, we present a neat yet effective transformer-based framework for visual grounding, namely TransVG, to address the task of grounding a language query to the corresponding region of an image. The state-of-the-art methods, including two-stage or one-stage ones, rely on a complex module with manually-designed mechanisms to perform the query reasoning and multi-modal fusion. However, the involvement of certain mechanisms in the fusion module design, such as query decomposition and image scene graph, makes the models easily overfit to datasets...

10.1109/iccv48922.2021.00179 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01
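
A compact PyTorch sketch of the box-regression idea described above: visual tokens, language tokens, and a learnable query token pass through a plain Transformer encoder, and a small head regresses the box from the query token. The dimensions, token names, and head design are assumptions for illustration, not the published model.

# Hedged sketch: transformer fusion of visual/text tokens with direct box regression.
import torch
import torch.nn as nn

class GroundingSketch(nn.Module):
    def __init__(self, d=256, heads=8, layers=4):
        super().__init__()
        self.reg_token = nn.Parameter(torch.zeros(1, 1, d))           # learnable query token
        enc_layer = nn.TransformerEncoderLayer(d, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, layers)
        self.box_head = nn.Linear(d, 4)                               # (cx, cy, w, h), normalized

    def forward(self, vis_tokens, txt_tokens):
        # vis_tokens: (B, Nv, d) flattened image features; txt_tokens: (B, Nt, d)
        b = vis_tokens.size(0)
        x = torch.cat([self.reg_token.expand(b, -1, -1), vis_tokens, txt_tokens], dim=1)
        x = self.encoder(x)
        return self.box_head(x[:, 0]).sigmoid()                       # box regressed from the query token

model = GroundingSketch()
print(model(torch.randn(2, 400, 256), torch.randn(2, 20, 256)).shape)  # torch.Size([2, 4])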

It has been well recognized that modeling object-to-object relations would be helpful for object detection. Nevertheless, the problem is not trivial especially when exploring the interactions between objects to boost video object detectors. The difficulty originates from the aspect that reliable relations in a video should depend on not only the objects in the present frame but also all the supportive objects extracted over a long-range span of the video. In this paper, we introduce a new design to capture the interactions across objects in spatio-temporal context. Specifically, we present Relation...

10.1109/iccv.2019.00712 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01
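
A simplified sketch of relating current-frame proposals to supportive proposals from other frames via attention, in the spirit of the relation modeling the abstract refers to; the single-head scaled dot-product form and random projections below are stand-ins, not the paper's module.

# Hedged sketch: attention from present-frame proposals to supportive proposals of nearby frames.
import torch
import torch.nn.functional as F

def relate(cur_feats, support_feats, dim=128):
    # cur_feats: (N, C) proposals of the present frame; support_feats: (M, C) from nearby frames
    q = cur_feats @ torch.randn(cur_feats.size(1), dim)               # random stand-in for learned projections
    k = support_feats @ torch.randn(support_feats.size(1), dim)
    v = support_feats
    attn = F.softmax(q @ k.t() / dim ** 0.5, dim=-1)                  # proposal-to-proposal affinity
    return cur_feats + attn @ v                                       # residual relation-enhanced feature

out = relate(torch.randn(32, 256), torch.randn(96, 256))
print(out.shape)  # torch.Size([32, 256])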

It has been well recognized that fusing the complementary information from depth-aware LiDAR point clouds and semantic-rich stereo images would benefit 3D object detection. Nevertheless, it is non-trivial to explore the inherently unnatural interaction between sparse 3D points and dense 2D pixels. To ease this difficulty, recent approaches generally project the 3D points onto the image plane to sample the image data and then aggregate the data at the points. However, these approaches often suffer from the mismatch between the resolution of point clouds and RGB images, leading to sub-optimal...

10.1109/tmm.2022.3189778 article EN IEEE Transactions on Multimedia 2022-07-11
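
A small sketch of the point-to-pixel sampling step described above: LiDAR points are projected with a camera matrix and image features are bilinearly sampled at the projected locations. The calibration matrix and feature shapes are made up purely for illustration.

# Hedged sketch: project 3D points to the image plane and sample 2D features there.
import torch
import torch.nn.functional as F

def sample_image_feats(points, img_feats, proj):
    # points: (N, 3); img_feats: (1, C, H, W); proj: (3, 4) camera projection matrix
    homo = torch.cat([points, torch.ones(points.size(0), 1)], dim=1)  # homogeneous coordinates
    uvw = homo @ proj.t()
    uv = uvw[:, :2] / uvw[:, 2:3].clamp(min=1e-6)                     # pixel coordinates
    h, w = img_feats.shape[-2:]
    grid = torch.stack([uv[:, 0] / (w - 1), uv[:, 1] / (h - 1)], dim=-1) * 2 - 1
    grid = grid.view(1, 1, -1, 2)                                     # normalized to [-1, 1] for grid_sample
    sampled = F.grid_sample(img_feats, grid, align_corners=True)      # bilinear sampling
    return sampled.view(img_feats.size(1), -1).t()                    # (N, C) per-point image features

pts = torch.rand(100, 3) * 10
feats = torch.randn(1, 64, 48, 160)
P = torch.tensor([[700., 0., 80., 0.], [0., 700., 24., 0.], [0., 0., 1., 0.]])
print(sample_image_feats(pts, feats, P).shape)  # torch.Size([100, 64])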

Coral reef limestone at different depositional depths and facies differs remarkably in its textural and mineralogical characteristics, owing to complex sedimentary diagenesis. To explore the effects of the pore structure and mineral composition associated with diagenetic variation on the mechanical behavior of the limestone, a series of quasi-static and dynamic compression tests along with microscopic examinations were performed on samples from shallow and deep burial depths. It is revealed that the shallow reef limestone (SRL) is classified as a porous aragonite-type carbonate rock...

10.1016/j.ijmst.2024.07.004 article EN cc-by-nc-nd International Journal of Mining Science and Technology 2024-07-01

As an emerging data modality with precise distance sensing, LiDAR point clouds have had great expectations placed on them for 3D scene understanding. However, point clouds are always sparsely distributed in 3D space and stored in an unstructured way, which makes it difficult to represent them for effective 3D object detection. To this end, in this work, we regard point clouds as hollow-3D data and propose a new architecture, namely Hallucinated Hollow-3D R-CNN (H2-3D R-CNN)...

10.1109/tcsvt.2021.3100848 article EN IEEE Transactions on Circuits and Systems for Video Technology 2021-07-28

Temporal language grounding (TLG) is a fundamental and challenging problem for vision understanding. Existing methods mainly focus on the fully supervised setting with temporal boundary labels for training, which, however, suffers from the expensive cost of annotation. In this work, we are dedicated to weakly supervised TLG, where multiple description sentences are given for an untrimmed video without temporal boundary labels. In this task, it is critical to learn a strong cross-modal semantic alignment between sentence semantics and visual content. To this end,...

10.1109/tmm.2021.3096087 article EN IEEE Transactions on Multimedia 2021-08-24
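
A toy sketch of cross-modal alignment scoring in this weakly supervised setting: cosine similarity between a sentence embedding and mean-pooled candidate segments selects the best-matching window. The encoders, proposal scheme, and pooling are placeholders, not the paper's model.

# Hedged sketch: score sliding-window video segments against a sentence embedding.
import torch
import torch.nn.functional as F

def best_segment(clip_feats, sent_feat, window=8):
    # clip_feats: (T, D) per-clip features; sent_feat: (D,) sentence embedding
    scores = []
    for s in range(0, clip_feats.size(0) - window + 1):
        seg = clip_feats[s:s + window].mean(dim=0)                    # mean-pooled segment representation
        scores.append(F.cosine_similarity(seg, sent_feat, dim=0))
    scores = torch.stack(scores)
    start = int(scores.argmax())
    return start, start + window, scores                              # predicted temporal boundaries

s, e, scores = best_segment(torch.randn(64, 512), torch.randn(512))
print(s, e, scores.shape)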

In this work, we explore neat yet effective Transformer-based frameworks for visual grounding. The previous methods generally address the core problem of grounding, i.e., multi-modal fusion and reasoning, with manually-designed mechanisms. Such heuristic designs are not only complicated but also make models easily overfit to specific data distributions. To avoid this, we first propose TransVG, which establishes multi-modal correspondences by Transformers and localizes referred regions by directly regressing box...

10.1109/tpami.2023.3296823 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-07-19

Single shot detectors that are potentially faster and simpler than two-stage detectors tend to be more applicable to object detection in videos. Nevertheless, the extension of such detectors from image to video is not trivial, especially when appearance deterioration exists in videos, e.g., motion blur or occlusion. A valid question is how to explore temporal coherence across frames for boosting detection. In this paper, we propose to address the problem by enhancing per-frame features through aggregation of neighboring frames....

10.1109/tmm.2020.2990070 article EN IEEE Transactions on Multimedia 2020-04-23
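
A minimal sketch of the per-frame enhancement described above: feature maps of neighboring frames are aggregated into the current frame with similarity-derived weights. The per-pixel cosine weighting below is an illustrative assumption, not the paper's aggregation scheme.

# Hedged sketch: aggregate neighboring-frame feature maps with per-pixel similarity weights.
import torch
import torch.nn.functional as F

def aggregate(cur, neighbors):
    # cur: (C, H, W); neighbors: (K, C, H, W) feature maps of nearby frames
    stack = torch.cat([cur.unsqueeze(0), neighbors], dim=0)           # include the current frame itself
    sims = F.cosine_similarity(stack, cur.unsqueeze(0), dim=1)        # (K+1, H, W) similarity to current frame
    weights = F.softmax(sims, dim=0).unsqueeze(1)                     # normalize weights over frames
    return (weights * stack).sum(dim=0)                               # enhanced per-frame feature

out = aggregate(torch.randn(256, 20, 20), torch.randn(4, 256, 20, 20))
print(out.shape)  # torch.Size([256, 20, 20])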

In this work, we propose a new framework, called Document Image Transformer (DocTr), to address the issue of geometry and illumination distortion of document images. Specifically, DocTr consists of a geometric unwarping transformer and an illumination correction transformer. By setting a set of learned query embeddings, the geometric unwarping transformer captures the global context of the document image by the self-attention mechanism and decodes the pixel-wise displacement solution to correct the distortion. After unwarping, our illumination correction transformer further removes shading artifacts to improve the visual quality and OCR...

10.1145/3474085.3475388 article EN Proceedings of the 30th ACM International Conference on Multimedia 2021-10-17
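
A short sketch of the unwarping step the abstract describes: a dense pixel-wise displacement field is applied to the distorted image via grid sampling. The displacement here is random noise purely to show the mechanics, not a predicted field.

# Hedged sketch: warp a distorted document image with a dense displacement field.
import torch
import torch.nn.functional as F

def unwarp(image, displacement):
    # image: (1, 3, H, W); displacement: (1, H, W, 2) offsets in normalized [-1, 1] coordinates
    _, _, h, w = image.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    base_grid = torch.stack([xs, ys], dim=-1).unsqueeze(0)            # identity sampling grid
    return F.grid_sample(image, base_grid + displacement, align_corners=True)

img = torch.rand(1, 3, 128, 96)
disp = torch.randn(1, 128, 96, 2) * 0.01                              # stand-in for the predicted displacement
print(unwarp(img, disp).shape)  # torch.Size([1, 3, 128, 96])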

In pixel-based reinforcement learning (RL), the states are raw video frames, which are mapped into a hidden representation before being fed to a policy network. To improve the sample efficiency of state representation learning, the most prominent recent work is based on contrastive unsupervised representation learning. Witnessing that consecutive frames in a game are highly correlated, to further improve data efficiency we propose a new algorithm, i.e., masked contrastive representation learning for RL (M-CURL), which takes the correlation among consecutive inputs into consideration. In our architecture,...

10.1109/tpami.2022.3176413 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-01-01
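
A condensed sketch of the masked-contrastive idea the abstract outlines: some frames in a stacked observation are masked, and an InfoNCE-style loss pulls the masked view toward the clean encoding. The toy encoder, masking ratio, and loss details are assumptions, not the paper's architecture.

# Hedged sketch: mask frames in an observation stack and contrast against the clean encoding.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Flatten(), nn.Linear(4 * 84 * 84, 256))    # toy frame-stack encoder

def masked_contrastive_loss(obs, mask_prob=0.5, temperature=0.1):
    # obs: (B, 4, 84, 84) stacks of consecutive frames
    mask = (torch.rand(obs.size(0), obs.size(1), 1, 1) > mask_prob).float()
    z_masked = F.normalize(encoder(obs * mask), dim=1)                # masked view
    z_clean = F.normalize(encoder(obs), dim=1)                        # clean view (contrastive target)
    logits = z_masked @ z_clean.t() / temperature                     # all-pairs similarities
    labels = torch.arange(obs.size(0))                                # positives on the diagonal
    return F.cross_entropy(logits, labels)

print(masked_contrastive_loss(torch.rand(8, 4, 84, 84)).item())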

Recent progress on weakly supervised object detection (WSOD) is characterized by formulating WSOD as a Multiple Instance Learning (MIL) problem and taking online refinement with the selected region proposals from MIL. However, MIL inclines to select the most discriminative part rather than the entire instance as the top-scoring proposals, which leads to weak localization capability for the detectors. We attribute this to the limited intra-class diversity within a single image. Specifically, due to the lack of annotated bounding...

10.1609/aaai.v35i4.16429 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18
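
A brief sketch of the MIL formulation mentioned above, in the spirit of a WSDDN-style weakly supervised baseline: per-proposal class scores and proposal weights are multiplied and summed into image-level predictions trained against image labels only. This is a generic baseline for illustration, not the paper's online refinement scheme.

# Hedged sketch: image-level MIL loss over region-proposal scores (WSDDN-style baseline).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MILHead(nn.Module):
    def __init__(self, feat_dim=512, num_classes=20):
        super().__init__()
        self.cls = nn.Linear(feat_dim, num_classes)    # what class each proposal looks like
        self.det = nn.Linear(feat_dim, num_classes)    # how much each proposal matters per class

    def forward(self, proposal_feats):
        # proposal_feats: (R, D) features of the selected region proposals
        cls = F.softmax(self.cls(proposal_feats), dim=1)
        det = F.softmax(self.det(proposal_feats), dim=0)
        return (cls * det).sum(dim=0).clamp(1e-6, 1 - 1e-6)           # (num_classes,) image-level scores

head = MILHead()
image_labels = torch.zeros(20); image_labels[[3, 7]] = 1.0             # image-level labels only
loss = F.binary_cross_entropy(head(torch.randn(300, 512)), image_labels)
print(loss.item())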

LiDAR and Radar are two complementary sensing approaches in that LiDAR specializes in capturing an object's 3D shape while Radar provides longer detection ranges as well as velocity hints. Though seemingly natural, how to efficiently combine them for improved feature representation is still unclear. The main challenge arises from the fact that Radar data are extremely sparse and lack height information. Therefore, directly integrating Radar features into LiDAR-centric detection networks is not optimal. In this work, we introduce a bi-directional...

10.1109/cvpr52729.2023.01287 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

A recent trend is to combine multiple sensors (i.e., cameras, LiDARs and millimeter-wave Radars) to achieve robust multi-modal perception for autonomous systems such as self-driving vehicles. Although quite a few sensor fusion algorithms have been proposed, some of which are top-ranked on various leaderboards, a systematic study of how to integrate these three types of sensors to develop effective 3D object...

10.1109/lra.2022.3193465 article EN IEEE Robotics and Automation Letters 2022-07-25

Atom-substituting doping by atmospheric-pressure chemical vapor deposition (AP-CVD) is an effective and promising strategy for changing the properties of two-dimensional transition-metal dichalcogenides (2D TMDs). In this paper, we successfully grew V-doped MoSe2 films. The photoluminescence (PL) spectra gradually red-shifted with the increase of doping concentration, the X-ray photoelectron spectroscopy (XPS) peaks after doping shifted toward a lower binding energy, and the change of polarity before and after doping can be seen in the transfer...

10.1021/acs.jpcc.3c06829 article EN The Journal of Physical Chemistry C 2024-01-11

Two-dimensional (2D) WSe2 has received increasing attention due to its unique optical properties and bipolar behavior. Several WSe2-based heterojunctions exhibit bidirectional rectification characteristics, but most devices have a lower rectification ratio. In this work, the Bi2O2Se/WSe2 heterojunction prepared by us has a type II band alignment, which can vastly suppress the channel current through the interface barrier, so that the device has a large rectification ratio of about 10^5. Meanwhile, under different gate voltage modulation,...

10.1088/1674-4926/45/1/012701 article EN Journal of Semiconductors 2024-01-01

Current 3D Large Multimodal Models (3D LMMs) have shown tremendous potential in 3D-vision-based dialogue and reasoning. However, how to further enhance 3D LMMs to achieve fine-grained scene understanding and facilitate flexible human-agent interaction remains a challenging problem. In this work, we introduce 3D-LLaVA, a simple yet highly powerful 3D LMM designed to act as an intelligent assistant in comprehending, reasoning, and interacting with the 3D world. Unlike existing top-performing methods that rely on...

10.48550/arxiv.2501.01163 preprint EN arXiv (Cornell University) 2025-01-02