NFDI4DS | UHH-SEMS - Publication Details

Qimeng Wang

ORCID: 0000-0002-9715-836X

About

Contact & Profiles

A5083340374

Research Areas

Advanced Neural Network Applications
Multimodal Machine Learning Applications
Advanced Image and Video Retrieval Techniques
Domain Adaptation and Few-Shot Learning
Natural Language Processing Techniques
Industrial Vision Systems and Defect Detection
Topic Modeling
Dental Radiography and Imaging
Video Analysis and Summarization
Dental Research and COVID-19
Intelligent Tutoring Systems and Adaptive Learning
COVID-19 diagnosis using AI
AI in cancer detection
Image and Object Detection Techniques
Human Pose and Action Recognition
Non-Destructive Testing Techniques
Integrated Circuits and Semiconductor Failure Analysis
Virtual Reality Applications and Impacts
VLSI and Analog Circuit Testing
Visual Attention and Saliency Detection
Advanced Optical Imaging Technologies
Anomaly Detection Techniques and Applications
Image and Video Quality Assessment
Engineering and Test Systems
Advanced Data Compression Techniques

Jiangnan University
2024-2025

Huazhong University of Science and Technology
2020-2022

University of Science and Technology Beijing
2021

Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection

OPENALEX - Publications

Yongchao Xu, Mingtao Fu, Qimeng Wang, Yukang Wang, Kai Chen, and 2 more

Object detection has recently experienced substantial progress. Yet, the widely adopted horizontal bounding box representation is not appropriate for ubiquitous oriented objects such as those in aerial images and scene texts. In this paper, we propose a simple yet effective framework to detect multi-oriented objects. Instead of directly regressing the four vertices, we glide the vertex of the horizontal bounding box on each corresponding side to accurately describe a multi-oriented object. Specifically, we regress length ratios characterizing the relative...

10.1109/tpami.2020.2974745 article EN publisher-specific-oa IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-02-18

End-to-End Temporal Action Detection With Transformer

OPENALEX - Publications

Xiaolong Liu, Qimeng Wang, Yao Hu, Xu Tang, Shiwei Zhang, and 2 more

Temporal action detection (TAD) aims to determine the semantic label and the temporal interval of every action instance in an untrimmed video. It is a fundamental and challenging task in video understanding. Previous methods tackle this task with complicated pipelines. They often need to train multiple networks and involve hand-designed operations, such as non-maximal suppression and anchor generation, which limit the flexibility and prevent end-to-end learning. In this paper, we propose an end-to-end Transformer-based method for TAD, termed TadTR....

10.1109/tip.2022.3195321 article EN IEEE Transactions on Image Processing 2022-01-01

TOMGPT: Reliable Text-Only Training Approach for Cost-Effective Multi-modal Large Language Model

OPENALEX - Publications

Yunkai Chen, Qimeng Wang, Shiwei Wu, Yan Gao, Tong Xu, and 1 more

Multi-modal large language models (MLLMs), such as GPT-4, exhibit great comprehension capabilities on human instruction, as well as zero-shot ability on new downstream multi-modal tasks. To integrate the different modalities within a unified embedding space, previous MLLMs attempted to conduct visual instruction tuning with massive and high-quality image-text pair data, which requires substantial costs in data collection and training resources. In this article, we propose TOMGPT (Text-Only GPT),...

10.1145/3654674 article EN ACM Transactions on Knowledge Discovery from Data 2024-03-28

An Improved Feature Enhancement CenterNet Model for Small Object Defect Detection on Metal Surfaces

OPENALEX - Publications

Xingfei Zhu, Qimeng Wang, Bufan Zhang, Zhaofei Sun, Jinghu Yu, and 1 more

In defect detection on metal surfaces, there are many small defects with subtle features that are difficult to distinguish from the background environment using mainstream object detection methods. To alleviate this issue, this study proposes an improved CenterNet model for enhancing the features of small defects, namely MSDD. In this work, we utilize an attention mechanism to reconstruct the basic feature extraction module in the network, aiming to enhance the focus on features related to defects. Additionally, we redesign an efficient deconvolution module to extract multi‐scale...

10.1002/adts.202301230 article EN Advanced Theory and Simulations 2024-06-11

Optimized Yolov8 feature fusion algorithm for dental disease detection

OPENALEX - Publications

Qimeng Wang, Xingfei Zhu, Zhaofei Sun, Bufan Zhang, Jinghu Yu, and 1 more

10.1016/j.compbiomed.2025.109778 article EN Computers in Biology and Medicine 2025-02-08

Real-time rendering super multiview display through a universal rendering engine

OPENALEX - Publications

Zong Qin, Jiaqi Dong, Yunfan Cheng, Yifan Ding, Qimeng Wang, and 4 more

10.1117/12.3039612 article EN 2025-03-19

Decoupled IoU Regression for Object Detection

OPENALEX - Publications

Yan Gao, Qimeng Wang, Xu Tang, Haochen Wang, Fei Ding, and 2 more

Non-maximum suppression (NMS) is widely used in object detection pipelines for removing duplicated bounding boxes. The inconsistency between the confidence used for NMS and the real localization confidence seriously affects detection performance. Prior works propose to predict the Intersection-over-Union (IoU) between bounding boxes and their corresponding ground-truths to improve NMS, while accurately predicting IoU is still a challenging problem. We argue that the complex definition of IoU and feature misalignment make it difficult to predict IoU accurately. In this paper, we propose a novel...

10.1145/3474085.3475707 article EN Proceedings of the 30th ACM International Conference on Multimedia 2021-10-17
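
For readers unfamiliar with the two terms in the abstract above, the following is a minimal sketch of standard IoU and greedy NMS for axis-aligned boxes. It is not the method proposed in the paper, which is truncated here. The names `Box`, `iou`, and `nms`, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-Union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only
    if it does not overlap an already kept box above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    # The second box overlaps the first heavily and is suppressed.
    print(nms(boxes, scores))
```

In this sketch the ranking uses whatever `scores` the caller supplies. When those are classification confidences, a well-localized box can be suppressed by a poorly localized one with a higher score, which is the inconsistency the abstract refers to.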

VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation

OPENALEX - Publications

Shiwei Wu, Joya Chen, Kevin Qinghong Lin, Qimeng Wang, Yan Gao, and 5 more

A well-known dilemma in large vision-language models (e.g., GPT-4, LLaVA) is that while increasing the number of vision tokens generally enhances visual understanding, it also significantly raises memory and computational costs, especially in long-term, dense video frame streaming scenarios. Although learnable approaches like Q-Former and Perceiver Resampler have been developed to reduce the vision token burden, they overlook the context causally modeled by LLMs (i.e., key-value cache), potentially leading to missed...

10.48550/arxiv.2408.16730 preprint EN arXiv (Cornell University) 2024-08-29

Benchmarking Large Language Models for Conversational Question Answering in Multi-instructional Documents

OPENALEX - Publications

Shiwei Wu, Chen Zhang, Yan Gao, Qimeng Wang, Tong Xu, and 2 more

Instructional documents are rich sources of knowledge for completing various tasks, yet their unique challenges in conversational question answering (CQA) have not been thoroughly explored. Existing benchmarks have primarily focused on basic factual question-answering from single narrative documents, making them inadequate for assessing a model's ability to comprehend complex real-world instructional documents and provide accurate step-by-step guidance in daily life. To bridge this gap, we present InsCoQA, a novel...

10.48550/arxiv.2410.00526 preprint EN arXiv (Cornell University) 2024-10-01

Metal sensor base defects detection using deep learning based YOLO network

OPENALEX - Publications

Bufan Zhang, Xingfei Zhu, Jinghu Yu, Zhaofei Sun, Qimeng Wang

10.1007/s11760-024-03685-1 article EN Signal Image and Video Processing 2024-12-03

Universal Semiconductor ATPG Solutions for ATE Platform under the Trend of AI and ADAS

OPENALEX - Publications

Qimeng Wang, Zhonghe Tian, He Xi, Ziteng Xu, Mingjie Tang, and 2 more

This article introduces a universal semiconductor Automatic Test Pattern Generation (ATPG) solution for the Automated Test Equipment (ATE) platform. With the increasing trend of Artificial Intelligence (AI) and Advanced Driving Assistance System (ADAS), communication between devices requires advanced protocols such as Mobile Industry Processor Interface (MIPI) and Point-to-point (P2P) protocols. A designer-based solution is developed to provide a one-click software approach to create test vectors for common and customized protocols. As...

10.1109/cstic52283.2021.9461259 article EN 2022 China Semiconductor Technology International Conference (CSTIC) 2021-03-14

MVP-SEG: Multi-View Prompt Learning for Open-Vocabulary Semantic Segmentation

OPENALEX - Publications

Jie Guo, Qimeng Wang, Yan Gao, Xiaolong Jiang, Xu Tang, and 2 more

CLIP (Contrastive Language-Image Pretraining) is well-developed for open-vocabulary zero-shot image-level recognition, while its applications in pixel-level tasks are less investigated, where most efforts directly adopt CLIP features without deliberative adaptations. In this work, we first demonstrate the necessity of image-pixel CLIP feature adaptation, then provide Multi-View Prompt learning (MVP-SEG) as an effective solution to achieve image-pixel adaptation and to solve open-vocabulary semantic segmentation. Concretely, MVP-SEG...

10.48550/arxiv.2304.06957 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Decoupled IoU Regression for Object Detection

OPENALEX - Publications

Yan Gao, Qimeng Wang, Xu Tang, Haochen Wang, Fei Ding, and 2 more

10.48550/arxiv.2202.00866 preprint EN other-oa arXiv (Cornell University) 2022-01-01
