- Face Recognition and Analysis
- Face and Expression Recognition
- Image Retrieval and Classification Techniques
- Aesthetic Perception and Analysis
- Domain Adaptation and Few-Shot Learning
- Vehicle License Plate Recognition
- Video Analysis and Summarization
- Generative Adversarial Networks and Image Synthesis
- Speech and Audio Processing
- 3D Surveying and Cultural Heritage
- Digital Humanities and Scholarship
- Handwritten Text Recognition Techniques
- Industrial Vision Systems and Defect Detection
- Advanced Optical Sensing Technologies
- Robotics and Sensor-Based Localization
- Image and Object Detection Techniques
- Advanced Image and Video Retrieval Techniques
- Adversarial Robustness in Machine Learning
- Advanced Manufacturing and Logistics Optimization
- Reinforcement Learning in Robotics
- 3D Shape Modeling and Analysis
- Multimodal Machine Learning Applications
- Video Surveillance and Tracking Methods
Southwest Jiaotong University
2024
Alibaba Group (China)
2021-2023
Xiamen University
2019-2021
Face parsing computes pixel-wise label maps for different semantic components (e.g., hair, mouth, eyes) from face images. Existing literature has illustrated significant advantages of focusing on individual regions of interest (RoIs) for faces and facial components. However, the traditional crop-and-resize mechanism ignores all contextual area outside the RoIs and is thus not suitable when a component's area is unpredictable, e.g., hair. Inspired by the physiological vision system of humans, we propose a novel RoI...
Advertising posters, a form of information presentation, combine visual and linguistic modalities. Creating a poster involves multiple steps and necessitates design experience and creativity. This paper introduces AutoPoster, a highly automatic and content-aware system for generating advertising posters. With only product images and titles as inputs, AutoPoster can automatically produce posters of varying sizes through four key stages: image cleaning and retargeting, layout generation, tagline generation, and style attribute...
With the rapid development of artificial intelligence and fifth-generation mobile network technologies, automatic instrument reading has become an increasingly important topic for intelligent sensors in smart cities. We propose a full pipeline to automatically read watermeters based on a single image, using deep learning methods to provide new technical support for water meter reading. To handle the various challenging environments where watermeters reside, our pipeline disentangles the task into individual subtasks based on the structures...
Text design is one of the most critical procedures in poster design, as it relies heavily on the creativity and expertise of humans to design text images considering visual harmony and text semantics. This study introduces TextPainter, a novel multimodal approach that leverages contextual visual information and the corresponding text semantics to generate text images. Specifically, TextPainter takes the global-local background image as a style hint and guides text image generation toward visual harmony. Furthermore, we leverage a language model and introduce a comprehension...
Recognizing 3D point clouds plays a pivotal role in many real-world applications. However, deployed point cloud deep learning models are vulnerable to adversarial attacks. Despite efforts to develop robust models through adversarial training, they may become less effective against emerging attacks. This limitation motivates the development of adversarial purification, which employs a generative model to mitigate the impact of adversarial attacks. In this work, we highlight the remaining challenges from two perspectives. First, the purification-based method requires retraining the classifier on purified...
Video conferencing is an essential means of contactless conversation that conveys abundant multimedia signals. Under COVID-19 in particular, the video conference has become a common form of daily communication. However, for the sake of disease prevention, people attending the conference are often wearing a mouth mask, which makes communication inconvenient because facial information is incomplete. To tackle this problem, we develop a novel system that reveals masked faces in real time, making each participant feel...
A facial caricature shows the distinct characteristics of a person via exaggerations of both shape and appearance. This paper presents a novel framework that automatically generates vivid caricatures by encoding personalized semantic information. To this end, we first design a part-based scheme for geometry warping, which composes local deformations into a global warping field, equipped with sufficient freedom for different components. Second, under the Part-based Warping, the photo-to-caricature translation...