- Advanced Image and Video Retrieval Techniques
- Visual Attention and Saliency Detection
- Advanced Neural Network Applications
- Image Enhancement Techniques
- Advanced Radiotherapy Techniques
- Robotic Locomotion and Control
- Video Analysis and Summarization
- Prosthetics and Rehabilitation Robotics
- Image Retrieval and Classification Techniques
- Smart Agriculture and AI
- Video Surveillance and Tracking Methods
- Winter Sports Injuries and Performance
- Medical Image Segmentation Techniques
- Radiomics and Machine Learning in Medical Imaging
- Human Motion and Animation
- Soil Mechanics and Vehicle Dynamics
- Advanced Vision and Imaging
- Color Perception and Design
- Computer Graphics and Visualization Techniques
- Advanced MRI Techniques and Applications
- Gaze Tracking and Assistive Technology
- Advanced Neuroimaging Techniques and Applications
- Advanced Image Processing Techniques
- Multimodal Machine Learning Applications
- Adversarial Robustness in Machine Learning
Beijing University of Posts and Telecommunications
2025
Zhejiang University of Technology
2024
ShanghaiTech University
2024
Guizhou University
2020-2024
Second Affiliated Hospital of Dalian Medical University
2024
Xidian University
2022-2023
Jiangsu University of Science and Technology
2023
Northeast Forestry University
2013-2023
Beijing Technology and Business University
2023
University of Kansas Medical Center
2023
Layout is fundamental to graphic designs. For visual attractiveness and efficient communication of messages and ideas, graphic design layouts often have great variation, driven by the contents to be presented. In this paper, we study the problem of content-aware layout generation. We propose a deep generative model for graphic design layouts that is able to synthesize layout designs based on the textual semantics of user inputs. Unlike previous approaches that are oblivious to the input contents and rely on heuristic criteria, our model captures the effect of contents on layouts, and implicitly learns complex...
Shadow detection is an important and challenging task for scene understanding. Despite promising results from recent deep-learning-based methods, existing works still struggle with ambiguous cases where the visual appearances of shadow and non-shadow regions are similar (referred to as distraction in our context). In this paper, we propose a Distraction-aware Shadow Detection Network (DSDNet) by explicitly integrating distraction semantics into an end-to-end framework. At the core of our framework is a novel standalone, differentiable...
Although huge progress has been made on scene analysis in recent years, most existing works assume the input images to be day-time images with good lighting conditions. In this work, we aim to address the night-time scene parsing (NTSP) problem, which has two main challenges: 1) labeled night-time data are scarce, and 2) over- and under-exposures may co-occur in night-time images and are not explicitly modeled in existing pipelines. To tackle the scarcity of night-time data, we collect a novel dataset, named NightCity, of 4,297 real night-time images with ground truth pixel-level semantic annotations. our...
The accuracy, speed, and robustness of object detection and recognition are directly related to the harvesting efficiency, quality, and speed of fruit and vegetable harvesting robots. In order to explore the development status of object detection and recognition techniques for fruit and vegetable harvesting robots based on digital image processing and traditional machine learning, this article summarizes and analyzes some representative methods. This article also demonstrates the current challenges and future potential developments. This work aims to provide a reference for future research on such techniques based on digital image processing and machine learning.
While question answering (QA) with neural networks, i.e., neural QA, has achieved promising results in recent years, the lack of a large-scale real-world QA dataset is still a challenge for developing and evaluating neural QA systems. To alleviate this problem, we propose a human-annotated real-world QA dataset, WebQA, with more than 42k questions and 556k evidences. As existing neural QA methods resolve QA either as sequence generation or as a classification/ranking problem, they face challenges of expensive softmax computation, unseen answers handling, or separate...
We present a novel approach that allows web designers to easily direct user attention via visual flow on web designs. By collecting and analyzing users' eye gaze data on real-world webpages under the task-driven condition, we build two models that characterize user attention patterns between a pair of page components. These models enable a design interaction for designers to create a visual flow to guide users' eyes (i.e., along a given path) through a web design with minimal effort. In particular, given an existing web design as well as a designer-specified path over a subset of page components, our...
In this paper, we propose a novel form of weak supervision for salient object detection (SOD) based on saliency bounding boxes, which are minimum rectangular boxes enclosing the salient objects. Based on this idea, we propose a novel weakly-supervised SOD method, by predicting pixel-level pseudo ground truth saliency maps from just saliency bounding boxes. Our method first takes advantage of unsupervised SOD methods to generate initial saliency maps and addresses the over/under prediction problems, to obtain the initial pseudo ground truth saliency maps. We then iteratively refine the initial pseudo ground truth by learning a multi-task map refinement...
Manga layout is a core component in manga production, characterized by its unique styles. However, stylistic manga layouts are difficult for novices to produce, as this requires hands-on experience and domain knowledge. In this paper, we propose an approach to automatically generate a stylistic manga layout from a set of input artworks with user-specified semantics, thus allowing less-experienced users to create high-quality manga layouts with minimal effort. We first introduce three parametric style models that encode the unique stylistic aspects of manga layouts, including...
Automatically extracting frames/panels from digital comic pages is crucial for techniques that facilitate comic reading on mobile devices with limited display areas. However, automatic panel extraction for manga, i.e., Japanese comics, can be especially challenging, largely because of its complex panel layout design mixed with various visual symbols throughout the page. In this paper, we propose a robust method for automatically extracting panels from digital manga pages. Our method first extracts the panel block by closing open panels and identifying the page...
Graphic designers often manipulate the overall look and feel of their designs to convey certain personalities (e.g., cute, mysterious, and romantic) to impress potential audiences and achieve business goals. However, understanding the factors that determine the personality of a design is challenging, as a graphic design is often a result of thousands of decisions on numerous factors, such as font, color, image, and layout. In this paper, we aim to answer the question of what characterizes the personality of a graphic design. To this end, we propose a deep learning framework for exploring...
Picture subjects and text balloons are basic elements in comics, working together to propel the story forward. Japanese comics artists often leverage a carefully designed composition of subjects and balloons (generally referred to as panel elements) to provide a continuous and fluid reading experience. However, such a composition is hard to produce for people without the required experience and knowledge. In this paper, we propose an approach for novices to synthesize a composition of panel elements that can effectively guide the reader's attention to convey the story. Our primary contribution...
In this paper, we present the development of a visual navigation capability for a small drone, enabling it to autonomously approach flowers. This is a very important step towards the development of a fully autonomous flower-pollinating nanodrone. The developed drone is totally autonomous and relies for its navigation on an on-board color camera, complemented with one simple ToF distance sensor, to detect the flower. The proposed solution uses a DJI Tello drone carrying a Maix Bit processing board capable of running all deep-learning-based image-processing algorithms on-board. We developed a two-stage...
Web designers often carefully select fonts to fit the context of a web design to make the design look aesthetically pleasing and effective in communication. However, selecting proper fonts for a web design is a tedious and time-consuming task, as each font has many properties, such as font face, color, and size, resulting in a very large search space. In this paper, we aim to model fonts in context, by studying a novel and challenging problem of predicting fonts that match a given web design. To this end, we propose a novel, multi-task deep neural network to jointly predict font face, color...
When adding a photo onto a graphic design, professional graphic designers often adjust its colors based on some target colors obtained from the brand or product to make the entire design more memorable to audiences and establish a consistent brand identity. However, adjusting the colors of a photo in the context of a graphic design is a difficult task, with two major challenges: (1) Locality: The color is often adjusted locally to preserve the semantics and atmosphere of the original image; (2) Naturalness: The modified region needs to be carefully chosen and recolored to obtain a semantically valid and visually...
Teleoperation systems find many applications, from earlier search-and-rescue to more recent daily tasks. It is widely acknowledged that using external sensors can decouple the view of the remote scene from the motion of the robot arm during manipulation, facilitating the control task. However, this design requires the coordination of multiple operators or may exhaust a single operator, as s/he needs to control both the manipulator and the external sensors. To address this challenge, our work introduces a viewpoint prediction model, the first data-driven approach...
We propose a method for animating still manga imagery through camera movements. Given a series of existing manga pages, we start by automatically extracting panels, comic characters, and balloons from the manga pages. Then, we use a data-driven graphical model to infer per-panel motion and emotion states from low-level visual patterns. Finally, by combining domain knowledge of film production and characteristics of manga, we simulate camera movements over the manga pages, yielding an animation. The results augment the still manga contents with animated motion that reveals the mood...
In this article, we propose a fully automatic system for generating comic books from videos without any human intervention. Given an input video along with its subtitles, our approach first extracts informative keyframes by analyzing the subtitles and stylizes the keyframes into comic-style images. Then, we propose a novel multi-page layout framework that can allocate the images across multiple pages and synthesize visually interesting layouts based on the rich semantics of the images (e.g., importance and inter-image relation). Finally, as...