- Multimodal Machine Learning Applications
- Video Analysis and Summarization
- Robotics and Sensor-Based Localization
- Human Pose and Action Recognition
- Advanced Image and Video Retrieval Techniques
- Remote-Sensing Image Classification
- Generative Adversarial Networks and Image Synthesis
- Advanced Vision and Imaging
- Software Engineering Research
- Medical Imaging Techniques and Applications
- Brain Tumor Detection and Classification
- Bone and Joint Diseases
- Topic Modeling
- Advanced MRI Techniques and Applications
- Optical measurement and interference techniques
- Radiomics and Machine Learning in Medical Imaging
- Infrastructure Maintenance and Monitoring
- Software Testing and Debugging Techniques
- Machine Learning in Healthcare
- BIM and Construction Integration
- Spine and Intervertebral Disc Pathology
- Business Process Modeling and Analysis
- Data Quality and Management
- MRI in cancer diagnosis
- Human Motion and Animation
Huazhong University of Science and Technology
2019-2025
Tongji Hospital
2019-2025
Chinese Academy of Medical Sciences & Peking Union Medical College
2025
State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing
2025
Wuhan University
2025
Technical University of Munich
2022-2025
Binzhou University
2025
Binzhou Medical University
2025
Beijing University of Technology
2024
Dalian Maritime University
2024
In this work, we propose a new task called Story Visualization. Given a multi-sentence paragraph, the story is visualized by generating a sequence of images, one for each sentence. In contrast to video generation, story visualization focuses less on the continuity in generated images (frames), but more on the global consistency across dynamic scenes and characters, a challenge that has not been addressed by any single-image or video generation methods. Therefore, we propose a new story-to-image-sequence generation model, StoryGAN, based on the sequential...
Event-specific concepts are the semantic concepts specifically designed for the events of interest, which can be used as a mid-level representation of complex events in videos. Existing methods only focus on defining event-specific concepts for a small number of pre-defined events, but cannot handle novel unseen events. This motivates us to build a large scale event-specific concept library that covers as many real-world events and their concepts as possible. Specifically, we choose WikiHow, an online forum containing a large number of how-to articles on human daily life events. We perform...
Generating videos from text has proven to be a significant challenge for existing generative models. We tackle this problem by training a conditional generative model to extract both static and dynamic information from text. This is manifested in a hybrid framework, employing a Variational Autoencoder (VAE) and a Generative Adversarial Network (GAN). The static features, called "gist," are used to sketch text-conditioned background color and object layout structure. Dynamic features are considered by transforming input text into an image filter....
Input constraints are useful for many software development tasks. For example, input constraints of a function enable the generation of valid inputs, i.e., inputs that follow these constraints, to test the function deeper. API functions of deep learning (DL) libraries have DL-specific input constraints, which are described informally in the free-form API documentation. Existing constraint extraction techniques are ineffective for extracting DL-specific input constraints. To fill this gap, we design and implement a new technique, DocTer, to analyze API documentation to extract DL-specific input constraints for DL API functions....
Most existing text-to-image synthesis tasks are static single-turn generation, based on pre-defined textual descriptions of images. To explore more practical and interactive real-life applications, we introduce a new task, Interactive Image Editing, where users can guide an agent to edit images via multi-turn textual commands on-the-fly. In each session, the agent takes a natural language description from the user as input and modifies the image generated in the previous turn to a new design, following the user description. The main...
Object pose estimation is crucial for robotic applications and augmented reality. Beyond instance-level 6D object pose estimation methods, estimating category-level pose and shape has become a promising trend. As such, a new research field needs to be supported by well-designed datasets. To provide a benchmark with high-quality ground truth annotations to the community, we introduce a multimodal dataset for category-level object pose estimation with photometrically challenging objects, termed PhoCaL. PhoCaL comprises 60 high-quality 3D models of household objects over 8...
Despite recent advances in medical image generation, existing methods struggle to produce anatomically plausible 3D structures. In synthetic brain magnetic resonance images (MRIs), characteristic fissures are often missing, and reconstructed cortical surfaces appear scattered rather than densely convoluted. To address this issue, we introduce Cor2Vox, the first diffusion model-based method that translates continuous cortical shape priors to synthetic brain MRIs. To achieve this, we leverage a Brownian bridge process which...
Learning-based methods to solve dense 3D vision problems typically train on 3D sensor data. The respectively used principle of measuring distances provides advantages and drawbacks. These are typically not compared nor discussed in the literature due to a lack of multi-modal datasets. Texture-less regions are problematic for structure from motion and stereo, reflective material poses issues for active sensing, and distances for translucent objects are intricate to measure with existing hardware. Training on inaccurate or corrupt data induces model...
The advantages of microstrip patch antennas include small size, adaptable surface, ease of fabrication, and compatibility with integrated circuit technology. Numerous experiments have been done over the past few decades to enhance the performance of this antenna, and both the military and commercial sectors have found many uses for it. This paper introduces a microstrip patch antenna with an operating frequency of 28 GHz for 5G mobile communication. This research designed and simulated a rectangular microstrip patch antenna of 3.494 mm × 5.3 mm × 0.003 mm. The proposed antenna resonates at 28...
Event-specific concepts are the semantic concepts designed for the events of interest, which can be used as a mid-level representation of complex events in videos. Existing methods only focus on defining event-specific concepts for a small number of predefined events, but cannot handle novel unseen events. This motivates us to build a large scale event-specific concept library that covers as many real-world events and their concepts as possible. Specifically, we choose WikiHow, an online forum containing a large number of how-to articles on human daily life events. We perform a coarse-to-fine event...