- Multimodal Machine Learning Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Image and Video Retrieval Techniques
- Persona Design and Applications
- Innovative Human-Technology Interaction
- Data Visualization and Analytics
- Handwritten Text Recognition Techniques
- Vehicle License Plate Recognition
- Advanced Neural Network Applications
- Video Analysis and Summarization
- Social Robot Interaction and HRI
- Image Retrieval and Classification Techniques
- Color perception and design
- Gaze Tracking and Assistive Technology
- Face recognition and analysis
- Metaheuristic Optimization Algorithms Research
- Cancer-related molecular mechanisms research
- Human Mobility and Location-Based Analysis
- Digital Media and Visual Art
- Advanced Graph Neural Networks
- Design Education and Practice
- Aesthetic Perception and Analysis
- Cultural Heritage Management and Preservation
- Industrial Vision Systems and Defect Detection
- Topic Modeling
University of Science and Technology Beijing
2016-2025
In this work, ultra-low dilution rate Inconel 625 coatings with a thickness of ~534.4 μm were prepared on Q245R steel by the high-speed laser cladding technique. The XRD and TEM results show that the coatings are mainly composed of a Nb- and Mo-enriched Laves phase with a hexagonal close-packed (HCP) structure and γ-Ni with a face-centered cubic (FCC) structure. SEM results show cellular crystals, columnar crystals and equiaxed dendritic crystals in the bottom, middle and top of the coatings, respectively. The wear resistance and corrosion resistance of the steel significantly...
Although existing deep learning image super-resolution (SR) methods achieve promising performance on benchmark datasets, they still suffer severe performance drops when the degradation of the low-resolution (LR) input is not covered in training. To address this problem, we propose an innovative unsupervised method, Learning Correction Filter via Degradation-Adaptive Regression for Blind Single Image Super-Resolution. Inspired by generalized sampling theory, our method aims to enhance the strength of off-the-shelf...
Text-based visual question answering (VQA) requires reading and understanding the text in an image to correctly answer a given question. However, most current methods simply add optical character recognition (OCR) tokens extracted from the image into the VQA model without considering the contextual information of the OCR tokens or mining the relationships between OCR tokens and scene objects. In this paper, we propose a novel text-centered method called RUArt (Reading, Understanding and Answering the Related Text) for text-based VQA. Taking an image and a question as input, RUArt first...
Temporal Knowledge Graph (TKG) reasoning often involves completing missing factual elements along the timeline. Although existing methods can learn good embeddings for each element in quadruples by integrating temporal information, they fail to infer the evolution of facts. This is mainly because they (1) insufficiently explore the internal structure and semantic relationships within individual quadruples and (2) inadequately learn a unified representation of the contextual correlations among different quadruples. To...
Learning a common latent embedding by aligning the latent spaces of cross-modal autoencoders is an effective strategy for Generalized Zero-Shot Classification (GZSC). However, due to the lack of fine-grained instance-wise annotations, it still easily suffers from the domain shift problem arising from the discrepancy between the visual representation of diversified images and the semantic representation of fixed attributes. In this paper, we propose an innovative autoencoder network that learns Aligned Cross-Modal Representations (dubbed ACMR) for GZSC....
Benefiting from attention mechanisms, query-based detectors have a strong model capacity. They predict classification and regression by utilizing their shared queries and features in the decoder. Inter-task biases cause multi-directional gradients that disturb each other and limit optimization. In this work, we introduce attention decoupling (AD) to explicitly align multi-task features. Specifically, AD consists of a Dense-to-Sparse Query Generator (DSQG) and Split Cross-Attention (SCA), enabling query...
With the development of Information and Communications Technology (ICT), the smart TV is gradually becoming universal and penetrating daily life. The smart TV has a wide range of user groups, and the elderly are an important group. The special physiological and psychological characteristics of the Chinese elderly highlight the usability problems of smart TV interactions for them. In this study, the main functions of smart TVs were selected and operated by the elderly. With the help of measurements, behaviour analyses and interviews during natural usage scenarios, we determined the issues and requirements...
Multi-scale object detection in natural scenes is still challenging. To enhance the multi-scale perception capability, some algorithms combine lower-level and higher-level information via feature fusion strategies. However, the inherent spatial properties among instances and the relations between foreground and background are ignored. In addition, the human-defined "center-based" regression quality evaluation strategy, predicting a high-to-low score based on a linear relationship with the distance to the center of...
Genetic programming (GP) has shown promising results in interpretable feature extraction, but few works have considered both classification accuracy and data visualization as objectives. Evaluating the extracted features based on a combination of measures can help to achieve the two objectives simultaneously. However, the use of improper combination methods will decrease accuracy. In this paper, a novel feature extraction method based on GP and non-overlap degree is proposed to extract features for high classification accuracy and visualization. And a function that...
This paper aims to respond to the questions in traditional cultural heritage protection and to explore new visual representation forms and natural human-computer interaction design methods to carry forward digital cultural heritage (DCH) projects. It draws on Chinese shadow puppet art to create a media artwork based on computer image-capturing technology, in order to actively reveal the relationship between the inner construction factors of the cultural context and the outside restriction of the Zeitgeist, and to achieve systematic preservation,...
Color transfer alters an image’s color composition by reference to the color characteristics of another image. In this paper, we build a system called artistic coloring that realizes automatic color transfer from famous paintings. It extracts color themes from paintings and applies them to color transfer. Specifically, we investigate traditional color theme extraction methods and find their deficiencies. Based on this, we quantify how humans process a painting's main colors in the palette and propose a balanced color theme extraction algorithm aimed specially at paintings. In experiments,...
In-depth analyses of the anti-oxidation behavior and structure of γ-TiAl alloys are of great significance for their maintenance and repair in engineering applications. In this work, fluorine-treated Ti-45Al-8.5Nb oxidized specimens with artificial defects were prepared by isothermal oxidation treatment at 1000 °C. Several characterization methods, including SEM, EDS, XRD and TEM, were used to evaluate the surface microstructure of the defects. The results indicate that fluorine promoted the formation of an outer protective film...
Text-based Visual Question Answering (Text-VQA) is a question-answering task that requires understanding scene text, where the text is usually recognized by Optical Character Recognition (OCR) systems. However, the text from OCR systems often includes spelling errors, such as "pepsi" being recognized as "peosi". These OCR errors are one of the major challenges for Text-VQA systems. To address this, we propose a novel Text-VQA method to alleviate OCR errors via OCR token evolution. First, we artificially create misspelled OCR tokens at training time, and make the system more robust...
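The first step of the abstract above, creating misspelled OCR tokens at training time, could be sketched roughly as follows. This is only an illustrative sketch: the look-alike table `CONFUSABLE`, the function names `misspell` and `augment_tokens`, and the substitution probability `prob` are assumptions made here, not details taken from the paper, whose actual procedure is not given in the truncated abstract.

```python
import random

# Hypothetical table of visually confusable characters (an assumption for
# illustration; the source does not say how misspellings are generated).
CONFUSABLE = {
    "p": "o",
    "o": "0",
    "l": "1",
    "i": "l",
    "s": "5",
    "e": "c",
}


def misspell(token, prob=0.2, rng=None):
    """Return a copy of `token` with some characters swapped for look-alikes."""
    if rng is None:
        rng = random.Random()
    chars = []
    for ch in token:
        # Substitute only characters that have a look-alike, with probability `prob`.
        if ch in CONFUSABLE and rng.random() < prob:
            chars.append(CONFUSABLE[ch])
        else:
            chars.append(ch)
    return "".join(chars)


def augment_tokens(tokens, prob=0.2, seed=None):
    """Apply `misspell` to every OCR token of one training example."""
    rng = random.Random(seed)
    return [misspell(t, prob, rng) for t in tokens]
```

For example, `augment_tokens(["pepsi", "cola"], prob=0.3, seed=0)` returns a list of the same length in which some characters may have been replaced; passing a fixed `seed` makes the corruption reproducible across runs.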