- Software Reliability and Analysis Research
- Software Engineering Research
- Software System Performance and Reliability
- Software Testing and Debugging Techniques
- Semantic Web and Ontologies
- Advanced Vision and Imaging
- Image Processing Techniques and Applications
- Anomaly Detection Techniques and Applications
- Network Security and Intrusion Detection
- Video Surveillance and Tracking Methods
- Flood Risk Assessment and Management
- Recommender Systems and Techniques
- Complex Network Analysis Techniques
- Data Quality and Management
- Machine Learning and Data Classification
- Human Mobility and Location-Based Analysis
- Image and Object Detection Techniques
- Advanced Image Processing Techniques
- Industrial Vision Systems and Defect Detection
- Digital and Traditional Archives Management
- Air Quality and Health Impacts
- Bioinformatics and Genomic Networks
- Vehicle License Plate Recognition
- Visual Attention and Saliency Detection
- Computational and Text Analysis Methods
Nanchang Hangkong University
2009-2024
University of Michigan
2023-2024
Yale University
2024
Nanjing University of Aeronautics and Astronautics
2022
Video-to-audio (V2A) generation utilizes visual-only video features to produce realistic sounds that correspond to the scene. However, current V2A models often lack fine-grained control over the generated audio, especially in terms of loudness variation and the incorporation of multi-modal conditions. To overcome these limitations, we introduce Tri-Ergon, a diffusion-based model that incorporates textual, auditory, and pixel-level visual prompts to enable detailed and semantically rich audio synthesis. Additionally,...
Detection-based methods have been viewed unfavorably in crowd analysis due to their poor performance in dense crowds. However, we argue that the potential of these methods has been underestimated, as they offer crucial information for crowd analysis that is often ignored. Specifically, the area size and confidence score of output proposals and bounding boxes provide insight into the scale and density of the crowd. To leverage these underutilized features, we propose Crowd Hat, a plug-and-play module that can be easily integrated with existing detection models. This...
There exist thousands of water bodies in watersheds, including large-scale bodies, such as reservoirs, and small-scale bodies, such as lakes and ponds. In basin flood forecasting and other hydrology-related tasks, water bodies play an important role in the flooding process. Efficiently segmenting water bodies from remote sensing images (RSIs) is still a popular research topic in the fields of computer science and remote sensing. We propose a model based on Mask R-CNN to automatically detect and segment water bodies from RSIs, thereby avoiding the complex operations of manual...
The detection of program vulnerabilities remains a challenging task in software security. Existing vulnerability detection methods rarely consider the multidimensional feature space and the complementarity of graph structures, and so easily overlook contextual environment features and syntax structure features. This disadvantage leads to insufficient performance in capturing complex structural features and hinders the improvement of detection accuracy. To address this issue, this paper introduces a novel method, EnGS2F, which adopts representation...
Three-dimensional (3D) reconstruction from a single image is an ill-posed problem with inherent ambiguities, i.e., scale. Predicting a 3D scene from text description(s) is similarly ill-posed, i.e., in the spatial arrangements of the objects described. We investigate the question of whether two inherently ambiguous modalities can be used in conjunction to produce metric-scaled reconstructions. To test this, we focus on monocular depth estimation, predicting a dense depth map from a single image, but with an additional text caption describing the scene. To this...
In the blind single image super-resolution (SISR) task, existing works have been successful in restoring image-level unknown degradations. However, when a video frame becomes the input, these works usually fail to address degradations caused by video compression, such as mosquito noise, ringing, blockiness, and staircase noise. In this work, we present, for the first time, a compression-based degradation model to synthesize low-resolution data for the SISR task. Our proposed synthesizing method is widely applicable across datasets, so...
This paper investigates the problem of current human-object interaction (HOI) detection methods and introduces DiffHOI, a novel scheme grounded on a pre-trained text-image diffusion model, which enhances the detector's performance via improved data diversity and representation. We demonstrate that the internal representation space of a frozen text-to-image diffusion model is highly relevant to verb concepts and their corresponding context. Accordingly, we propose an adapter-style tuning method to extract various semantically associated representations from CLIP to enhance...
Currently, software defect-prediction technology is being extensively researched in the design of metrics. However, the research objects are mainly limited to coarse-grained entities such as classes, files, and packages, and there is a wide range of defects that are difficult to predict in actual situations. To further explore the information between sequences of method calls and to learn the code semantics and syntactic structure between methods, we generated a method-call sequence that retains context information and a token sequence representing semantic information. We...
Software defect prediction models help testers find program modules that have a high probability of having defects. A method-calling network can express the dependencies between methods in a program. Existing approaches do not sufficiently utilize the method-calling network to characterize the structural features of methods. To address this problem, this study proposes, for the first time, obtaining the characteristics of methods by analyzing the method-calling network, and presents a new defect prediction approach at the method level. Specifically, the method-calling network was constructed and its metrics were obtained. Next,...
Software defect prediction models are of great importance in software testing; however, they also face the problem of model uninterpretability. Association rules have good accuracy and interpretability and are widely used in interpretable rule mining scenarios, but there are some common problems with current research: 1) data imbalance seriously affects the mined rules; 2) most studies treat features as equally important and ignore the feature contribution degree; 3) classification by default easily reduces...
Museums around the world have built databases with metadata about millions of objects, their history, the people who created them, and the entities they represent. This data is stored in proprietary databases and is not readily available for use. Recently, museums embraced the Semantic Web as a means to make this data available to the world, but the experience so far shows that publishing museum data to the linked data cloud is difficult: the databases are large and complex, the information is richly structured and varies from museum to museum, and it is difficult to link the data to other datasets. This paper describes the process...
Personalized exercise recommendation is an important research project in the field of online learning, which can explore students’ strengths and weaknesses and tailor exercises for them. However, programming exercise recommendation differs from that for other disciplines or exercise types due to the comprehensive specificity of program debugging. In order to assist students in learning programming, this paper proposes a programming exercise recommendation algorithm based on a knowledge structure tree (KSTER). Firstly, the algorithm provides a calculation method for quantifying students’ cognitive level to obtain their...
Due to extensive research on complex networks, fractal analysis with scale invariance is applied to measure the topological structure and self-similarity of complex networks. Fractal dimension can be used to quantify the fractal properties of complex networks. However, in the existing box covering algorithms, accurately calculating the fractal dimension of complex networks is still an NP-hard problem. Therefore, in this paper, an improved overlapping box covering algorithm is proposed to explore a more accurate and effective method to calculate the fractal dimension of complex networks. Moreover, in order to verify the effectiveness of the algorithm, six compared...
We propose a method for metric-scale monocular depth estimation. Inferring depth from a single image is an ill-posed problem due to the loss of scale from perspective projection during the image formation process. Any scale chosen is a bias, typically stemming from training on a dataset; hence, existing works have instead opted to use relative (normalized, inverse) depth. Our goal is to recover metric-scaled depth maps through a linear transformation. The crux of our method lies in the observation that certain objects (e.g., cars, trees, street signs) are found...
This paper explores the potential of leveraging language priors learned by text-to-image diffusion models to address ambiguity and visual nuisance in monocular depth estimation. Particularly, traditional monocular depth estimation suffers from inherent ambiguity due to the absence of stereo or multi-view depth cues, and from visual nuisance due to the lack of robustness of vision. We argue that the language prior can enhance monocular depth estimation by leveraging a geometric prior aligned with the language description, which is learned during pre-training. To generate images that reflect the text properly, the model must comprehend the size and shape of the specified objects,...
Software testing is one of the most important means to guarantee software quality and reliability. Meanwhile, improving the automation level of testing is also very important to ensure development quality and decrease cost. DO-178B provides different criteria of structure coverage for different levels of software. This paper presents a test data automatic generation method based on a genetic algorithm. This approach builds a decision tree from the truth table to extract the minimum test set according to the modified condition/decision coverage criteria, and converts the test case generation problem into another...