NFDI4DS | UHH-SEMS - Publication Details

WorDepth: Variational Language Prior for Monocular Depth Estimation

OPENALEX - Publications

Ziyao Zeng Daniel Wang Fengyu Yang Hyoungseob Park Stefano Soatto and 2 more

10.1109/cvpr52733.2024.00927 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Tri-Ergon: Fine-Grained Video-to-Audio Generation with Multi-Modal Conditions and LUFS Control

Bingliang Li Fengyu Yang Yuxin Mao Qingwen Ye Hongkai Chen and 1 more

Video-to-audio (V2A) generation utilizes visual-only video features to produce realistic sounds that correspond to the scene. However, current V2A models often lack fine-grained control over the generated audio, especially in terms of loudness variation and the incorporation of multi-modal conditions. To overcome these limitations, we introduce Tri-Ergon, a diffusion-based V2A model that incorporates textual, auditory, and pixel-level visual prompts to enable detailed and semantically rich audio synthesis. Additionally,...

10.1609/aaai.v39i5.32487 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Boosting Detection in Crowd Analysis via Underutilized Output Features

Shaokai Wu Fengyu Yang

Detection-based methods have been viewed unfavorably in crowd analysis due to their poor performance in dense crowds. However, we argue that the potential of these methods has been underestimated, as they offer crucial information for crowd analysis that is often ignored. Specifically, the area size and confidence score of output proposals and bounding boxes provide insight into the scale and density of the crowd. To leverage these underutilized features, we propose Crowd Hat, a plug-and-play module that can be easily integrated with existing detection models. This...

10.1109/cvpr52729.2023.01498 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Applied method for water-body segmentation based on mask R-CNN

Fengyu Yang Tao Feng Ganyang Xu Ying Chen

There exist thousands of water bodies in watersheds, including large-scale water bodies, such as reservoirs, and small-scale water bodies, such as lakes, ponds, etc. In basin flood forecasting and other hydrology-related tasks, water bodies play an important role in the flooding process. The method of efficiently segmenting water bodies from remote sensing images (RSIs) is still a popular research topic in the fields of computer science and remote sensing. We propose a model based on mask R-CNN to automatically detect and segment water bodies from RSIs, thereby avoiding the complex operations of manual...

10.1117/1.jrs.14.014502 article EN Journal of Applied Remote Sensing 2020-01-09

MSF-YOLO: A multi-scale features fusion-based method for small object detection

Fengyu Yang Jiaqi Zhou Yuan Chen Jie Liao Mingxiang Yang

10.1007/s11042-023-17818-0 article EN Multimedia Tools and Applications 2024-01-06

Vulnerability Detection Based on Enhanced Graph Representation Learning

Peng Xiao Qibin Xiao Xusheng Zhang Yumei Wu Fengyu Yang

The detection of program vulnerabilities remains a challenging task in software security. The existing vulnerability detection methods rarely consider the multidimensional feature space and the complementarity of graph structures, which easily overlooks the contextual environment features and syntax structure features. This disadvantage leads to insufficient performance in capturing complex structural features, which hinders the improvement of detection accuracy. To address this issue, this paper introduces a novel vulnerability detection method, EnGS2F, which adopts graph representation...

10.1109/tifs.2024.3392536 article EN IEEE Transactions on Information Forensics and Security 2024-01-01

WorDepth: Variational Language Prior for Monocular Depth Estimation

Ziyao Zeng Daniel Wang Fengyu Yang Hyoungseob Park Yangchao Wu and 4 more

Three-dimensional (3D) reconstruction from a single image is an ill-posed problem with inherent ambiguities, i.e. scale. Predicting a 3D scene from text description(s) is similarly ill-posed, i.e. spatial arrangements of objects described. We investigate the question of whether two inherently ambiguous modalities can be used in conjunction to produce metric-scaled reconstructions. To test this, we focus on monocular depth estimation, the problem of predicting a dense depth map from a single image, but with an additional text caption describing the scene. To this...

10.48550/arxiv.2404.03635 preprint EN arXiv (Cornell University) 2024-04-04

VCISR: Blind Single Image Super-Resolution with Video Compression Synthetic Data

Boyang Wang Bowen Liu Shiyu Liu Fengyu Yang

In the blind single image super-resolution (SISR) task, existing works have been successful in restoring image-level unknown degradations. However, when a single video frame becomes the input, these works usually fail to address degradations caused by video compression, such as mosquito noise, ringing, blockiness, and staircase noise. In this work, we present, for the first time, a video compression-based degradation model to synthesize low-resolution image data in the blind SISR task. Our proposed image synthesizing method is widely applicable to existing image datasets, so...

10.1109/wacv57701.2024.00425 article EN 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024-01-03

Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model

Jie Yang Bingliang Li Fengyu Yang Ailing Zeng Lei Zhang and 1 more

This paper investigates the problem of the current HOI detection methods and introduces DiffHOI, a novel HOI detection scheme grounded on a pre-trained text-image diffusion model, which enhances the detector's performance via improved data diversity and HOI representation. We demonstrate that the internal representation space of a frozen text-to-image diffusion model is highly relevant to verb concepts and their corresponding context. Accordingly, we propose an adapter-style tuning method to extract the various semantic associated representation from a frozen diffusion model and CLIP model to enhance...

10.48550/arxiv.2305.12252 preprint EN other-oa arXiv (Cornell University) 2023-01-01

FISRCN: a single small-sized image super-resolution convolutional neural network by using edge detection

Luoyi Kong Fengbin Wang Fengyu Yang Lu Leng Haotian Zhang

10.1007/s11042-023-15380-3 article EN Multimedia Tools and Applications 2023-07-28

Fine-Grained Software Defect Prediction Based on the Method-Call Sequence

Fengyu Yang Yaxuan Huang Haoming Xu Peng Xiao Wei Zheng

Currently, software defect-prediction technology is being extensively researched in the design of metrics. However, the research objects are mainly limited to coarse-grained entities such as classes, files, and packages, and there is a wide range of defects that are difficult to predict in actual situations. To further explore the information between sequences of method calls and to learn the code semantics and syntactic structure between methods, we generated a method-call sequence that retains the code context structure information and the token sequence representing semantic information. We...

10.1155/2022/4311548 article EN cc-by Computational Intelligence and Neuroscience 2022-08-03

A Method-Level Defect Prediction Approach Based on Structural Features of Method-Calling Network

Fengyu Yang Haoming Xu Peng Xiao Fa Zhong Guangdong Zeng

Software defect prediction models help testers find program modules that have a high probability of having defects. A method-calling network can express the dependencies between methods in a program. Existing approaches do not sufficiently utilize the method-calling network to characterize the structural features of methods. To address this problem, in this study, it is proposed for the first time that the structural characteristics of methods are obtained by analyzing the method-calling network, and a new defect prediction approach at the method-level is proposed. Specifically, the method-calling network was constructed and the network metrics were obtained. Next,...

10.1109/access.2023.3239266 article EN cc-by-nc-nd IEEE Access 2023-01-01

Interpretable Software Defect Prediction Incorporating Multiple Rules

Fengyu Yang Guangdong Zeng Fa Zhong Wei Zheng Peng Xiao

Software defect prediction models are of great importance in software testing; however, they also face the problem of model uninterpretability. Association rules have good accuracy and interpretability, being widely used in interpretable rule mining scenarios, but there are some common problems with current research: 1) Data unbalance seriously affects the accuracy of mined rules; 2) Most studies treat features as equally important and ignore the feature contribution degree; 3) Classification by default rules easily reduces...

10.1109/saner56733.2023.00114 article EN 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2023-03-01

LineFlowDP: A Deep Learning-Based Two-Phase Approach for Line-Level Defect Prediction

Fengyu Yang Fa Zhong Guangdong Zeng Peng Xiao Wei Zheng

10.1007/s10664-023-10439-z article EN Empirical Software Engineering 2024-02-23

CfExplainer: Explainable just-in-time defect prediction based on counterfactuals

Fengyu Yang Guangdong Zeng Fa Zhong Peng Xiao Wei Zheng and 1 more

10.1016/j.jss.2024.112182 article EN Journal of Systems and Software 2024-08-05

Publishing the Data of the Smithsonian American Art Museum to the Linked Data Cloud

Pedro Szekely Craig A. Knoblock Fengyu Yang Eleanor Fink Shubham Gupta and 2 more

Museums around the world have built databases with metadata about millions of objects, their history, the people who created them, and the entities they represent. This data is stored in proprietary databases and is not readily available for use. Recently, museums embraced the Semantic Web as a means to make this data available to the world, but the experience so far shows that publishing museum data to the linked data cloud is difficult: the databases are large and complex, the information is richly structured and varies from museum to museum, and it is difficult to link the data to other datasets. This paper describes the process...

10.3366/ijhac.2014.0104 article EN International Journal of Humanities and Arts Computing 2014-02-10

A personalized programming exercise recommendation algorithm based on knowledge structure tree

Wei Zheng Qing Du Yongjian Fan Lijuan Tan Chuanlin Xia and 1 more

Personalized exercise recommendation is an important research project in the field of online learning, which can explore students’ strengths and weaknesses and tailor exercises for them. However, programming exercise recommendation differs from other disciplines or types of exercise recommendation due to the comprehensive specificity of program debugging. In order to assist students in learning programming, this paper proposes a programming exercise recommendation algorithm based on knowledge structure tree (KSTER). Firstly, the algorithm provides a calculation method for quantifying students’ cognitive level to obtain their...

10.3233/jifs-211499 article EN Journal of Intelligent & Fuzzy Systems 2021-12-28

Fractal Analysis of Overlapping Box Covering Algorithm for Complex Networks

Wei Xing Zheng Qianjing You Fangli Liu Fengyu Yang Xin Fan

Due to extensive research on complex networks, fractal analysis with scale invariance is applied to measure the topological structure and self-similarity of complex networks. Fractal dimension can be used to quantify the fractal properties of complex networks. However, in the existing box covering algorithms, accurately calculating the fractal dimension of complex networks is still an NP-hard problem. Therefore, in this paper, an improved overlapping box covering algorithm is proposed to explore a more accurate and effective method to calculate the fractal dimension of complex networks. Moreover, in order to verify the effectiveness of the algorithm, six compared...

10.1109/access.2020.2981044 article EN cc-by IEEE Access 2020-01-01

RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions

Ziyao Zeng Yangchao Wu Hyoungseob Park Daniel Wang Fengyu Yang and 4 more

We propose a method for metric-scale monocular depth estimation. Inferring depth from a single image is an ill-posed problem due to the loss of scale from perspective projection during the image formation process. Any scale chosen is a bias, typically stemming from training on a dataset; hence, existing works have instead opted to use relative (normalized, inverse) depth. Our goal is to recover metric-scaled depth maps through a linear transformation. The crux of our method lies in the observation that certain objects (e.g., cars, trees, street signs) are found...

10.48550/arxiv.2410.02924 preprint EN arXiv (Cornell University) 2024-10-03

PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation

Ziyao Zeng Jingcheng Ni Daniel Wang Patrick Rim Younjoon Chung and 3 more

This paper explores the potential of leveraging language priors learned by text-to-image diffusion models to address ambiguity and visual nuisance in monocular depth estimation. Particularly, traditional monocular depth estimation suffers from inherent ambiguity due to the absence of stereo or multi-view depth cues, and nuisance due to lack of robustness of vision. We argue that the language prior in diffusion models can enhance monocular depth estimation by leveraging the geometric prior aligned with the language description, which is learned during text-to-image pre-training. To generate images that reflect the text properly, the model must comprehend the size and shape of specified objects,...

10.48550/arxiv.2411.16750 preprint EN arXiv (Cornell University) 2024-11-24

A Fine-Grained Defect Prediction Method Based on Drift-Immune Graph Neural Networks

Fengyu Yang Fa Zhong Xiaohui Wei Guangdong Zeng

10.32604/cmc.2024.057697 article EN Computers, Materials & Continua 2024-01-01

Tri-Ergon: Fine-grained Video-to-Audio Generation with Multi-modal Conditions and LUFS Control

Bingliang Li Fengyu Yang Yuxin Mao Qingwen Ye Hongkai Chen and 1 more

Video-to-audio (V2A) generation utilizes visual-only video features to produce realistic sounds that correspond to the scene. However, current V2A models often lack fine-grained control over the generated audio, especially in terms of loudness variation and the incorporation of multi-modal conditions. To overcome these limitations, we introduce Tri-Ergon, a diffusion-based V2A model that incorporates textual, auditory, and pixel-level visual prompts to enable detailed and semantically rich audio synthesis. Additionally,...

10.48550/arxiv.2412.20378 preprint EN arXiv (Cornell University) 2024-12-29

Test Data Automatic Generation Based on Modified Condition/Decision Coverage Criteria

Xin Fan Wei Zheng Fengyu Yang Qijun Liang

Software testing is one of the most important means that guarantee software quality and reliability. Meanwhile, improving the automation level of software testing is also very important to ensure software development quality and decrease cost. DO-178B provides different criteria of structure coverage for different levels of software. This paper presents a test data automatic generation method based on genetic algorithm. This approach builds a decision tree from the truth table to extract the minimum test set according to the modified condition/decision coverage criteria, and converts the test case generation problem to another...

10.2991/csic-15.2015.69 article EN cc-by-nc Advances in computer science research 2015-01-01