Yu-Xiong Wang

ORCID: 0000-0003-4414-0198
Research Areas
  • Semantic Web and Ontologies
  • Topic Modeling
  • Image Processing and 3D Reconstruction
  • 3D Shape Modeling and Analysis
  • Categorization, perception, and language
  • 3D Surveying and Cultural Heritage
  • Multi-Agent Systems and Negotiation
  • Data Mining Algorithms and Applications
  • Advanced Memory and Neural Computing
  • Image Retrieval and Classification Techniques
  • CCD and CMOS Imaging Sensors
  • Advanced Vision and Imaging
  • Advanced Image Fusion Techniques
  • 3D Modeling in Geospatial Applications
  • Radiomics and Machine Learning in Medical Imaging
  • Image and Signal Denoising Methods
  • Speech and dialogue systems
  • Generative Adversarial Networks and Image Synthesis
  • Neural Networks and Reservoir Computing
  • Natural Language Processing Techniques
  • Image Enhancement Techniques

University of Illinois Urbana-Champaign
2024

Manhattan College
2016-2022

University of Alabama in Huntsville
2016

A novel switching median filter integrated with a learning-based noise detection method is proposed for the suppression of impulse noise in highly corrupted colour images. Noise detection employs a new machine learning algorithm, called margin setting (MS), to detect noisy pixels. MS achieves this by classifying noisy and clean pixels with a decision surface. It yields very high detection accuracy, i.e. a zero miss rate and a fairly low false-alarm rate over a wide range of noise levels. After detection, a noise-free two-stage (NFTS) scheme is triggered. NFTS corrects the noisy pixels using two stages....

10.1080/13682199.2015.1104068 article EN The Imaging Science Journal 2016-01-02
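The switching idea above — replace a pixel by the window median only when it is flagged as noise — can be sketched as follows. The paper's margin-setting (MS) classifier is not reproduced here; a simple deviation-from-median test stands in for it, and the `threshold` and `window` parameters are illustrative assumptions, not values from the paper.

```python
import numpy as np

def switching_median_filter(img, threshold=40, window=3):
    """Sketch of a switching median filter on a single-channel image.

    A pixel is replaced by the median of its local window only if it is
    flagged as impulse noise. The flagging rule here (absolute deviation
    from the local median exceeding `threshold`) is a stand-in for the
    paper's margin-setting classifier.
    """
    pad = window // 2
    padded = np.pad(img, pad, mode='edge')
    out = img.copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            patch = padded[y:y + window, x:x + window]
            med = np.median(patch)
            if abs(int(img[y, x]) - int(med)) > threshold:
                out[y, x] = med  # replace only flagged pixels
    return out
```

Clean pixels are left untouched, which is what distinguishes a switching median filter from a plain median filter that blurs edges everywhere.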

Being able to carry out complicated vision-language reasoning tasks in 3D space represents a significant milestone in developing household robots and human-centered embodied AI. In this work, we demonstrate that a critical and distinct challenge is situational awareness, which incorporates two key components: (1) The autonomous agent grounds its self-location based on a language prompt. (2) The agent answers open-ended questions from the perspective of its calculated position. To address this challenge, we introduce SIG3D, an...

10.48550/arxiv.2406.07544 preprint EN arXiv (Cornell University) 2024-06-11

This paper proposes Instruct 4D-to-4D, which achieves 4D awareness and spatial-temporal consistency for 2D diffusion models to generate high-quality instruction-guided dynamic scene editing results. Traditional applications of 2D diffusion models in this setting often result in inconsistency, primarily due to their inherent frame-by-frame methodology. Addressing the complexities of extending editing to 4D, our key insight is to treat a 4D scene as a pseudo-3D scene, decoupled into two sub-problems: achieving temporal consistency in video editing and applying these edits to the pseudo-3D scene....

10.48550/arxiv.2406.09402 preprint EN arXiv (Cornell University) 2024-06-13

Despite recent significant strides achieved by diffusion-based Text-to-Image (T2I) models, current systems are still less capable of ensuring decent compositional generation aligned with text prompts, particularly for multi-object generation. In this work, we first show the fundamental reasons for such misalignment by identifying issues related to low attention activation and mask overlaps. We then propose a finetuning framework with two novel objectives, a Separate loss and an Enhance loss, that reduce object...

10.1145/3641519.3657527 article EN 2024-07-12
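The two failure modes named above — mask overlaps and low attention activation — suggest what the corresponding training objectives might penalize. The exact forms of the paper's Separate and Enhance losses are not given in this excerpt, so the functions below are only a plausible reading, operating on per-object cross-attention maps normalized to [0, 1]; the names and formulas are assumptions for illustration.

```python
import numpy as np

def separate_loss(attn_a, attn_b):
    """Hypothetical overlap penalty between two objects' attention maps.

    Elementwise minimum measures how much spatial support the two maps
    share; it is zero when the maps are disjoint.
    """
    return float(np.sum(np.minimum(attn_a, attn_b)))

def enhance_loss(attn):
    """Hypothetical activation penalty for a single attention map.

    Drives the map's peak activation toward 1, countering the
    low-attention-activation failure mode.
    """
    return float(1.0 - attn.max())
```

In a real finetuning loop these terms would be added to the diffusion loss and differentiated through the attention maps; here they serve only to make the two identified failure modes concrete.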

In this paper, we approach an overlooked yet critical task, Graph2Image: generating images from multimodal attributed graphs (MMAGs). This task poses significant challenges due to the explosion in graph size, dependencies among graph entities, and the need for controllability in graph conditions. To address these challenges, we propose a context-conditioned diffusion model called InstructG2I. InstructG2I first exploits the graph structure to conduct informative neighbor sampling by combining personalized PageRank...

10.48550/arxiv.2410.07157 preprint EN arXiv (Cornell University) 2024-10-09
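Personalized PageRank (PPR), mentioned as part of the neighbor-sampling step, scores every node by its relevance to a seed node and decays with graph distance. A minimal power-iteration sketch follows; the function names, the damping factor, and the top-k selection are illustrative assumptions, not details taken from InstructG2I.

```python
import numpy as np

def personalized_pagerank(adj, seed, alpha=0.15, iters=50):
    """PPR scores for all nodes relative to `seed`, by power iteration.

    adj: dense symmetric adjacency matrix (float), alpha: teleport
    probability back to the seed node.
    """
    n = adj.shape[0]
    deg = adj.sum(axis=1, keepdims=True)
    # row-normalize to a transition matrix; isolated nodes get zero rows
    P = np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)
    e = np.zeros(n)
    e[seed] = 1.0
    r = e.copy()
    for _ in range(iters):
        r = alpha * e + (1 - alpha) * P.T @ r
    return r

def sample_neighbors(adj, seed, k):
    """Pick the k nodes most relevant to `seed` under PPR."""
    scores = personalized_pagerank(adj, seed)
    scores[seed] = -1.0  # exclude the seed itself
    return np.argsort(-scores)[:k]
```

Because PPR mass decays along the graph, this favors structurally close, well-connected neighbors — the kind of "informative" context a graph-conditioned generator would want.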

Complex 3D scene understanding has gained increasing attention, with scene encoding strategies playing a crucial role in this success. However, the optimal encoding strategies for various scenarios remain unclear, particularly compared to their image-based counterparts. To address this issue, we present a comprehensive study that probes visual encoding models for 3D scene understanding, identifying the strengths and limitations of each model across different scenarios. Our evaluation spans seven vision foundation encoders, including image-based,...

10.48550/arxiv.2409.03757 preprint EN arXiv (Cornell University) 2024-09-05

Vision Foundation Models (VFMs) have demonstrated outstanding performance on numerous downstream tasks. However, due to their inherent representation biases originating from different training paradigms, VFMs exhibit advantages and disadvantages across distinct vision tasks. Although amalgamating the strengths of multiple VFMs for downstream tasks is an intuitive strategy, effectively exploiting these biases remains a significant challenge. In this paper, we propose a novel and versatile "Swiss Army Knife" (SAK) solution,...

10.48550/arxiv.2410.14633 preprint EN arXiv (Cornell University) 2024-10-18

This paper proposes ProEdit - a simple yet effective framework for high-quality 3D scene editing guided by diffusion distillation in a novel progressive manner. Inspired by the crucial observation that multi-view inconsistency is rooted in the diffusion model's large feasible output space (FOS), our framework controls the size of the FOS and reduces inconsistency by decomposing the overall editing task into several subtasks, which are then executed progressively on the scene. Within this framework, we design a difficulty-aware subtask decomposition scheduler and an...

10.48550/arxiv.2411.05006 preprint EN arXiv (Cornell University) 2024-11-07

The vision of a broadly capable and goal-directed agent, such as an Internet-browsing agent in the digital world and a household humanoid in the physical world, has rapidly advanced, thanks to the generalization capability of foundation models. Such a generalist agent needs a large and diverse skill repertoire, from finding directions between two travel locations to buying specific items from the Internet. If each skill must be specified manually through a fixed set of human-annotated instructions, the agent's skill repertoire will necessarily be limited due...

10.48550/arxiv.2412.13194 preprint EN arXiv (Cornell University) 2024-12-17

The purpose of this study is to present agile, intelligent, and efficient computer vision architectures, operating on quantum neuromorphic computing, as part of a Space Situational Awareness (SSA) network. Quantum computing, paired with polarimetric Dynamic Vision Sensor p(DVS) principles, would give rise to the next generation of highly engineered systems for SSA, operating at fast speeds with reduced bandwidth, low power, and low memory. A deep-learning network has been designed to classify, with high accuracy, different target...

10.1109/ist55454.2022.9827746 article EN 2022-06-21