Yunhao Ge

ORCID: 0000-0002-8110-9280
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Advanced Neural Network Applications
  • Multimodal Machine Learning Applications
  • Human Pose and Action Recognition
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Image and Video Retrieval Techniques
  • Cell Image Analysis Techniques
  • Machine Learning and Data Classification
  • Anomaly Detection Techniques and Applications
  • AI in cancer detection
  • Robotics and Sensor-Based Localization
  • Image Retrieval and Classification Techniques
  • Advanced Image Processing Techniques
  • Radiomics and Machine Learning in Medical Imaging
  • Advanced Graph Neural Networks
  • Medical Imaging Techniques and Applications
  • Video Analysis and Summarization
  • 3D Surveying and Cultural Heritage
  • Advanced X-ray and CT Imaging
  • Natural Language Processing Techniques
  • Image Processing Techniques and Applications
  • COVID-19 diagnosis using AI
  • Adversarial Robustness in Machine Learning
  • Explainable Artificial Intelligence (XAI)
  • Medical Image Segmentation Techniques

Stanford University
2024

University of Southern California
2020-2024

Southern California University for Professional Studies
2020-2024

Nvidia (United States)
2024

Google (United States)
2023

Shanghai Jiao Tong University
2018-2019

United Imaging Healthcare (China)
2019

People's Hospital of Jilin Province
2018

Physical AI needs to be trained digitally first. It needs a digital twin of itself, the policy model, and a digital twin of the world, the world model. In this paper, we present the Cosmos World Foundation Model Platform to help developers build customized world models for their Physical AI setups. We position a world foundation model as a general-purpose world model that can be fine-tuned into customized world models for downstream applications. Our platform covers a video curation pipeline, pre-trained world foundation models, examples of post-training of pre-trained models, and video tokenizers. To help Physical AI builders solve the most critical problems of our society,...

10.48550/arxiv.2501.03575 preprint EN arXiv (Cornell University) 2025-01-07

Despite substantial progress in applying neural networks (NN) to a wide variety of areas, they still largely suffer from a lack of transparency and interpretability. While recent developments in explainable artificial intelligence attempt to bridge this gap (e.g., by visualizing the correlation between input pixels and final outputs), these approaches are limited to explaining low-level relationships and, crucially, do not provide insights on error correction. In this work, we propose a framework (VRX) to interpret...

10.1109/cvpr46437.2021.00223 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
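
The abstract above contrasts VRX with pixel-level attribution methods. Purely as a point of reference, and not as the VRX method itself, a minimal sketch of the kind of low-level input-output correlation map it refers to might look like the following; the model choice and random input are placeholders.

```python
# Plain gradient saliency: the correlation between input pixels and the top
# class score. This is the baseline style of explanation the abstract calls
# limited; it is NOT the VRX approach.
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in input
score = model(image)[0].max()            # top predicted class score
score.backward()                         # d(score) / d(pixels)

saliency = image.grad.abs().max(dim=1)[0]  # per-pixel importance, shape (1, 224, 224)
```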

In medical imaging applications such as PET-MR attenuation correction and MRI-guided radiation therapy, synthesizing CT images from MR plays an important role in obtaining tissue density properties. Recently, deep-learning-based image synthesis techniques have attracted much attention because of their superior ability for image mapping. However, most of the current methods require large-scale paired data, which greatly limits their usage. Efforts have been made to relax this restriction, and cycle-consistent adversarial networks...

10.1109/isbi.2019.8759529 article EN 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI) 2019-04-01
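
The abstract above points to cycle-consistent adversarial networks as the way to drop the paired-data requirement. A minimal sketch of the cycle-consistency term, with placeholder single-layer "generators" rather than the paper's architecture, could look like this:

```python
# Hedged sketch of the cycle-consistency idea for unpaired MR-to-CT synthesis.
# The generators below are placeholders, not the paper's networks.
import torch
import torch.nn as nn

G_mr2ct = nn.Conv2d(1, 1, 3, padding=1)   # placeholder generator MR -> CT
G_ct2mr = nn.Conv2d(1, 1, 3, padding=1)   # placeholder generator CT -> MR
l1 = nn.L1Loss()

mr = torch.rand(4, 1, 128, 128)           # unpaired MR batch
ct = torch.rand(4, 1, 128, 128)           # unpaired CT batch

# Translating to the other modality and back should reconstruct the input,
# which lets the model train without paired MR/CT slices.
cycle_loss = l1(G_ct2mr(G_mr2ct(mr)), mr) + l1(G_mr2ct(G_ct2mr(ct)), ct)
```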

Multi-modal image-text models such as CLIP and LiT have demonstrated impressive performance on image classification benchmarks, and their zero-shot generalization ability is particularly exciting. While the top-5 accuracies of these models are very high, the top-1 accuracies are much lower (over a 25% gap in some cases). We investigate the reasons for this and find that many failure cases are caused by ambiguity in the text prompts. First, we develop a simple, efficient post-hoc method to identify images whose top-1 prediction is likely to be incorrect,...

10.1109/cvpr52729.2023.01067 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01
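
As an illustration of the zero-shot setup and the prompt-ambiguity failure mode the abstract describes; the model id, labels, and prompt template below are assumptions, not taken from the paper.

```python
# Zero-shot CLIP classification: near-synonymous prompts split probability
# mass, which can push the correct label out of top-1 while it stays in top-5.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["maillot", "swimsuit", "bikini", "sports car", "tabby cat"]
prompts = [f"a photo of a {c}" for c in labels]
image = Image.new("RGB", (224, 224))      # placeholder image

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image[0]   # similarity to each prompt

probs = logits.softmax(dim=-1)
top1 = probs.argmax()
top5 = probs.topk(5)
```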

10.1109/cvpr52733.2024.01331 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Computerized breast cancer diagnosis systems have played an important role in early diagnosis. For this purpose, we apply deep learning using convolutional neural networks (CNN) to classify abnormalities as benign or malignant in mammographic images from the mini Mammographic Image Analysis Society (mini-MIAS) database. Accuracy, sensitivity, and specificity values are observed to evaluate the performance of the CNN. To improve performance, we utilize image-preprocessing methods including cropping, global...

10.1145/3195106.3195163 article EN 2018-02-26
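
A minimal sketch, assuming a small from-scratch CNN rather than the paper's exact network, of the classify-with-preprocessing pipeline the abstract outlines:

```python
# Benign-vs-malignant patch classification with simple preprocessing
# (cropping and global normalization). Layer sizes are illustrative.
import torch
import torch.nn as nn

def preprocess(img):                      # img: (H, W) grayscale tensor
    crop = img[16:240, 16:240]            # crop a 224x224 region of interest
    return (crop - crop.mean()) / (crop.std() + 1e-6)   # global normalization

cnn = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 56 * 56, 2),           # two classes: benign, malignant
)

patch = preprocess(torch.rand(256, 256)).unsqueeze(0).unsqueeze(0)  # (1, 1, 224, 224)
logits = cnn(patch)
```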

MR to CT image synthesis plays an important role in medical image analysis, and its applications include, but are not limited to, PET-MR attenuation correction and MR-only radiation therapy planning. Recently, deep learning-based image synthesis techniques have achieved much success. However, most of the current methods require large-scale paired data from two different modalities, which greatly limits their usage, as in some situations such data are infeasible to obtain. Some efforts have been proposed to relax this constraint, such as the cycle-consistent...

10.1117/12.2512479 article EN Medical Imaging 2022: Image Processing 2019-03-14

In this paper, a deep learning computer-aided diagnosis system (CADs) is proposed for automatic segmentation and classification of melanoma lesions, containing a fully convolutional neural network (FCN) and a specific convolutional neural network (CNN). The FCN, which consists of a 28-layer structure, is designed with the mask of the region of interest (ROI) as its output. Later, the CNN uses only the segmented ROI of the raw image to extract features, while DLCM statistical, contrast, and location features extracted from the same image are merged into the feature set. Finally, the combined...

10.1145/3195106.3195164 article EN 2018-02-26
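
A hedged sketch of the two-stage idea described above: segment the lesion ROI, extract deep features from it, and fuse them with hand-crafted features before classification. All layer sizes and the toy "contrast" feature below are illustrative, not the paper's 28-layer FCN or feature set.

```python
# Segment -> extract deep features from the ROI -> concatenate with a
# hand-crafted feature -> classify. Everything here is a placeholder.
import torch
import torch.nn as nn

seg_net = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1), nn.Sigmoid())   # ROI mask
feat_net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())        # deep features
classifier = nn.Linear(8 + 1, 2)                                       # fused -> benign/malignant

img = torch.rand(1, 3, 128, 128)
mask = (seg_net(img) > 0.5).float()
roi = img * mask                                # keep only the segmented lesion
deep = feat_net(roi)                            # (1, 8) learned features
contrast = roi.std().view(1, 1)                 # toy hand-crafted contrast feature
logits = classifier(torch.cat([deep, contrast], dim=1))
```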

The human brain is the gold standard of adaptive learning. It not only can learn and benefit from experience, but can also adapt to new situations. In contrast, deep neural networks learn only one sophisticated but fixed mapping from inputs to outputs. This limits their applicability to more dynamic situations, where the input-to-output mapping may change with different contexts. A salient example is continual learning: learning new independent tasks sequentially without forgetting previous tasks. Continual learning of multiple tasks in artificial...

10.1109/tnnls.2021.3054423 article EN cc-by IEEE Transactions on Neural Networks and Learning Systems 2021-02-19

In Lifelong Learning (LL), agents continually learn as they encounter new conditions and tasks. Most current LL is limited to a single agent that learns tasks sequentially. Dedicated LL machinery is then deployed to mitigate the forgetting of old tasks as new ones are learned. This is inherently slow. We propose the Shared Knowledge Lifelong Learning (SKILL) challenge, which deploys a decentralized population of LL agents that each sequentially learn different tasks, with all agents operating independently and in parallel. After learning their respective tasks, agents share and consolidate...

10.48550/arxiv.2305.15591 preprint EN other-oa arXiv (Cornell University) 2023-01-01
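
One cheap way to let many agents learn in parallel and then share what they learned is to freeze a common backbone and exchange only small task-specific heads; the sketch below illustrates that idea as an assumption, not as a statement of the paper's exact sharing mechanism.

```python
# Decentralized task learning on a frozen shared backbone: each agent trains
# only a lightweight head, and consolidation is just collecting the heads.
import torch
import torch.nn as nn
import torchvision.models as models

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()                 # shared, frozen feature extractor
for p in backbone.parameters():
    p.requires_grad = False

# Each agent learns its own task independently (in parallel in the real setting).
agent_heads = {f"task_{i}": nn.Linear(512, 10) for i in range(3)}

def predict(x, task):
    """Consolidated model = shared backbone + the collected task heads."""
    with torch.no_grad():
        feats = backbone(x)                 # (batch, 512)
    return agent_heads[task](feats)

logits = predict(torch.rand(2, 3, 224, 224), "task_0")
```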

Continual learning aims to emulate the human ability to continually accumulate knowledge over sequential tasks. The main challenge is to maintain performance on previously learned tasks after learning new tasks, i.e., to avoid catastrophic forgetting. We propose a Channel-wise Lightweight Reprogramming (CLR) approach that helps convolutional neural networks (CNNs) overcome catastrophic forgetting during continual learning. We show that a CNN model trained on an old task (or a self-supervised proxy task) can be "reprogrammed" to solve a new task by...

10.1109/iccv51070.2023.01723 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01
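
A rough sketch of the reprogramming idea: keep the pretrained convolutions frozen and train only a lightweight channel-wise layer per task. The depthwise 3x3 shape below is an assumption for illustration, not necessarily the paper's exact layer.

```python
import torch
import torch.nn as nn

class ReprogrammedBlock(nn.Module):
    """A frozen conv layer followed by a small task-specific channel-wise layer."""
    def __init__(self, frozen_conv, channels):
        super().__init__()
        self.frozen = frozen_conv
        for p in self.frozen.parameters():
            p.requires_grad = False
        # Lightweight per-task parameters: a depthwise (channel-wise) 3x3 conv.
        self.reprogram = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)

    def forward(self, x):
        return self.reprogram(self.frozen(x))

block = ReprogrammedBlock(nn.Conv2d(3, 16, 3, padding=1), channels=16)
out = block(torch.rand(1, 3, 32, 32))
# For a new task, only `reprogram` (plus a new classifier head) is trained and
# stored, so the frozen backbone is reused and old tasks are not overwritten.
```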

Unlike static gesture recognition, the novel real-time prediction system in this study can judge the intention of a hand motion and predict the exact final gesture before the end of the movement. Flex sensors are used to measure comprehensive data from a glove, on which they are positioned according to the biological muscle distribution characteristics of the hand. Position, velocity, and acceleration information are extracted from the raw data, while adjacent finger-coupling features are also obtained by processing the position information. After preprocessing such as windowing and filtering,...

10.1109/icaci.2018.8377532 article EN 2018-03-01
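
A small sketch of the feature extraction the abstract describes: positions from the flex sensors, velocity and acceleration by finite differences, sliding-window smoothing, and a simple adjacent-finger coupling feature. Sampling rate, window size, and channel count are illustrative.

```python
import numpy as np

fs = 100.0                                   # assumed sampling rate (Hz)
position = np.random.rand(500, 5)            # 5 flex-sensor channels over time

velocity = np.gradient(position, 1.0 / fs, axis=0)       # finite differences
acceleration = np.gradient(velocity, 1.0 / fs, axis=0)

def sliding_mean(x, win=10):
    """Smooth each channel with a moving-average window."""
    kernel = np.ones(win) / win
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, x)

features = np.hstack([sliding_mean(position),
                      sliding_mean(velocity),
                      sliding_mean(acceleration)])
# A simple adjacent-finger coupling feature: differences between neighboring channels.
coupling = position[:, 1:] - position[:, :-1]
```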

We focus on controllable disentangled representation learning (C-Dis-RL), where users can control the partition of the latent space to factorize dataset attributes (concepts) for downstream tasks. Two general problems remain under-explored in current methods: (1) they lack comprehensive disentanglement constraints, especially the minimization of mutual information between different attributes across latent and observation domains; (2) they lack convexity constraints, which are important for meaningfully manipulating specific attributes for downstream tasks. To encourage...

10.1109/wacv56688.2023.00474 article EN 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01
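
To make the mutual-information idea concrete, the sketch below partitions a latent code by attribute and penalizes a cross-covariance term as a simple linear surrogate for statistical dependence; this is an illustration only, not the paper's objective.

```python
import torch

z = torch.randn(64, 16)                 # batch of latent codes
z_attr, z_rest = z[:, :4], z[:, 4:]     # user-controlled partition: attribute vs. rest

def cross_covariance(a, b):
    a = a - a.mean(dim=0, keepdim=True)
    b = b - b.mean(dim=0, keepdim=True)
    return (a.T @ b) / (a.shape[0] - 1)

# Driving this toward zero discourages the "attribute" dimensions from carrying
# information that is linearly predictable from the remaining dimensions.
disentangle_penalty = cross_covariance(z_attr, z_rest).pow(2).sum()
```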

We propose a new paradigm to automatically generate training data with accurate labels at scale using text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object generation and contextually coherent background generation. To generate foreground objects, we employ a straightforward textual template incorporating the class name as the input prompt. This is fed to the text-to-image synthesis framework, producing various foreground images set against isolated backgrounds. A...

10.48550/arxiv.2309.05956 preprint EN other-oa arXiv (Cornell University) 2023-01-01
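
A sketch of the generate-then-compose pipeline described above, using Stable Diffusion via the diffusers library as the text-to-image backend and a naive near-white threshold as the foreground segmenter; the prompt template, model id, and segmentation choice are assumptions, not the paper's exact components.

```python
import numpy as np
from PIL import Image
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

def generate_foreground(class_name):
    prompt = f"a photo of a {class_name} on a plain white background"
    return pipe(prompt).images[0]

def cut_out(img, thresh=240):
    """Naive mask: treat near-white pixels as background (works for isolated backgrounds)."""
    arr = np.array(img)
    mask = (arr.mean(axis=-1) < thresh).astype(np.uint8) * 255
    return Image.fromarray(mask, mode="L")

def compose(fg, mask, bg):
    bg = bg.resize(fg.size)
    bg.paste(fg, (0, 0), mask)          # pasted object inherits the background context
    return bg

fg = generate_foreground("golden retriever")
bg = pipe("a photo of a city park").images[0]
synthetic = compose(fg, cut_out(fg), bg)
# The mask also provides a free segmentation / bounding-box label for the pasted object.
```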

Compared with the traditional mechanical beam deflector in a beam-scanning system, a dual-wedge scanning system has several advantages, for example, a compact structure, fast speed, and low power consumption. High accuracy is the most important factor in scanning, but errors caused by machining or assembly adversely affect this accuracy. Horizontal angular errors appear between the incident light and the central optical axes. By building a mathematical model of the ideal trajectory as affected by these errors, this paper analyzes the types...

10.1364/ao.57.006047 article EN Applied Optics 2018-07-16
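
For context, the ideal dual-wedge scan trajectory can be written directly under the first-order thin-prism (paraxial) approximation: each wedge deflects the beam by roughly (n - 1) times its wedge angle, and the two deflections add as rotating vectors. The numerical values below are illustrative, not taken from the paper.

```python
import numpy as np

n = 1.517                                 # refractive index of the wedges
alpha1 = alpha2 = np.deg2rad(2.0)         # wedge angles
delta1, delta2 = (n - 1) * alpha1, (n - 1) * alpha2   # per-wedge deflection (rad)

omega1, omega2 = 2 * np.pi * 1.0, 2 * np.pi * 0.9     # wedge rotation rates (rad/s)
t = np.linspace(0.0, 10.0, 5000)

# Combined deflection direction over time (a rosette-like pattern in general).
x = delta1 * np.cos(omega1 * t) + delta2 * np.cos(omega2 * t)
y = delta1 * np.sin(omega1 * t) + delta2 * np.sin(omega2 * t)
# Machining/assembly errors would perturb delta1, delta2 and the rotation axes,
# distorting this ideal trajectory.
```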

We propose a new paradigm to automatically generate training data with accurate labels at scale using text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object mask generation and background (context) image generation. For foreground generation, we use a simple textual template with the class name as input to DALL-E to generate a diverse set of foreground images. A foreground-background segmentation algorithm is then used to generate foreground object masks. Next, in order to generate context images, first...

10.48550/arxiv.2206.09592 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Oil sheen on the water surface can indicate a source of hydrocarbon in underlying subaquatic sediments. Here, we develop and test the accuracy of an algorithm for automated real-time visual monitoring to detect oil sheen. This detection system is part of an oil sheen screening system (OS-SS) that disturbs sediments and monitors the formation of sheen. We first created a new near-surface oil sheen image dataset. We then used this dataset to develop an image-based Oil Sheen Prediction Neural Network (OS-Net), a classification machine learning model based on convolutional...

10.3390/app12178865 article EN cc-by Applied Sciences 2022-09-03
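
A minimal sketch of a binary sheen / no-sheen image classifier built by fine-tuning a pretrained CNN; this matches the convolutional image-classification setup the abstract describes but is not the published OS-Net architecture.

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)      # classes: sheen, no sheen

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

images = torch.rand(8, 3, 224, 224)                # placeholder near-surface frames
labels = torch.randint(0, 2, (8,))

optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```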

A major challenge in monocular 3D object detection is the limited diversity and quantity of objects in real datasets. While augmenting real scenes with virtual objects holds promise to improve both the diversity and quantity of objects, it remains elusive due to the lack of an effective 3D object insertion method for complex real captured scenes. In this work, we study augmenting complex real indoor scenes with virtual objects for monocular 3D object detection. The main challenge is to automatically identify plausible physical properties for virtual assets (e.g., locations, appearances, sizes, etc.) in cluttered real scenes. To address this challenge, we propose a physically plausible approach...

10.48550/arxiv.2312.05277 preprint EN other-oa arXiv (Cornell University) 2023-01-01
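
As a toy illustration of one ingredient of "plausible" insertion (an assumption, not the paper's method): sample a footprint on the floor plane and reject placements that collide with the footprints of existing objects.

```python
import random

existing = [  # axis-aligned footprints of real objects: (xmin, ymin, xmax, ymax), meters
    (0.0, 0.0, 1.2, 0.8),
    (2.5, 1.0, 3.5, 2.0),
]
room = (0.0, 0.0, 5.0, 4.0)

def overlaps(a, b):
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def sample_placement(size=0.6, tries=100):
    """Return a collision-free footprint for a virtual object, or None."""
    for _ in range(tries):
        x = random.uniform(room[0], room[2] - size)
        y = random.uniform(room[1], room[3] - size)
        box = (x, y, x + size, y + size)
        if not any(overlaps(box, e) for e in existing):
            return box
    return None

placement = sample_placement()
```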