Yunhao Ge

ORCID: 0000-0002-8110-9280
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Advanced Neural Network Applications
  • Multimodal Machine Learning Applications
  • Human Pose and Action Recognition
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Image and Video Retrieval Techniques
  • Cell Image Analysis Techniques
  • Machine Learning and Data Classification
  • Anomaly Detection Techniques and Applications
  • AI in cancer detection
  • Robotics and Sensor-Based Localization
  • Image Retrieval and Classification Techniques
  • Advanced Image Processing Techniques
  • Radiomics and Machine Learning in Medical Imaging
  • Advanced Graph Neural Networks
  • Medical Imaging Techniques and Applications
  • Video Analysis and Summarization
  • 3D Surveying and Cultural Heritage
  • Advanced X-ray and CT Imaging
  • Natural Language Processing Techniques
  • Image Processing Techniques and Applications
  • COVID-19 diagnosis using AI
  • Adversarial Robustness in Machine Learning
  • Explainable Artificial Intelligence (XAI)
  • Medical Image Segmentation Techniques

Stanford University
2024

University of Southern California
2020-2024

Southern California University for Professional Studies
2020-2024

Nvidia (United States)
2024

Google (United States)
2023

Shanghai Jiao Tong University
2018-2019

United Imaging Healthcare (China)
2019

People's Hospital of Jilin Province
2018

Physical AI needs to be trained digitally first. It needs a digital twin of itself, the policy model, and a digital twin of the world, the world model. In this paper, we present the Cosmos World Foundation Model Platform to help developers build customized world models for their Physical AI setups. We position a world foundation model as a general-purpose world model that can be fine-tuned into customized world models for downstream applications. Our platform covers a video curation pipeline, pre-trained world foundation models, examples of post-training of pre-trained models, and video tokenizers. To help Physical AI builders solve the most critical problems of our society,...

10.48550/arxiv.2501.03575 preprint EN arXiv (Cornell University) 2025-01-07

Despite substantial progress in applying neural networks (NN) to a wide variety of areas, they still largely suffer from a lack of transparency and interpretability. While recent developments in explainable artificial intelligence attempt to bridge this gap (e.g., by visualizing the correlation between input pixels and final outputs), these approaches are limited to explaining low-level relationships and, crucially, do not provide insights on error correction. In this work, we propose a framework (VRX) to interpret...

10.1109/cvpr46437.2021.00223 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
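
The abstract above contrasts VRX with pixel-level attribution methods. Purely as a point of reference, and not as the VRX method itself, a minimal sketch of the kind of low-level input-output correlation map it refers to might look like the following; the model choice and random input are placeholders.

```python
# Plain gradient saliency: the correlation between input pixels and the top
# class score. This is the baseline style of explanation the abstract calls
# limited; it is NOT the VRX approach.
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in input
score = model(image)[0].max()            # top predicted class score
score.backward()                         # d(score) / d(pixels)

saliency = image.grad.abs().max(dim=1)[0]  # per-pixel importance, shape (1, 224, 224)
```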

In medical imaging applications such as PET-MR attenuation correction and MRI-guided radiation therapy, synthesizing CT images from MR plays an important role in obtaining tissue density properties. Recently, deep-learning-based image synthesis techniques have attracted much attention because of their superior ability for image mapping. However, most of the current methods require large-scale paired data, which greatly limits their usage. Efforts have been made to relax this restriction, and cycle-consistent adversarial networks...

10.1109/isbi.2019.8759529 article EN 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI) 2019-04-01
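
The abstract above points to cycle-consistent adversarial networks as the way to drop the paired-data requirement. A minimal sketch of the cycle-consistency term, with placeholder single-layer "generators" rather than the paper's architecture, could look like this:

```python
# Hedged sketch of the cycle-consistency idea for unpaired MR-to-CT synthesis.
# The generators below are placeholders, not the paper's networks.
import torch
import torch.nn as nn

G_mr2ct = nn.Conv2d(1, 1, 3, padding=1)   # placeholder generator MR -> CT
G_ct2mr = nn.Conv2d(1, 1, 3, padding=1)   # placeholder generator CT -> MR
l1 = nn.L1Loss()

mr = torch.rand(4, 1, 128, 128)           # unpaired MR batch
ct = torch.rand(4, 1, 128, 128)           # unpaired CT batch

# Translating to the other modality and back should reconstruct the input,
# which lets the model train without paired MR/CT slices.
cycle_loss = l1(G_ct2mr(G_mr2ct(mr)), mr) + l1(G_mr2ct(G_ct2mr(ct)), ct)
```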

Multi-modal image-text models such as CLIP and LiT have demonstrated impressive performance on image classification benchmarks, and their zero-shot generalization ability is particularly exciting. While the top-5 accuracies of these models are very high, the top-1 accuracies are much lower (over a 25% gap in some cases). We investigate the reasons for this and find that many failure cases are caused by ambiguity in the text prompts. First, we develop a simple, efficient post-hoc method to identify images whose top-1 prediction is likely to be incorrect,...

10.1109/cvpr52729.2023.01067 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01
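
As an illustration of the zero-shot setup and the prompt-ambiguity failure mode the abstract describes; the model id, labels, and prompt template below are assumptions, not taken from the paper.

```python
# Zero-shot CLIP classification: near-synonymous prompts split probability
# mass, which can push the correct label out of top-1 while it stays in top-5.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["maillot", "swimsuit", "bikini", "sports car", "tabby cat"]
prompts = [f"a photo of a {c}" for c in labels]
image = Image.new("RGB", (224, 224))      # placeholder image

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image[0]   # similarity to each prompt

probs = logits.softmax(dim=-1)
top1 = probs.argmax()
top5 = probs.topk(5)
```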

10.1109/cvpr52733.2024.01331 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Computerized breast cancer diagnosis systems have played an important role in early diagnosis. For this purpose, we apply deep learning using convolutional neural networks (CNN) to classify abnormalities as benign or malignant in mammographic images from the mini Mammographic Image Analysis Society (mini-MIAS) database. Accuracy, sensitivity, and specificity values are observed to evaluate the performance of the CNN. To improve performance, we utilize image-preprocessing methods including cropping, global...

10.1145/3195106.3195163 article EN 2018-02-26
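
A minimal sketch, assuming a small from-scratch CNN rather than the paper's exact network, of the classify-with-preprocessing pipeline the abstract outlines:

```python
# Benign-vs-malignant patch classification with simple preprocessing
# (cropping and global normalization). Layer sizes are illustrative.
import torch
import torch.nn as nn

def preprocess(img):                      # img: (H, W) grayscale tensor
    crop = img[16:240, 16:240]            # crop a 224x224 region of interest
    return (crop - crop.mean()) / (crop.std() + 1e-6)   # global normalization

cnn = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 56 * 56, 2),           # two classes: benign, malignant
)

patch = preprocess(torch.rand(256, 256)).unsqueeze(0).unsqueeze(0)  # (1, 1, 224, 224)
logits = cnn(patch)
```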

MR to CT image synthesis plays an important role in medical image analysis, and its applications include, but are not limited to, PET-MR attenuation correction and MR-only radiation therapy planning. Recently, deep learning-based image synthesis techniques have achieved much success. However, most of the current methods require large-scale paired data from two different modalities, which greatly limits their usage, as in some situations such data are infeasible to obtain. Some efforts have been proposed to relax this constraint, such as the cycle-consistent...

10.1117/12.2512479 article EN Medical Imaging 2022: Image Processing 2019-03-14

In this paper, a deep learning computer-aided diagnosis system (CADs) is proposed for automatic segmentation and classification of melanoma lesions, containing a fully convolutional neural network (FCN) and a specific convolutional neural network (CNN). The FCN, which consists of a 28-layer structure, is designed with the mask of the region of interest (ROI) as its output. Later, the CNN uses only the segmented ROI of the raw image to extract features, while DLCM statistical, contrast, and location features extracted from the same image are merged into the feature set. Finally, the combined...

10.1145/3195106.3195164 article EN 2018-02-26
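
A hedged sketch of the two-stage idea described above: segment the lesion ROI, extract deep features from it, and fuse them with hand-crafted features before classification. All layer sizes and the toy "contrast" feature below are illustrative, not the paper's 28-layer FCN or feature set.

```python
# Segment -> extract deep features from the ROI -> concatenate with a
# hand-crafted feature -> classify. Everything here is a placeholder.
import torch
import torch.nn as nn

seg_net = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1), nn.Sigmoid())   # ROI mask
feat_net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())        # deep features
classifier = nn.Linear(8 + 1, 2)                                       # fused -> benign/malignant

img = torch.rand(1, 3, 128, 128)
mask = (seg_net(img) > 0.5).float()
roi = img * mask                                # keep only the segmented lesion
deep = feat_net(roi)                            # (1, 8) learned features
contrast = roi.std().view(1, 1)                 # toy hand-crafted contrast feature
logits = classifier(torch.cat([deep, contrast], dim=1))
```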

The human brain is the gold standard of adaptive learning. It not only can learn and benefit from experience, but can also adapt to new situations. In contrast, deep neural networks learn only one sophisticated but fixed mapping from inputs to outputs. This limits their applicability to more dynamic situations, where the input-to-output mapping may change with different contexts. A salient example is continual learning: learning new independent tasks sequentially without forgetting previous tasks. Continual learning of multiple tasks in artificial...

10.1109/tnnls.2021.3054423 article EN cc-by IEEE Transactions on Neural Networks and Learning Systems 2021-02-19

In Lifelong Learning (LL), agents continually learn as they encounter new conditions and tasks. Most current LL is limited to a single agent that learns tasks sequentially. Dedicated LL machinery is then deployed to mitigate the forgetting of old tasks as new ones are learned. This is inherently slow. We propose the Shared Knowledge Lifelong Learning (SKILL) challenge, which deploys a decentralized population of LL agents that each sequentially learn different tasks, with all agents operating independently and in parallel. After learning their respective tasks, agents share and consolidate...

10.48550/arxiv.2305.15591 preprint EN other-oa arXiv (Cornell University) 2023-01-01
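
One cheap way to let many agents learn in parallel and then share what they learned is to freeze a common backbone and exchange only small task-specific heads; the sketch below illustrates that idea as an assumption, not as a statement of the paper's exact sharing mechanism.

```python
# Decentralized task learning on a frozen shared backbone: each agent trains
# only a lightweight head, and consolidation is just collecting the heads.
import torch
import torch.nn as nn
import torchvision.models as models

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()                 # shared, frozen feature extractor
for p in backbone.parameters():
    p.requires_grad = False

# Each agent learns its own task independently (in parallel in the real setting).
agent_heads = {f"task_{i}": nn.Linear(512, 10) for i in range(3)}

def predict(x, task):
    """Consolidated model = shared backbone + the collected task heads."""
    with torch.no_grad():
        feats = backbone(x)                 # (batch, 512)
    return agent_heads[task](feats)

logits = predict(torch.rand(2, 3, 224, 224), "task_0")
```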

Continual learning aims to emulate the human ability to continually accumulate knowledge over sequential tasks. The main challenge is to maintain performance on previously learned tasks after learning new tasks, i.e., to avoid catastrophic forgetting. We propose a Channel-wise Lightweight Reprogramming (CLR) approach that helps convolutional neural networks (CNNs) overcome catastrophic forgetting during continual learning. We show that a CNN model trained on an old task (or a self-supervised proxy task) can be "reprogrammed" to solve a new task by...

10.1109/iccv51070.2023.01723 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01
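
A rough sketch of the reprogramming idea: keep the pretrained convolutions frozen and train only a lightweight channel-wise layer per task. The depthwise 3x3 shape below is an assumption for illustration, not necessarily the paper's exact layer.

```python
import torch
import torch.nn as nn

class ReprogrammedBlock(nn.Module):
    """A frozen conv layer followed by a small task-specific channel-wise layer."""
    def __init__(self, frozen_conv, channels):
        super().__init__()
        self.frozen = frozen_conv
        for p in self.frozen.parameters():
            p.requires_grad = False
        # Lightweight per-task parameters: a depthwise (channel-wise) 3x3 conv.
        self.reprogram = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)

    def forward(self, x):
        return self.reprogram(self.frozen(x))

block = ReprogrammedBlock(nn.Conv2d(3, 16, 3, padding=1), channels=16)
out = block(torch.rand(1, 3, 32, 32))
# For a new task, only `reprogram` (plus a new classifier head) is trained and
# stored, so the frozen backbone is reused and old tasks are not overwritten.
```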

Unlike static gesture recognition, the novel real-time prediction system in this study can judge the intention of a hand motion and predict the exact final gesture before the end of the movement. Flex sensors are used to measure comprehensive data from a glove, on which they are positioned according to the biological muscle distribution characteristics of the hand. Position, velocity, and acceleration information are extracted from the raw data, while adjacent finger-coupling features are also obtained by processing the position information. After preprocessing such as windowing and filtering,...

10.1109/icaci.2018.8377532 article EN 2018-03-01
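
A small sketch of the feature extraction the abstract describes: positions from the flex sensors, velocity and acceleration by finite differences, sliding-window smoothing, and a simple adjacent-finger coupling feature. Sampling rate, window size, and channel count are illustrative.

```python
import numpy as np

fs = 100.0                                   # assumed sampling rate (Hz)
position = np.random.rand(500, 5)            # 5 flex-sensor channels over time

velocity = np.gradient(position, 1.0 / fs, axis=0)       # finite differences
acceleration = np.gradient(velocity, 1.0 / fs, axis=0)

def sliding_mean(x, win=10):
    """Smooth each channel with a moving-average window."""
    kernel = np.ones(win) / win
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, x)

features = np.hstack([sliding_mean(position),
                      sliding_mean(velocity),
                      sliding_mean(acceleration)])
# A simple adjacent-finger coupling feature: differences between neighboring channels.
coupling = position[:, 1:] - position[:, :-1]
```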

We focus on controllable disentangled representation learning (C-Dis-RL), where users can control the partition of the latent space to factorize dataset attributes (concepts) for downstream tasks. Two general problems remain under-explored in current methods: (1) they lack comprehensive disentanglement constraints, especially the minimization of mutual information between different attributes across latent and observation domains; (2) they lack convexity constraints, which are important for meaningfully manipulating specific attributes for downstream tasks. To encourage...

10.1109/wacv56688.2023.00474 article EN 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01
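
To make the mutual-information idea concrete, the sketch below partitions a latent code by attribute and penalizes a cross-covariance term as a simple linear surrogate for statistical dependence; this is an illustration only, not the paper's objective.

```python
import torch

z = torch.randn(64, 16)                 # batch of latent codes
z_attr, z_rest = z[:, :4], z[:, 4:]     # user-controlled partition: attribute vs. rest

def cross_covariance(a, b):
    a = a - a.mean(dim=0, keepdim=True)
    b = b - b.mean(dim=0, keepdim=True)
    return (a.T @ b) / (a.shape[0] - 1)

# Driving this toward zero discourages the "attribute" dimensions from carrying
# information that is linearly predictable from the remaining dimensions.
disentangle_penalty = cross_covariance(z_attr, z_rest).pow(2).sum()
```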

We propose a new paradigm to automatically generate training data with accurate labels at scale using text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object generation and contextually coherent background generation. To generate foreground objects, we employ a straightforward textual template incorporating the class name as the input prompt. This is fed to the text-to-image synthesis framework, producing various foreground images set against isolated backgrounds. A...

10.48550/arxiv.2309.05956 preprint EN other-oa arXiv (Cornell University) 2023-01-01
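
A sketch of the generate-then-compose pipeline described above, using Stable Diffusion via the diffusers library as the text-to-image backend and a naive near-white threshold as the foreground segmenter; the prompt template, model id, and segmentation choice are assumptions, not the paper's exact components.

```python
import numpy as np
from PIL import Image
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

def generate_foreground(class_name):
    prompt = f"a photo of a {class_name} on a plain white background"
    return pipe(prompt).images[0]

def cut_out(img, thresh=240):
    """Naive mask: treat near-white pixels as background (works for isolated backgrounds)."""
    arr = np.array(img)
    mask = (arr.mean(axis=-1) < thresh).astype(np.uint8) * 255
    return Image.fromarray(mask, mode="L")

def compose(fg, mask, bg):
    bg = bg.resize(fg.size)
    bg.paste(fg, (0, 0), mask)          # pasted object inherits the background context
    return bg

fg = generate_foreground("golden retriever")
bg = pipe("a photo of a city park").images[0]
synthetic = compose(fg, cut_out(fg), bg)
# The mask also provides a free segmentation / bounding-box label for the pasted object.
```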

Compared with the traditional mechanical beam deflector in a beam-scanning system, a dual-wedge scanning system has several advantages, for example, a compact structure, fast speed, and low power consumption. High accuracy is the most important factor in scanning, but errors caused by machining or assembly adversely affect this accuracy. Horizontal angular errors appear between the incident light and the central optical axes. By building a mathematical model of the ideal trajectory as affected by these errors, this paper analyzes the types...

10.1364/ao.57.006047 article EN Applied Optics 2018-07-16
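
For context, the ideal dual-wedge scan trajectory can be written directly under the first-order thin-prism (paraxial) approximation: each wedge deflects the beam by roughly (n - 1) times its wedge angle, and the two deflections add as rotating vectors. The numerical values below are illustrative, not taken from the paper.

```python
import numpy as np

n = 1.517                                 # refractive index of the wedges
alpha1 = alpha2 = np.deg2rad(2.0)         # wedge angles
delta1, delta2 = (n - 1) * alpha1, (n - 1) * alpha2   # per-wedge deflection (rad)

omega1, omega2 = 2 * np.pi * 1.0, 2 * np.pi * 0.9     # wedge rotation rates (rad/s)
t = np.linspace(0.0, 10.0, 5000)

# Combined deflection direction over time (a rosette-like pattern in general).
x = delta1 * np.cos(omega1 * t) + delta2 * np.cos(omega2 * t)
y = delta1 * np.sin(omega1 * t) + delta2 * np.sin(omega2 * t)
# Machining/assembly errors would perturb delta1, delta2 and the rotation axes,
# distorting this ideal trajectory.
```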

We propose a new paradigm to automatically generate training data with accurate labels at scale using text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object mask generation and background (context) image generation. For foreground generation, we use a simple textual template with the class name as input to DALL-E to generate a diverse set of foreground images. A foreground-background segmentation algorithm is then used to generate foreground object masks. Next, in order to generate context images, first...

10.48550/arxiv.2206.09592 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Oil sheen on the water surface can indicate a source of hydrocarbon in underlying subaquatic sediments. Here, we develop and test the accuracy of an algorithm for automated real-time visual monitoring to detect oil sheen. This detection system is part of an oil sheen screening system (OS-SS) that disturbs sediments and monitors the formation of sheen. We first created a new near-surface oil sheen image dataset. We then used this dataset to develop an image-based Oil Sheen Prediction Neural Network (OS-Net), a classification machine learning model based on convolutional...

10.3390/app12178865 article EN cc-by Applied Sciences 2022-09-03
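
A minimal sketch of a binary sheen / no-sheen image classifier built by fine-tuning a pretrained CNN; this matches the convolutional image-classification setup the abstract describes but is not the published OS-Net architecture.

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)      # classes: sheen, no sheen

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

images = torch.rand(8, 3, 224, 224)                # placeholder near-surface frames
labels = torch.randint(0, 2, (8,))

optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```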

A major challenge in monocular 3D object detection is the limited diversity and quantity of objects in real datasets. While augmenting real scenes with virtual objects holds promise to improve both the diversity and quantity of objects, it remains elusive due to the lack of an effective 3D object insertion method for complex real captured scenes. In this work, we study augmenting complex real indoor scenes with virtual objects for monocular 3D object detection. The main challenge is to automatically identify plausible physical properties for virtual assets (e.g., locations, appearances, sizes, etc.) in cluttered real scenes. To address this challenge, we propose a physically plausible approach...

10.48550/arxiv.2312.05277 preprint EN other-oa arXiv (Cornell University) 2023-01-01
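
As a toy illustration of one ingredient of "plausible" insertion (an assumption, not the paper's method): sample a footprint on the floor plane and reject placements that collide with the footprints of existing objects.

```python
import random

existing = [  # axis-aligned footprints of real objects: (xmin, ymin, xmax, ymax), meters
    (0.0, 0.0, 1.2, 0.8),
    (2.5, 1.0, 3.5, 2.0),
]
room = (0.0, 0.0, 5.0, 4.0)

def overlaps(a, b):
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def sample_placement(size=0.6, tries=100):
    """Return a collision-free footprint for a virtual object, or None."""
    for _ in range(tries):
        x = random.uniform(room[0], room[2] - size)
        y = random.uniform(room[1], room[3] - size)
        box = (x, y, x + size, y + size)
        if not any(overlaps(box, e) for e in existing):
            return box
    return None

placement = sample_placement()
```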