- Domain Adaptation and Few-Shot Learning
- Advanced Neural Network Applications
- Multimodal Machine Learning Applications
- Human Pose and Action Recognition
- Generative Adversarial Networks and Image Synthesis
- Advanced Image and Video Retrieval Techniques
- Cell Image Analysis Techniques
- Machine Learning and Data Classification
- Anomaly Detection Techniques and Applications
- AI in cancer detection
- Robotics and Sensor-Based Localization
- Image Retrieval and Classification Techniques
- Advanced Image Processing Techniques
- Radiomics and Machine Learning in Medical Imaging
- Advanced Graph Neural Networks
- Medical Imaging Techniques and Applications
- Video Analysis and Summarization
- 3D Surveying and Cultural Heritage
- Advanced X-ray and CT Imaging
- Natural Language Processing Techniques
- Image Processing Techniques and Applications
- COVID-19 diagnosis using AI
- Adversarial Robustness in Machine Learning
- Explainable Artificial Intelligence (XAI)
- Medical Image Segmentation Techniques
Stanford University
2024
University of Southern California
2020-2024
Southern California University for Professional Studies
2020-2024
Nvidia (United States)
2024
Google (United States)
2023
Shanghai Jiao Tong University
2018-2019
United Imaging Healthcare (China)
2019
People's Hospital of Jilin Province
2018
Physical AI needs to be trained digitally first. It needs a digital twin of itself, the policy model, and a digital twin of the world, the world model. In this paper, we present the Cosmos World Foundation Model Platform to help developers build customized world models for their Physical AI setups. We position a world foundation model as a general-purpose world model that can be fine-tuned into customized world models for downstream applications. Our platform covers a video curation pipeline, pre-trained world foundation models, examples of post-training of pre-trained world foundation models, and video tokenizers. To help Physical AI builders solve the most critical problems of our society,...
Despite substantial progress in applying neural networks (NN) to a wide variety of areas, they still largely suffer from a lack of transparency and interpretability. While recent developments in explainable artificial intelligence attempt to bridge this gap (e.g., by visualizing the correlation between input pixels and final outputs), these approaches are limited to explaining low-level relationships and, crucially, do not provide insights on error correction. In this work, we propose a framework (VRX) to interpret...
In medical imaging such as PET-MR attenuation correction and MRI-guided radiation therapy, synthesizing CT images from MR images plays an important role in obtaining tissue density properties. Recently, deep-learning-based image synthesis techniques have attracted much attention because of their superior ability for image mapping. However, most of the current methods require large scales of paired data, which greatly limits their usage. Efforts have been made to relax such a restriction, cycle-consistent adversarial networks...
Multi-modal image-text models such as CLIP and LiT have demonstrated impressive performance on image classification benchmarks, and their zero-shot generalization ability is particularly exciting. While the top-5 zero-shot accuracies of these models are very high, the top-1 accuracies are much lower (over 25% gap in some cases). We investigate the reasons for this performance gap and find that many of the failure cases are caused by ambiguity in the text prompts. First, we develop a simple and efficient post-hoc method to identify images whose top-1 prediction is likely to be incorrect,...
Computerized breast cancer diagnosis systems have played an important role in early cancer diagnosis. For this purpose, we apply deep learning by using convolutional neural networks (CNN) to classify abnormalities, benign or malignant, in mammographic images based on the mini Mammographic Image Analysis Society (mini-MIAS) database. Accuracy, sensitivity, and specificity values are observed to evaluate the performance of the CNN. To improve performance, we utilize image-preprocessing methods containing cropping, global...
MR to CT image synthesis plays an important role in medical image analysis, and its applications include, but are not limited to, PET-MR attenuation correction and MR-only radiation therapy planning. Recently, deep learning-based image synthesis techniques have achieved much success. However, most of the current methods require large scales of paired data from two different modalities, which greatly limits their usage, as in some situations paired data is infeasible to obtain. Some efforts have been proposed to relax this constraint, such as cycle-consistent...
In this paper, a deep learning computer aided diagnosis system (CADs) is proposed for automatic segmentation and classification of melanoma lesions, containing a fully convolutional neural network (FCN) and a specific convolutional neural network (CNN). The FCN, which consists of a 28-layer structure, is designed for segmentation, with a mask of the region of interest (ROI) as its output. Later, the CNN only uses the segmented ROI of the raw image to extract features, while DLCM statistical and contrast location features extracted from the same ROI are merged into the CNN features. Finally, the combined...
The human brain is the gold standard of adaptive learning. It not only can learn and benefit from experience, but also can adapt to new situations. In contrast, deep neural networks learn one sophisticated but fixed mapping from inputs to outputs. This limits their applicability to more dynamic situations, where the input to output mapping may change with different contexts. A salient example is continual learning: learning new independent tasks sequentially without forgetting previous tasks. Continual learning of multiple tasks in artificial...
In Lifelong Learning (LL), agents continually learn as they encounter new conditions and tasks. Most current LL is limited to a single agent that learns tasks sequentially. Dedicated LL machinery is then deployed to mitigate the forgetting of old tasks as new tasks are learned. This is inherently slow. We propose a new Shared Knowledge Lifelong Learning (SKILL) challenge, which deploys a decentralized population of LL agents that each sequentially learn different tasks, with all agents operating independently and in parallel. After learning their respective tasks, agents share and consolidate...
Continual learning aims to emulate the human ability to continually accumulate knowledge over sequential tasks. The main challenge is to maintain performance on previously learned tasks after learning new tasks, i.e., to avoid catastrophic forgetting. We propose a Channel-wise Lightweight Reprogramming (CLR) approach that helps convolutional neural networks (CNNs) overcome catastrophic forgetting during continual learning. We show that a CNN model trained on an old task (or self-supervised proxy task) could be "reprogrammed" to solve a new task by...
Unlike static gesture recognition, a novel real-time gesture prediction system in this study can judge the intention of hand motion and predict the exact final gesture before the end of the movement. Flex sensors are used to measure comprehensive hand data in a data glove and are positioned based on the biological muscle distribution characteristics of the hand. Position, velocity, and acceleration information is extracted from the raw data, while adjacent finger-coupling features are also obtained by processing the position information. After processing such as windowing and filtering,...
We focus on controllable disentangled representation learning (C-Dis-RL), where users can control the partition of the disentangled latent space to factorize dataset attributes (concepts) for downstream tasks. Two general problems remain under-explored in current methods: (1) They lack comprehensive disentanglement constraints, especially missing the minimization of mutual information between different attributes across latent and observation domains. (2) They lack convexity constraints, which is important for meaningfully manipulating specific attributes for downstream tasks. To encourage...
We propose a new paradigm to automatically generate training data with accurate labels at scale using the text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object generation and contextually coherent background generation. To generate foreground objects, we employ a straightforward textual template, incorporating the object class name as input prompts. This is fed into a text-to-image synthesis framework, producing various foreground images set against isolated backgrounds. A...
Compared with the traditional mechanical beam deflector in a beam-scanning system, the dual-wedge scanning system has several advantages, for example, compact structure, fast scanning speed, and low power consumption. High accuracy is the most important factor in scanning, but errors caused by machining or assembly adversely affect this accuracy. Horizontal angular errors appear between the incident light and the central optical axes. By building a mathematical model of an ideal scanning trajectory and of the trajectory affected by errors, this paper analyzes the types...
We propose a new paradigm to automatically generate training data with accurate labels at scale using the text-to-image synthesis frameworks (e.g., DALL-E, Stable Diffusion, etc.). The proposed approach decouples training data generation into foreground object mask generation and background (context) image generation. For foreground object mask generation, we use a simple textual template with the object class name as input to DALL-E to generate a diverse set of foreground images. A foreground-background segmentation algorithm is then used to generate foreground object masks. Next, in order to generate context images, first...
Oil sheen on the water surface can indicate a source of hydrocarbon in underlying subaquatic sediments. Here, we develop and test the accuracy of an algorithm for automated real-time visual monitoring of the water surface for detecting oil sheen. This detection system is part of an oil sheen screening system (OS-SS) that disturbs subaquatic sediments and monitors for the formation of oil sheen. We first created a new near-surface oil sheen image dataset. We then used this dataset to develop an image-based Oil Sheen Prediction Neural Network (OS-Net), a classification machine learning model based on convolutional...
A major challenge in monocular 3D object detection is the limited diversity and quantity of objects in real datasets. While augmenting real scenes with virtual objects holds promise to improve both the diversity and quantity of the objects, it remains elusive due to the lack of an effective 3D object insertion method in complex real captured scenes. In this work, we study augmenting complex real indoor scenes with virtual objects for monocular 3D object detection. The main challenge is to automatically identify plausible physical properties for virtual assets (e.g., locations, appearances, sizes, etc.) in cluttered real scenes. To address this challenge, we propose a physically plausible approach...