- Generative Adversarial Networks and Image Synthesis
- Epigenetics and DNA Methylation
- Multimodal Machine Learning Applications
- Domain Adaptation and Few-Shot Learning
- Natural Language Processing Techniques
- Acute Myeloid Leukemia Research
- Genomics and Chromatin Dynamics
- Advanced Neural Network Applications
- Topic Modeling
- CRISPR and Genetic Engineering
- Acute Lymphoblastic Leukemia Research
- Adversarial Robustness in Machine Learning
- Pluripotent Stem Cells Research
- Anomaly Detection Techniques and Applications
- Protein Degradation and Inhibitors
- RNA Research and Splicing
- RNA Modifications and Cancer
- Machine Learning and Data Classification
- 3D Shape Modeling and Analysis
- Digital Media Forensic Detection
- Advanced Image and Video Retrieval Techniques
- Gaussian Processes and Bayesian Inference
- Chronic Myeloid Leukemia Treatments
- Data-Driven Disease Surveillance
- Music and Audio Processing
St. Jude Children's Research Hospital
2016-2025
Qinghai University
2024
Qinghai Provincial People's Hospital
2018-2024
Google (United States)
2020-2023
Carnegie Mellon University
2015-2021
Soochow University
2021
Shandong Normal University
2020
Shandong Jianzhu University
2020
University of South Australia
2019
Microsoft Research (United Kingdom)
2018
Semi-supervised learning (SSL) provides an effective means of leveraging unlabeled data to improve a model's performance. In this paper, we demonstrate the power of a simple combination of two common SSL methods: consistency regularization and pseudo-labeling. Our algorithm, FixMatch, first generates pseudo-labels using the model's predictions on weakly-augmented images. For a given image, the pseudo-label is only retained if the model produces a high-confidence prediction. The model is then trained to predict the pseudo-label when fed...
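The pseudo-label retention step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `threshold` default of 0.95, and the plain-list representation of class probabilities are all assumptions made for the example. The abstract is truncated before it names the second input to the model, so the sketch simply takes a second set of predictions (`other_probs`) as given.

```python
import math


def pseudo_label(probs, threshold=0.95):
    """Return the arg-max class if its probability reaches the threshold, else None.

    `threshold` is an illustrative value, not taken from the abstract.
    """
    confidence = max(probs)
    if confidence < threshold:
        return None  # low-confidence prediction: pseudo-label is discarded
    return probs.index(confidence)


def masked_cross_entropy(weak_probs, other_probs, threshold=0.95):
    """Average cross-entropy against retained pseudo-labels.

    weak_probs:  per-image class probabilities from the weakly-augmented view.
    other_probs: per-image class probabilities from the second view
                 (hypothetical input; the abstract is cut off before naming it).
    Discarded images contribute zero but still count in the batch average.
    """
    total = 0.0
    for weak, other in zip(weak_probs, other_probs):
        label = pseudo_label(weak, threshold)
        if label is not None:
            # Clamp to avoid log(0) on degenerate predictions.
            total += -math.log(max(other[label], 1e-12))
    return total / len(weak_probs)
```

With a batch of two images where only the first is predicted confidently, only that image contributes to the loss; the second is masked out.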
We aim at constructing a high performance model for defect detection that detects unknown anomalous patterns of an image without anomalous data. To this end, we propose a two-stage framework for building anomaly detectors using normal training data only. We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations. We learn representations by classifying normal data from the CutPaste, a simple data augmentation strategy that cuts an image patch and pastes it at a random location of a large image. Our empirical study...
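The cut-and-paste augmentation mentioned in this abstract might look something like the sketch below. It is only an illustration under stated assumptions: the image is a plain list of pixel rows, the patch size is fixed by the caller, and the name `cutpaste` and its parameters are hypothetical rather than taken from the authors' code.

```python
import random


def cutpaste(image, patch_h, patch_w, rng=None):
    """Copy a random rectangular patch and paste it at another random location.

    image: list of rows, each a list of pixel values (assumed representation).
    Returns a new image; the input is left unchanged.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    if not (0 < patch_h <= height and 0 < patch_w <= width):
        raise ValueError("patch must fit inside the image")

    # Top-left corners of the source patch and of the paste target.
    src_r = rng.randint(0, height - patch_h)
    src_c = rng.randint(0, width - patch_w)
    dst_r = rng.randint(0, height - patch_h)
    dst_c = rng.randint(0, width - patch_w)

    patch = [row[src_c:src_c + patch_w] for row in image[src_r:src_r + patch_h]]

    out = [list(row) for row in image]  # copy so the original stays intact
    for i, patch_row in enumerate(patch):
        out[dst_r + i][dst_c:dst_c + patch_w] = patch_row
    return out
```

A classifier trained to tell `image` from `cutpaste(image, ...)` would then supply the self-supervised signal the abstract refers to; that training loop is outside the scope of this sketch.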
While deep learning methods have achieved state-of-the-art performance in many challenging inverse problems like image inpainting and super-resolution, they invariably involve problem-specific training of the networks. Under this approach, each inverse problem requires its own dedicated network. In scenarios where we need to solve a wide variety of problems, e.g., on a mobile camera, it is inefficient and expensive to use these problem-specific networks. On the other hand, traditional methods using analytic signal priors can be used to solve any linear...
Semi-supervised learning (SSL) has the potential to improve the predictive performance of machine learning models using unlabeled data. Although there has been remarkable recent progress, the scope of demonstration in SSL has mainly been on image classification tasks. In this paper, we propose STAC, a simple yet effective SSL framework for visual object detection along with a data augmentation strategy. STAC deploys highly confident pseudo labels of localized objects from an unlabeled image and updates the model by enforcing consistency via strong...
Generative moment matching network (GMMN) is a deep generative model that differs from Generative Adversarial Network (GAN) by replacing the discriminator in GAN with a two-sample test based on kernel maximum mean discrepancy (MMD). Although some theoretical guarantees of MMD have been studied, the empirical performance of GMMN is still not as competitive as that of GAN on challenging and large benchmark datasets. The computational efficiency of GMMN is also less desirable in comparison with GAN, partially due to its requirement for a rather large batch size...
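For readers unfamiliar with the kernel MMD statistic this abstract refers to, a minimal sketch of the standard biased estimate of squared MMD is given below. The Gaussian kernel, the `bandwidth` default, and the function names are assumptions chosen for illustration; the abstract does not specify a kernel or an estimator.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between samples xs and ys.

    Computes mean k(x, x') + mean k(y, y') - 2 * mean k(x, y),
    with diagonal terms included in the within-sample means.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)
```

The double loop is quadratic in the sample size, which gives some intuition for why batch size matters to the cost of MMD-based training.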
Cheng-Yu Hsieh, Chun-Liang Li, Chih-kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alex Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister. Findings of the Association for Computational Linguistics: ACL 2023.
Real-world time-series datasets are often multivariate with complex dynamics. To capture this complexity, high capacity architectures like recurrent- or attention-based sequential deep learning models have become popular. However, recent work demonstrates that simple univariate linear models can outperform such deep learning models on several commonly used academic benchmarks. Extending them, in this paper, we investigate the capabilities of linear models for time-series forecasting and present Time-Series Mixer (TSMixer), a novel architecture...
Direct reprogramming of human somatic cells into pluripotency has broad implications in generating patient-specific induced pluripotent stem (iPS) cells for disease modeling and cellular replacement therapies. However, the low efficiency and safety issues associated with the generation of iPS cells have limited their usage in clinical settings. Cell types can significantly influence reprogramming kinetics. To date, iPS cells have been obtained only from a few cell types. Here, we report for the first time the rapid and efficient generation of iPS cells from human amniotic fluid-derived cells (hAFDCs)...
Galaxy-scale strong gravitational lensing is not only a valuable probe of the dark matter distribution of massive galaxies, but can also provide cosmological constraints, either by studying the population of strong lenses or by measuring time delays in lensed quasars. Due to the rarity of galaxy-scale strongly lensed systems, fast and reliable automated lens finding methods will be essential in the era of large surveys such as LSST, Euclid, and WFIRST. To tackle this challenge, we introduce CMU DeepLens, a new fully automated galaxy-galaxy lens finding method...
Numerous pieces of evidence support that the complex, 3D spatial organization of the genome dictates gene expression. CTCF is essential to define topologically associated domain boundaries and facilitate the formation of insulated chromatin loop structures. To understand CTCF's direct role in global transcriptional regulation, we integrated the miniAID-mClover3 cassette into the endogenous CTCF locus in a human pediatric B-ALL cell line, SEM, and an immortal erythroid precursor cell line, HUDEP-2, to allow for acute depletion of CTCF protein by...
Human explanations of high-level decisions are often expressed in terms of the key concepts the decisions are based on. In this paper, we study such concept-based explainability for Deep Neural Networks (DNNs). First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is in explaining a model's prediction behavior, based on the assumption that complete concept scores are sufficient statistics of the model prediction. Next, we propose a concept discovery method that aims to infer a complete set of concepts that are additionally encouraged to be interpretable, which addresses the limitations...
We present a two-stage framework for deep one-class classification. We first learn self-supervised representations from one-class data, and then build one-class classifiers on learned representations. The framework not only allows us to learn better representations, but also permits building one-class classifiers that are faithful to the target task. We argue that classifiers inspired by the statistical perspective in generative or discriminative models are more effective than existing approaches, such as a normality score from a surrogate classifier. We thoroughly evaluate different...
We established a genome-wide compendium of somatic mutation events in 3949 whole cancer genomes representing 19 tumor types. Protein-coding events captured well-established drivers. Noncoding events near tissue-specific genes, such as ALB in the liver or KLK3 in the prostate, characterized localized passenger mutation patterns and may reflect tumor-cell-of-origin imprinting. Noncoding events in regulatory promoter and enhancer regions frequently involved cancer-relevant genes such as BCL6, FGFR2, RAD51B, SMC6, TERT, and XBP1 and represent possible drivers. Unlike most...
Chen-Yu Lee, Chun-Liang Li, Timothy Dozat, Vincent Perot, Guolong Su, Nan Hua, Joshua Ainslie, Renshen Wang, Yasuhisa Fujii, Tomas Pfister. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.
In Composed Image Retrieval (CIR), a user combines a query image with text to describe their intended target. Existing methods rely on supervised learning of CIR models using labeled triplets consisting of the query image, the text specification, and the target image. Labeling such triplets is expensive and hinders broad applicability of CIR. In this work, we propose to study an important task, Zero-Shot Composed Image Retrieval (ZS-CIR), whose goal is to build a CIR model without requiring labeled triplets for training. To this end, we propose a novel method, called Pic2Word, that requires only weakly...
Although self-/un-supervised methods have led to rapid progress in visual representation learning, these methods generally treat objects and scenes using the same lens. In this paper, we focus on learning representations for objects and scenes that preserve the structure among them. Motivated by the observation that visually similar objects are close in the representation space, we argue that the scenes and objects should instead follow a hierarchical structure based on their compositionality. To exploit such a structure, we propose a contrastive learning framework where a Euclidean loss is used to learn object representations and a hyperbolic...
Summary. Objective: Genetic alterations in four oncogenes, namely RAS point mutations, RET rearrangements (RET/PTC), NTRK1 rearrangements (TRK) and BRAF mutations, have been identified in human papillary thyroid carcinomas (PTCs). These oncogenes act along the RET/PTC(TRK)–RAS–BRAF–MEK–MAPK kinase pathway, mediating a number of cellular fates including growth, proliferation and survival of cells. In this study, we analysed a cohort of PTCs. Methods: To screen for mutations, genomic DNA from 105 PTCs was amplified by polymerase chain reaction...
State-of-the-art pedestrian detection models have achieved great success in many benchmarks. However, these models require a lot of annotation information, and the labeling process usually takes much time and effort. In this paper, we propose a method to generate labeled pedestrian data and adapt them to support the training of pedestrian detectors. The proposed framework is built on the Generative Adversarial Network (GAN) with multiple discriminators, trying to synthesize realistic pedestrians and learn the background context simultaneously. To handle...