- Single-cell and spatial transcriptomics
- Immune cells in cancer
- Topic Modeling
- Bioinformatics and Genomic Networks
- RNA and protein synthesis mechanisms
- Natural Language Processing Techniques
- Gene expression and cancer classification
- Molecular Biology Techniques and Applications
- Telomeres, Telomerase, and Senescence
- Adversarial Robustness in Machine Learning
- RNA modifications and cancer
- Phytochemicals and Antioxidant Activities
- Anomaly Detection Techniques and Applications
- Food Quality and Safety Studies
- Genetics, Aging, and Longevity in Model Organisms
- Computational Drug Discovery Methods
- Energy Load and Power Forecasting
- Tea Polyphenols and Effects
- RNA Research and Splicing
- Biomedical Text Mining and Ontologies
- Electrochemical sensors and biosensors
- Brain Tumor Detection and Classification
- Natural product bioactivities and synthesis
- Intelligent Tutoring Systems and Adaptive Learning
- Cardiomyopathy and Myosin Studies
Carnegie Mellon University
2022-2025
University of Illinois Urbana-Champaign
2025
University of Illinois Chicago
2025
Genomics Institute of the Novartis Research Foundation
2023-2024
John Jay College of Criminal Justice
2023-2024
Qingdao Agricultural University
2024
University of California, Riverside
2021-2024
Nanjing Maternity and Child Health Care Hospital
2024
Nanjing Medical University
2024
Tea Research Institute
2024
Abstract Many machine learning applications in bioinformatics currently rely on matching gene identities when analyzing input signatures and fail to take advantage of preexisting knowledge about gene functions. To further enable comparative analysis of OMICS datasets, including target deconvolution and mechanism of action studies, we develop an approach that represents gene signatures projected onto their biological functions, instead of their identities, similar to how the word2vec technique works in natural language processing. We...
Abstract Background Facioscapulohumeral muscular dystrophy (FSHD) is a high-prevalence autosomal dominant neuromuscular disease characterized by significant clinical and genetic heterogeneity. Genetic diagnosis of FSHD remains a challenge because it cannot be detected by standard sequencing methods and requires a complex workflow. Methods We developed a comprehensive detection method based on Oxford Nanopore Technologies (ONT) whole-genome sequencing. Using a case–control design, we applied this procedure...
Despite the rapid growth of the context length of large language models (LLMs), LLMs still perform poorly in long document summarization. An important reason for this is that relevant information about an event is scattered throughout long documents, and the messy narrative order impairs the accurate understanding and utilization of long documents. To address these issues, we propose a novel summary generation framework, called HERA. Specifically, we first segment a long document by its semantic structure and retrieve text segments about the same event,...
Spatial transcriptomics promises to greatly improve our understanding of tissue organization and cell-cell interactions. While most current platforms for spatial transcriptomics only offer multi-cellular resolution, with 10-15 cells per spot, recent technologies provide a much denser spot placement leading to sub-cellular resolution. A key challenge for these newer methods is cell segmentation and the assignment of spots to cells. Traditional image-based segmentation methods are limited and do not make full use of the information profiled by...
Alternative splicing plays a crucial role in protein diversity and gene expression regulation in higher eukaryotes, and mutations causing dysregulated splicing underlie a range of genetic diseases. Computational prediction of alternative splicing from genomic sequences not only provides insight into gene-regulatory mechanisms but also helps identify disease-causing mutations and drug targets. However, the current methods for the quantitative prediction of splice site usage still have limited accuracy. Here, we present DeltaSplice, a deep neural network...
Recent advances in event-based research prioritize sparsity and temporal precision. Approaches learning sparse point-based representations through graph CNNs (GCN) have become more popular. Yet, these techniques hold lower performance than their frame-based counterparts due to two issues: (i) Biased structures that don't properly incorporate varied attributes (such as semantics, spatial signals) for each vertex, resulting in inaccurate representations. (ii) A shortage of robust pretrained models....
Sooty mold (SM), caused by Cladosporium species, is a pervasive threat to tea plant health, affecting both canopy structure and crop yield. Despite its significance, understanding of the complex interplay between defense genes and metabolites in tea plants across various SM-infected canopy layers remains limited. Our study employed hyperspectral imaging, transcriptomic profiling, and metabolomic analysis to decipher the intricate mechanisms underlying the tea plant's response to SM infection. Hyperspectral imaging identified three critical...
Abstract Many machine learning applications in bioinformatics currently rely on gene identities extracted from input gene signatures, and fail to take advantage of preexisting knowledge about gene functions. We developed the Functional Representation of Gene Signatures (FRoGS) approach by training a deep learning model. FRoGS represents gene signatures projected onto their biological functions, instead of their identities, similar to how the word2vec technique works in natural language processing. We demonstrated that its application to L1000...
Abstract Annotating the functions of gene products is a mainstay in biology. A variety of databases have been established to record functional knowledge at the gene level. However, functional annotations at the isoform resolution are in great demand in many biological applications. Although critical information in biological processes such as protein–protein interactions (PPIs) is often used to study gene functions, it does not directly help differentiate the functions of isoforms, as the ‘proteins’ in the existing PPIs generally refer to ‘genes’. On the other hand, the prediction and...
Abstract Alternative splicing plays a crucial role in protein diversity and gene expression regulation in higher eukaryotes, and mutations causing dysregulated splicing underlie a range of genetic diseases. Computational prediction of alternative splicing from genomic sequences not only provides insight into gene-regulatory mechanisms but also helps identify disease-causing mutations and drug targets. However, the current methods for the quantitative prediction of splice site usage still have limited accuracy. Here, we present DeltaSplice, a deep neural...
Abstract Background Sooty mold (SM) is one of the most destructive diseases of tea plants, causing considerable damage and productivity losses. However, the roles of defense genes and metabolites in different SM-infected canopy layers of tea plants remain largely unclear. To investigate the immune mechanisms of tea plants, we utilized hyperspectral, transcriptomic, and metabolomic data from leaves of three canopy layers infected by SM (A1, A2, and A3). Results The hyperspectral analysis indicated that spectral...
Pre-trained vision-language models (VLMs) have showcased remarkable performance in image and natural language understanding, such as image captioning and response generation. As the practical applications of vision-language models become increasingly widespread, their potential safety and robustness issues raise concerns that adversaries may evade the system and cause these models to generate toxic content through malicious attacks. Therefore, evaluating open-source VLMs against adversarial attacks has garnered growing attention, with...