Hao Chen

ORCID: 0000-0002-3424-835X
Research Areas
  • Single-cell and spatial transcriptomics
  • Immune cells in cancer
  • Topic Modeling
  • Bioinformatics and Genomic Networks
  • RNA and protein synthesis mechanisms
  • Natural Language Processing Techniques
  • Gene expression and cancer classification
  • Molecular Biology Techniques and Applications
  • Telomeres, Telomerase, and Senescence
  • Adversarial Robustness in Machine Learning
  • RNA modifications and cancer
  • Phytochemicals and Antioxidant Activities
  • Anomaly Detection Techniques and Applications
  • Food Quality and Safety Studies
  • Genetics, Aging, and Longevity in Model Organisms
  • Computational Drug Discovery Methods
  • Energy Load and Power Forecasting
  • Tea Polyphenols and Effects
  • RNA Research and Splicing
  • Biomedical Text Mining and Ontologies
  • Electrochemical sensors and biosensors
  • Brain Tumor Detection and Classification
  • Natural product bioactivities and synthesis
  • Intelligent Tutoring Systems and Adaptive Learning
  • Cardiomyopathy and Myosin Studies

Carnegie Mellon University
2022-2025

University of Illinois Urbana-Champaign
2025

University of Illinois Chicago
2025

Genomics Institute of the Novartis Research Foundation
2023-2024

John Jay College of Criminal Justice
2023-2024

Qingdao Agricultural University
2024

University of California, Riverside
2021-2024

Nanjing Maternity and Child Health Care Hospital
2024

Nanjing Medical University
2024

Tea Research Institute
2024

10.1038/s43587-022-00326-5 article EN Nature Aging 2022-12-20

Abstract Many machine learning applications in bioinformatics currently rely on matching gene identities when analyzing input signatures and fail to take advantage of preexisting knowledge about gene functions. To further enable comparative analysis of OMICS datasets, including target deconvolution and mechanism of action studies, we develop an approach that represents gene signatures projected onto their biological functions, instead of their identities, similar to how the word2vec technique works in natural language processing. We...

10.1038/s41467-024-46089-y article EN cc-by Nature Communications 2024-02-29

10.1109/cvprw63382.2024.00162 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2024-06-17

Abstract Background Facioscapulohumeral muscular dystrophy (FSHD) is a high-prevalence autosomal dominant neuromuscular disease characterized by significant clinical and genetic heterogeneity. Genetic diagnosis of FSHD remains a challenge because it cannot be detected by standard sequencing methods and requires a complex workflow. Methods We developed a comprehensive detection method based on Oxford Nanopore Technologies (ONT) whole-genome sequencing. Using a case–control design, we applied this procedure...

10.1186/s12967-024-05259-8 article EN cc-by Journal of Translational Medicine 2024-05-13

Despite the rapid growth of the context length of large language models (LLMs), LLMs still perform poorly in long document summarization. An important reason for this is that relevant information about an event is scattered throughout documents, and the messy narrative order impairs accurate understanding and utilization of documents. To address these issues, we propose a novel summary generation framework, called HERA. Specifically, we first segment the document by its semantic structure and retrieve text segments about the same event,...

10.48550/arxiv.2502.00448 preprint EN arXiv (Cornell University) 2025-02-01

Spatial transcriptomics promises to greatly improve our understanding of tissue organization and cell-cell interactions. While most current platforms for spatial transcriptomics only offer multi-cellular resolution, with 10-15 cells per spot, recent technologies provide a much denser spot placement, leading to sub-cellular resolution. A key challenge for these newer methods is cell segmentation and the assignment of spots to cells. Traditional image-based segmentation methods are limited and do not make full use of the information profiled by...

10.1101/2023.01.11.523658 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2023-01-15

Alternative splicing plays a crucial role in protein diversity and gene expression regulation in higher eukaryotes, and mutations causing dysregulated splicing underlie a range of genetic diseases. Computational prediction of alternative splicing from genomic sequences not only provides insight into gene-regulatory mechanisms but also helps identify disease-causing mutations and drug targets. However, the current methods for quantitative prediction of splice site usage still have limited accuracy. Here, we present DeltaSplice, a deep neural network...

10.1101/gr.279044.124 article EN cc-by-nc Genome Research 2024-07-01

Recent advances in event-based research prioritize sparsity and temporal precision. Approaches that learn sparse point-based representations through graph CNNs (GCN) have become more popular. Yet, these techniques perform worse than their frame-based counterparts due to two issues: (i) Biased structures that don't properly incorporate varied attributes (such as semantics, spatial signals) for each vertex, resulting in inaccurate representations. (ii) A shortage of robust pretrained models....

10.1609/aaai.v38i2.27914 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

Sooty mold (SM), caused by Cladosporium species, is a pervasive threat to tea plant health, affecting both canopy structure and crop yield. Despite its significance, understanding of the complex interplay between defense genes and metabolites in tea plants across various SM-infected canopy layers remains limited. Our study employed hyperspectral imaging, transcriptomic profiling, and metabolomic analysis to decipher the intricate mechanisms underlying the tea plant's response to SM infection. Hyperspectral imaging identified three critical...

10.1186/s12870-024-05806-x article EN cc-by-nc-nd BMC Plant Biology 2024-11-15

Abstract Many machine learning applications in bioinformatics currently rely on gene identities extracted from input gene signatures, and fail to take advantage of preexisting knowledge about gene functions. We developed the Functional Representation of Gene Signatures (FRoGS) approach by training a deep learning model. FRoGS represents gene signatures projected onto their biological functions, instead of their identities, similar to how the word2vec technique works in natural language processing. We demonstrated that its application to L1000...

10.21203/rs.3.rs-3371688/v1 preprint EN cc-by Research Square (Research Square) 2023-09-28

Abstract Annotating the functions of gene products is a mainstay in biology. A variety of databases have been established to record functional knowledge at the gene level. However, functional annotations at the isoform resolution are in great demand in many biological applications. Although critical information in biological processes such as protein–protein interactions (PPIs) is often used to study gene functions, it does not directly help differentiate the functions of isoforms, as the ‘proteins’ in the existing PPIs generally refer to ‘genes’. On the other hand, the prediction and...

10.1093/nargab/lqab057 article EN cc-by-nc NAR Genomics and Bioinformatics 2021-04-09

Abstract Alternative splicing plays a crucial role in protein diversity and gene expression regulation in higher eukaryotes, and mutations causing dysregulated splicing underlie a range of genetic diseases. Computational prediction of alternative splicing from genomic sequences not only provides insight into gene-regulatory mechanisms but also helps identify disease-causing mutations and drug targets. However, the current methods for quantitative prediction of splice site usage still have limited accuracy. Here, we present DeltaSplice, a deep neural...

10.1101/2024.03.22.586363 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2024-03-25

Abstract Background Sooty mold (SM) is one of the most destructive diseases of tea plants, causing considerable damage and productivity losses. However, the roles of defense genes and metabolites in different SM-infected canopy layers of tea plants remain largely unclear. To investigate the immune mechanisms, we utilized hyperspectral, transcriptomic, and metabolomic data from leaves of three canopy layers infected by SM (A1, A2, A3). Results The hyperspectral analysis indicated that spectral...

10.21203/rs.3.rs-5075569/v1 preprint EN cc-by Research Square (Research Square) 2024-10-31

10.18653/v1/2024.emnlp-main.333 article EN Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024-01-01

Pre-trained vision-language models (VLMs) have showcased remarkable performance in image and natural language understanding, such as image captioning and response generation. As the practical applications of VLMs become increasingly widespread, their potential safety and robustness issues raise concerns that adversaries may evade the system and cause these models to generate toxic content through malicious attacks. Therefore, evaluating the robustness of open-source VLMs against adversarial attacks has garnered growing attention, with...

10.48550/arxiv.2411.15720 preprint EN arXiv (Cornell University) 2024-11-24