Yuanchun Zhou

ORCID: 0000-0003-2144-1131
Research Areas
  • Topic Modeling
  • Advanced Graph Neural Networks
  • Data Management and Algorithms
  • Scientific Computing and Data Management
  • Recommender Systems and Techniques
  • Domain Adaptation and Few-Shot Learning
  • Distributed and Parallel Computing Systems
  • Biomedical Text Mining and Ontologies
  • Human Mobility and Location-Based Analysis
  • Text and Document Classification Technologies
  • Data-Driven Disease Surveillance
  • Single-cell and spatial transcriptomics
  • Bioinformatics and Genomic Networks
  • Advanced Image Processing Techniques
  • Machine Learning in Bioinformatics
  • Gene expression and cancer classification
  • Genomics and Phylogenetic Studies
  • Cloud Computing and Resource Management
  • Advanced Text Analysis Techniques
  • Advanced Image Fusion Techniques
  • Complex Network Analysis Techniques
  • Graph Theory and Algorithms
  • Species Distribution and Climate Change
  • Machine Learning and Data Classification
  • Web Data Mining and Analysis

Computer Network Information Center
2016-2025

Chinese Academy of Sciences
2016-2025

University of Chinese Academy of Sciences
2019-2025

Institute for Advanced Study
2024

Academia Sinica
2024

University of Science and Technology of China
1987-2023

Beijing Institute of Big Data Research
2019-2020

Nanjing University of Finance and Economics
2019

State Key Laboratory of Pollution Control and Resource Reuse
2016-2019

Nanjing University
2016-2019

Background Qinghai Lake in central China has been at the center of debate on whether wild birds play a role in the circulation of highly pathogenic avian influenza virus H5N1. In 2005, an unprecedented epizootic killed more than 6000 migratory birds, including over 3000 bar-headed geese (Anser indicus). H5N1 subsequently spread to Europe and Africa, and in following years re-emerged along the Central Asia flyway several times. Methodology/Principal Findings To better understand the potential involvement of wild birds in H5N1 circulation, we studied...

10.1371/journal.pone.0017622 article EN cc-by PLoS ONE 2011-03-09

Abstract Taxonomic and functional research of microorganisms has increasingly relied upon genome-based data and methods. As the depository of the Global Catalogue of Microorganisms (GCM) 10K prokaryotic type strain sequencing project, the Global Catalogue of Type Strain (gcType) has published 1049 genomes sequenced by the GCM project, which are preserved in global culture collections with a valid status. Additionally, the information provided through gcType includes >12 000 publicly available genome sequences from GenBank incorporated...

10.1093/nar/gkaa957 article EN cc-by Nucleic Acids Research 2020-10-28

The distribution shift in Time Series Forecasting (TSF), indicating that the series distribution changes over time, largely hinders the performance of TSF models. Existing works on distribution shift in time series are mostly limited to the quantification of distribution and, more importantly, overlook the potential shift between lookback and horizon windows. To address the above challenges, we systematically summarize the distribution shift in TSF into two categories. Regarding lookback windows as the input-space and horizon windows as the output-space, there exist (i) intra-space shift, in which the distribution within the input-space keeps shifting over time, and (ii) inter-space shift, in which the distribution is...

10.1609/aaai.v37i6.25914 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26

Abstract Deciphering the universal gene regulatory mechanisms in diverse organisms holds great potential to advance our knowledge of fundamental life process and facilitate research on clinical applications. However, the traditional paradigm primarily focuses on individual model organisms, resulting in limited collection and integration of complex features of various cell types across species. Recent breakthroughs in single-cell sequencing and advancements in deep learning techniques present an unprecedented opportunity...

10.1101/2023.09.26.559542 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2023-09-28

Abstract Deciphering universal gene regulatory mechanisms in diverse organisms holds great potential for advancing our knowledge of fundamental life processes and facilitating clinical applications. However, the traditional research paradigm primarily focuses on individual model organisms and does not integrate various cell types across species. Recent breakthroughs in single-cell sequencing and deep learning techniques present an unprecedented opportunity to address this challenge. In this study, we built an extensive...

10.1038/s41422-024-01034-y article EN cc-by Cell Research 2024-10-08

The field of catalysis holds paramount importance in shaping the trajectory of sustainable development, prompting intensive research efforts to leverage artificial intelligence (AI) in catalyst design. Presently, the fine-tuning of open-source large language models (LLMs) has yielded significant breakthroughs across various domains such as biology and healthcare. Drawing inspiration from these advancements, we introduce CataLM (Catalytic Language Model), a model tailored to the domain of electrocatalytic...

10.1007/s13042-024-02473-0 article EN cc-by-nc-nd International Journal of Machine Learning and Cybernetics 2025-01-15

Data augmentation is a series of techniques that generate high-quality artificial data by manipulating existing samples. By leveraging data augmentation techniques, AI models can achieve significantly improved applicability in tasks involving scarce or imbalanced datasets, thereby substantially enhancing the models' generalization capabilities. Existing literature surveys only focus on a certain type of specific modality data, and categorize these methods from modality-specific and operation-centric perspectives, which...

10.48550/arxiv.2405.09591 preprint EN arXiv (Cornell University) 2024-05-15

Single Image Super-resolution (SISR) produces high-resolution images with fine spatial resolutions from a remotely sensed image with low resolution. Recently, deep learning and generative adversarial networks (GANs) have made breakthroughs for the challenging task of single image super-resolution (SISR). However, the generated image still suffers from undesirable artifacts such as the absence of texture-feature representation and high-frequency information. We propose a frequency domain-based spatio-temporal remote...

10.1145/3456726 article EN ACM Transactions on Intelligent Systems and Technology 2021-12-20

Multimodal medical images are widely used by clinicians and physicians to analyze and retrieve complementary information from high-resolution images in a non-invasive manner. Loss of corresponding image resolution adversely affects the overall performance of image interpretation. Deep learning-based single image super resolution (SISR) algorithms have revolutionized the diagnosis framework by continually improving the architectural components and training strategies associated with convolutional neural networks (CNN) on low-resolution images....

10.1109/tcbb.2022.3191387 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022-07-18

The representation of feature space is a crucial environment where data points get vectorized and embedded for upcoming modeling. Thus the efficacy of machine learning (ML) algorithms is closely related to the quality of feature engineering. As one of the most important techniques, feature generation transforms raw data into an optimized feature space conducive to model training and further refines the space. Despite advancements in automated feature engineering and generation, current methodologies often suffer from three fundamental issues: lack of explainability,...

10.48550/arxiv.2406.03505 preprint EN arXiv (Cornell University) 2024-06-04

The objective of topic inference in research proposals is to obtain the most suitable disciplinary division from the discipline system defined by a funding agency. The agency will subsequently find appropriate peer review experts from their database based on this division. Automated topic inference can reduce human errors caused by manual filling, bridge the knowledge gap between funding agencies and project applicants, and improve efficiency. Existing methods focus on modeling this as a hierarchical multi-label classification problem, using...

10.1145/3671149 article EN ACM Transactions on Knowledge Discovery from Data 2024-06-08

Background Rabies is a significant public health problem in China in that it records the second highest case incidence globally. Surveillance data on canine rabies is lacking, and human notifications can be a useful indicator of areas where animal control could be integrated. Previous spatial epidemiological studies lacked adequate resolution to inform targeted decisions. We aimed to describe the spatiotemporal distribution of rabies and model its geographical spread to provide an evidence base for future integrated strategies...

10.1371/journal.pone.0072352 article EN cc-by PLoS ONE 2013-08-26

What is the purpose of a trip? What are the unique human mobility patterns and spatial contexts in or near the pickup and delivery points of trajectories for a specific trip purpose? Many prior studies have modeled urban regions; however, these analytics mainly focus on interpreting the semantic meanings of geographic topics at an aggregate level. Given the lack of information about activities at pick-up and dropoff points, it is challenging to convert these analytics into effective tools for inferring trip purposes. To address this challenge, in this article, we study...

10.1145/3078849 article EN ACM Transactions on Intelligent Systems and Technology 2017-12-11

Machine learning technology is becoming increasingly prevalent in the petroleum industry, especially for reservoir characterization and drilling problems. The aim of this study is to present an alternative way to predict water saturation distribution in reservoirs with a machine learning method. In this study, we utilized Long Short-Term Memory (LSTM) to build a prediction model to forecast the water saturation distribution. The dataset, derived from monitoring and simulating an actual reservoir, was used for training and testing. The data after validated distribution, pressure...

10.3390/en12193597 article EN cc-by Energies 2019-09-20

Low-resolution medical images can seriously interfere with the diagnosis, and poor image quality can lead to loss of detailed information. Therefore, improving image quality and accelerating reconstruction is of particular importance for diagnosis. To solve this problem, we propose a wavelet-based mini-grid network super-resolution (WMSR) method, which is similar to the three-layer hidden-layer-based super-resolution convolutional neural network (SRCNN) method. Due to the amplification characteristics of wavelets, the stationary wavelet transform (SWT) is used...

10.1109/access.2020.2974278 article EN cc-by IEEE Access 2020-01-01

Despite massive research in deep learning, the human activity recognition (HAR) domain still suffers from key challenges in terms of accurate classification and detection. The core idea behind recognizing activities accurately is to assist Internet-of-things (IoT) enabled smart surveillance systems. Thereby, this work is based on the joint use of the discrete wavelet transform (DWT) and a recurrent neural network (RNN) to classify and detect activities accurately. Recent approaches to HAR exploit three-dimensional (3-D) convolutional...

10.1109/tfuzz.2022.3152106 article EN IEEE Transactions on Fuzzy Systems 2022-02-16

The electrocatalytic CO2 reduction process has gained enormous attention for both environmental protection and chemicals production. Thereinto, the design of new electrocatalysts with high activity and selectivity can draw inspiration from the abundant scientific literature. An annotated and verified corpus made from the massive literature can assist the development of natural language processing (NLP) models, which can offer insight to help guide the understanding of these underlying mechanisms. To facilitate data mining in this...

10.1038/s41597-023-02089-z article EN cc-by Scientific Data 2023-03-29

Drug-drug interaction (DDI) prediction can discover potential risks of drug combinations in advance by detecting drug pairs that are likely to interact with each other, sparking an increasing demand for computational methods of DDI prediction. However, existing methods mostly rely on the single-view paradigm, failing to handle the complex features and intricate patterns of DDIs due to the limited expressiveness of a single view. To this end, we propose a Hierarchical Triple-view Contrastive Learning framework for Drug-Drug...

10.1093/bib/bbad324 article EN Briefings in Bioinformatics 2023-09-04

Abstract Here, we present the manually curated Global Catalogue of Pathogens (gcPathogen), an extensive genomic resource designed to facilitate rapid and accurate pathogen analysis, epidemiological exploration and monitoring of antibiotic resistance features and virulence factors. The catalogue seamlessly integrates and analyzes genomic data and associated metadata for human pathogens isolated from infected patients, animal hosts, food and the environment. The pathogen list is supported by evidence from medical or government pathogenic lists...

10.1093/nar/gkad875 article EN cc-by Nucleic Acids Research 2023-10-18

Selecting appropriate values for the configurable knobs of Database Management Systems (DBMS) is crucial to improve performance. But because such complexity has surpassed the abilities of even the best human experts, the database community turns to machine learning (ML)-based automatic tuning systems. However, these systems still incur significant costs or only yield sub-optimal performance, attributable to their overly high reliance on black-box optimization and an oversight of domain knowledge. This paper...

10.1145/3626246.3654739 article EN 2024-05-23

Abstract Object identification within an image captured during rough weather conditions (such as haze or fog) is difficult because of the degradation of the image. Such conditions lead not only to variation in the image's visual effect but also to a disadvantage in post-processing. Furthermore, they cause inconvenience to all types of instruments that rely on optical imaging, such as satellite remote-sensing systems, aerial photo systems, and outdoor monitoring systems. Hence, improvement and restoration of image effects are needed. This research...

10.1049/ipr2.12004 article EN cc-by IET Image Processing 2020-12-04

Class-Incremental Learning (CIL) aims to train a reliable model with streaming data, in which unknown classes emerge sequentially. Different from traditional closed set learning, CIL has two main challenges: (1) Novel class detection. The initial training data only contains incomplete classes, and the streaming test data will accept unknown classes. Therefore, the model needs to not only accurately classify known classes but also effectively detect unknown classes; (2) Model expansion. After the novel classes are detected, the model needs to be updated without re-training using...

10.1109/tkde.2021.3109131 article EN IEEE Transactions on Knowledge and Data Engineering 2021-01-01