- Recommender Systems and Techniques
- Machine Learning in Healthcare
- Model Reduction and Neural Networks
- Advanced Graph Neural Networks
- Topic Modeling
- Gene expression and cancer classification
- Stock Market Forecasting Methods
- Probabilistic and Robust Engineering Design
- Single-cell and spatial transcriptomics
- Time Series Analysis and Forecasting
- Imbalanced Data Classification Techniques
- Artificial Intelligence in Healthcare
- Intelligent Tutoring Systems and Adaptive Learning
- Acoustic Wave Phenomena Research
- Metamaterials and Metasurfaces Applications
- Forecasting Techniques and Applications
- Financial Distress and Bankruptcy Prediction
- Data Management and Algorithms
- Qualitative Comparative Analysis Research
- Human Mobility and Location-Based Analysis
- Membrane Separation Technologies
- Fuel Cells and Related Materials
- Machine Learning in Materials Science
- Explainable Artificial Intelligence (XAI)
- Graph Theory and Algorithms
Carnegie Mellon University
2022-2025
Shandong Institute of Automation
2025
Chinese Academy of Sciences
2025
Tsinghua University
2025
Chinese University of Hong Kong
2025
Shaanxi Normal University
2022
University of Washington
2020-2022
Seattle University
2022
Huazhong University of Science and Technology
2020
Hebei University
2015
Partial differential equations (PDEs) are concise and understandable representations of domain knowledge, which are essential for deepening our understanding of physical processes and predicting future responses. However, the PDEs of many real-world problems are uncertain, which calls for PDE discovery. We propose the symbolic genetic algorithm (SGA-PDE) to discover open-form PDEs directly from data without prior knowledge about the equation structure. SGA-PDE focuses on the representation and optimization of the PDE. Firstly, it uses mathematics...
Sequential recommendation (SR) aims to model users' dynamic preferences from a series of interactions. A pivotal challenge in user modeling for SR lies in the inherent variability of user preferences. An effective SR model is expected to capture both the long-term and short-term preferences exhibited by users, wherein the former can offer a comprehensive understanding of stable interests that impact the latter. To more effectively capture such information, we incorporate a locality inductive bias into the Transformer by amalgamating its global attention...
Next location recommendation is at the core of various location-based applications. Current state-of-the-art models have attempted to solve spatial sparsity with hierarchical gridding and to model temporal relations with explicit time intervals, while some vital questions remain unsolved. Non-adjacent locations and non-consecutive visits provide non-trivial correlations for understanding a user's behavior but were rarely considered. To aggregate all relevant visits from the user trajectory and recall the most plausible...
Machine learning systems are notoriously prone to biased predictions about certain demographic groups, leading to algorithmic fairness issues. Due to privacy concerns and data quality problems, some demographic information may not be available in the training data, and the complex interaction of different demographics can lead to a lot of unknown minority subpopulations, which all limit the applicability of group fairness. Many existing works on fairness without demographics assume a correlation between groups and features. However, we argue that the model gradients...
In quantum mechanics, a norm-squared wave function can be interpreted as the probability density that describes the likelihood of a particle to be measured in a given position or momentum. This statistical property is at the core of the fuzzy structure of the microcosmos. Recently, hybrid neural structures have raised intense attention, resulting in various intelligent systems with far-reaching influence. Here, we propose a probability-density-based deep learning paradigm for the design of functional metastructures. In contrast to other...
Single-cell RNA sequencing (scRNA-seq) has revolutionized the study of cellular heterogeneity by providing gene expression data at single-cell resolution, uncovering insights into rare cell populations, cell-cell interactions, and gene regulation. Foundation models pretrained on large-scale scRNA-seq datasets have shown great promise in analyzing such data, but existing approaches are often limited to modeling a small subset of highly expressed genes and lack the integration of external gene-specific knowledge....
Adenylyl Cyclase 3 (AC3) plays an important role in the olfactory sensation-signaling pathway in mice. AC3 deficiency leads to defects in olfaction. However, it is still unknown whether AC3 deficiency affects gene expression or signal transduction pathways within the main olfactory epithelium (MOE). In this study, microarrays were used to screen differentially expressed genes in the MOE from AC3 knockout (AC3−/−) and wild-type (AC3+/+) mice. The identified genes were subjected to bioinformatic analysis and verified by qRT-PCR. Gene expression in AC3−/− mice was significantly...
In financial credit scoring, loan applications may be approved or rejected. We can only observe default/non-default labels for approved samples but have no observations for rejected samples, which leads to missing-not-at-random selection bias. Machine learning models trained on such biased data are inevitably unreliable. In this work, we find that the classification task and the rejection/approval task are highly correlated, according to both a real-world study and theoretical analysis. Consequently, of benefit from...
Modeling sequential patterns from data is at the core of various time series forecasting tasks. Deep learning models have greatly outperformed many traditional models, but these black-box models generally lack explainability in prediction and decision making. To reveal the underlying trend with understandable mathematical expressions, scientists and economists tend to use partial differential equations (PDEs) to explain the highly nonlinear dynamics of sequential patterns. However, it usually requires domain expert knowledge and a...
Graph Structure Learning (GSL) has recently garnered considerable attention due to its ability to optimize both the parameters of Graph Neural Networks (GNNs) and the computation graph structure simultaneously. Despite the proliferation of GSL methods developed in recent years, there is no standard experimental setting or fair comparison for performance evaluation, which creates a great obstacle to understanding the progress in this field. To fill this gap, we systematically analyze different scenarios and develop a comprehensive...
Sequential recommendation can capture users' chronological preferences from their historical behaviors, yet the learning of short sequences (the cold-start problem) in many benchmark datasets is still an open challenge. Recently, data augmentation with pseudo-prior items generated by Transformers has drawn considerable attention. These methods generate items sequentially in reverse order to extend the original sequences. Nevertheless, the performance may dramatically degrade on very short sequences; most notably,...
Modeling users' dynamic preferences from historical behaviors lies at the core of modern recommender systems. Due to the diverse nature of user interests, recent advances propose multi-interest networks to encode historical behaviors into multiple interest vectors. In real scenarios, the corresponding items of captured interests are usually retrieved together to get exposure and collected into training data, which produces dependencies among interests. Unfortunately, such networks may incorrectly concentrate on subtle dependencies. Misled by these dependencies,...
Recent advances in on-line tutoring systems have brought on an increase in research on Knowledge Tracing, which predicts a student's performance on coursework exercises over time. Previous research, such as Bayesian Knowledge Tracing, Deep Knowledge Tracing (DKT), and qDKT, focused on either the skill level or the question level. As a result, those methods fail to take question-skill correlations into account. Inspired by Heterogeneous Graph Embedding (HGE), we propose an HGE-based knowledge tracing model. In this paper, a heterogeneous...
Diffusion models learn to denoise data, and the trained denoiser is then used to generate new samples from the data distribution. In this paper, we revisit the diffusion sampling process and identify a fundamental cause of sample quality degradation: the denoiser is poorly estimated in regions that are far Outside Of the training Distribution (OOD), and the sampling process inevitably evaluates these OOD regions. This can become problematic for all sampling methods, especially when we move to parallel sampling, which requires us to initialize and update the entire trajectory of dynamics...
Single-cell RNA sequencing (scRNA-seq) has revolutionized the study of cellular heterogeneity by providing gene expression data at single-cell resolution, uncovering insights into rare cell populations, cell-cell interactions, and gene regulation. Foundation models pretrained on large-scale scRNA-seq datasets have shown great promise in analyzing such data, but existing approaches are often limited to modeling a small subset of highly expressed genes and lack the integration of external...
The structure of a protein is crucial to its biological function. With the expansion of available protein structures, such as those in the AlphaFold Protein Structure Database (AFDB), there is an increasing need for efficient methods to index, search, and generate these structures. Additionally, there is a growing interest in integrating structural information with models from other modalities, such as sequence language models. We present a novel VQ-VAE-based tokenizer, AIDO.StructureTokenizer (AIDO.St), which is pretrained...
Machine learning systems are notoriously prone to biased predictions about certain demographic groups, leading to algorithmic fairness issues. Due to privacy concerns and data quality problems, some demographic information may not be available in the training data, and the complex interaction of different demographics can lead to a lot of unknown minority subpopulations, which all limit the applicability of group fairness. Many existing works on fairness without demographics assume a correlation between groups and features. However, we argue that the model gradients...
Partial differential equations (PDEs) that fit scientific data can represent physical laws with explainable mechanisms for various mathematically-oriented subjects, such as physics and finance. The data-driven discovery of PDEs from scientific data thrives as a new attempt to model complex phenomena in nature, but the effectiveness of current practice is typically limited by the scarcity of data and the complexity of phenomena. In particular, the discovery of PDEs with highly nonlinear coefficients from low-quality data remains largely under-addressed. To deal with this...
Interpretable policy learning seeks to estimate intelligible decision policies from observed actions; however, existing models fall short by forcing a tradeoff between accuracy and interpretability. This tradeoff limits data-driven interpretations of the human decision-making process. For example, to audit medical decisions for biases and suboptimal practices, we require processes which provide concise descriptions of complex behaviors. Fundamentally, existing approaches are burdened by this tradeoff because they represent the underlying...
Deep learning models have achieved promising disease prediction performance on the Electronic Health Records (EHR) of patients. However, most models developed under the I.I.D. hypothesis fail to consider agnostic distribution shifts, diminishing the generalization ability of deep learning models to Out-Of-Distribution (OOD) data. In this setting, spurious statistical correlations that may change in different environments will be exploited, which can cause sub-optimal performance of deep learning models. The unstable correlation between procedures...
The aim of this work is to show that an efficient PL response can be achieved in Er-Yb codoped Al2O3 waveguides by controlling the relative Er and Yb concentrations and, at the same time, the rare-earth distribution at the nanoscale.
In quantum mechanics, a norm-squared wave function can be interpreted as the probability density that describes the likelihood of a particle to be measured in a given position or momentum. This statistical property is at the core of the microcosmos. Meanwhile, machine learning inverse design of materials has raised intensive attention, resulting in various intelligent systems for matter engineering. Here, inspired by quantum theory, we propose a probabilistic deep learning paradigm for the inverse design of functional meta-structures. Our probability-density-based...