- Advanced Graph Neural Networks
- Topic Modeling
- Recommender Systems and Techniques
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Complex Network Analysis Techniques
- Natural Language Processing Techniques
- Bayesian Modeling and Causal Inference
- Advanced Image and Video Retrieval Techniques
- Machine Learning in Materials Science
- Semantic Web and Ontologies
- Traffic Prediction and Management Techniques
- Machine Learning and ELM
- Traffic control and management
- Financial Markets and Investment Strategies
- Graph Theory and Algorithms
- Neural Networks and Applications
- Stock Market Forecasting Methods
- Transportation Planning and Optimization
- Time Series Analysis and Forecasting
- Advanced Text Analysis Techniques
- Oil and Gas Production Techniques
- Computational Drug Discovery Methods
- Explainable Artificial Intelligence (XAI)
- Video Analysis and Summarization
University of California, Los Angeles
2019-2024
Hunan University
2021-2023
Jiangsu University
2021
Peking University
2017-2019
Recent years have witnessed the emerging success of graph neural networks (GNNs) for modeling structured data. However, most GNNs are designed for homogeneous graphs, in which all nodes and edges belong to the same types, making it infeasible to represent heterogeneous structures. In this paper, we present the Heterogeneous Graph Transformer (HGT) architecture for modeling Web-scale heterogeneous graphs. To model heterogeneity, we design node- and edge-type dependent parameters to characterize the attention over each edge, empowering HGT to maintain...
Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data. However, training GNNs requires abundant task-specific labeled data, which is often arduously expensive to obtain. One effective way to reduce the labeling effort is to pre-train an expressive GNN model on unlabelled data with self-supervision and then transfer the learned model to downstream tasks with only a few labels. In this paper, we present the GPT-GNN framework to initialize GNNs by generative pre-training. GPT-GNN introduces...
Representation learning has offered a revolutionary paradigm for various AI domains. In this survey, we examine and review the problem of representation learning with a focus on heterogeneous networks, which consist of different types of vertices and relations. The goal is to automatically project objects, most commonly vertices, in an input network into a latent embedding space such that both the structural and relational properties can be encoded and preserved. The embeddings (representations) can then be used as features for machine...
Pre-training Graph Neural Networks (GNN) via self-supervised contrastive learning has recently drawn lots of attention. However, most existing works focus on node-level contrastive learning, which cannot capture global graph structure. The key challenge to conducting subgraph-level contrastive learning is to sample informative subgraphs that are semantically meaningful. To solve it, we propose to learn graph motifs, which are frequently-occurring subgraph patterns (e.g. functional groups of molecules), for better subgraph sampling. Our framework...
Graph convolutional networks (GCNs) have recently received wide attention, due to their successful applications in different graph tasks and domains. Training GCNs for a large graph, however, is still a challenge. Original full-batch GCN training requires calculating the representation of all nodes per layer, which brings high computation and memory costs. To alleviate this issue, several sampling-based methods have been proposed to train GCNs on a subset of nodes. Among them, the node-wise neighbor-sampling method...
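The node-wise neighbor sampling mentioned in the abstract above can be illustrated with a minimal sketch. This is a generic illustration under stated assumptions, not the method proposed in the paper: the function name `sample_neighbors`, the adjacency-dictionary graph format, and the `fanout` parameter are all hypothetical choices made for the example.

```python
import random

def sample_neighbors(adj, batch_nodes, fanout, num_layers, seed=0):
    """Node-wise neighbor sampling (illustrative sketch).

    For each layer, every node in the current frontier keeps at most
    `fanout` randomly chosen neighbors. Returns one {node: neighbors}
    dict per layer, starting from the output (mini-batch) layer.
    """
    rng = random.Random(seed)
    frontier = set(batch_nodes)
    layers = []
    for _ in range(num_layers):
        sampled = {}
        next_frontier = set(frontier)
        for node in sorted(frontier):
            neighbors = adj.get(node, [])
            k = min(fanout, len(neighbors))
            # Sample without replacement from this node's own neighbor list.
            sampled[node] = rng.sample(neighbors, k)
            next_frontier.update(sampled[node])
        layers.append(sampled)
        # The next (deeper) layer must cover every node reached so far.
        frontier = next_frontier
    return layers

# Toy undirected graph stored as an adjacency dictionary (assumed format).
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
layers = sample_neighbors(adj, batch_nodes=[0], fanout=2, num_layers=2)
for depth, sampled in enumerate(layers):
    print(depth, sampled)
```

Only the sampled nodes need representations at each layer, which is why this kind of sampling reduces the per-batch computation and memory cost relative to full-batch training.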
Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data. However, training GNNs usually requires abundant task-specific labeled data, which is often arduously expensive to obtain. One effective way to reduce the labeling effort is to pre-train an expressive GNN model on unlabeled data with self-supervision and then transfer the learned model to downstream tasks with only a few labels. In this paper, we present the GPT-GNN framework to initialize GNNs by generative pre-training. GPT-GNN introduces...
Answering complex First-Order Logical (FOL) queries on large-scale incomplete knowledge graphs (KGs) is an important yet challenging task. Recent advances embed logical queries and KG entities in the same space and conduct query answering via dense similarity search. However, most logical operators designed in previous studies do not satisfy the axiomatic system of classical logic, limiting their performance. Moreover, these operators are parameterized and thus require many FOL queries as training data, which are often arduous to collect or...
In this paper, we propose an end-to-end Retrieval-Augmented Visual Language Model (REVEAL) that learns to encode world knowledge into a large-scale memory, and to retrieve from it to answer knowledge-intensive queries. REVEAL consists of four key components: the memory, the encoder, the retriever, and the generator. The memory encodes various sources of multimodal world knowledge (e.g. image-text pairs, question answering pairs, knowledge graph triplets, etc.) via a unified encoder. The retriever finds the most relevant knowledge entries in the memory, and the generator fuses the retrieved knowledge with the input query...
Stock trend prediction plays a critical role in seeking maximized profit from stock investment. However, precise trend prediction is very difficult because of the highly volatile and non-stationary nature of the stock market. The explosion of information on the Internet, together with advances in natural language processing and text mining techniques, has enabled investors to unveil market trends and volatility from online content. Unfortunately, the quality, trustworthiness, and comprehensiveness of online content related to the stock market vary drastically, and a large portion...
Recent years have witnessed the emerging success of graph neural networks (GNNs) for modeling structured data. However, most GNNs are designed for homogeneous graphs, in which all nodes and edges belong to the same types, making them infeasible to represent heterogeneous structures. In this paper, we present the Heterogeneous Graph Transformer (HGT) architecture for modeling Web-scale heterogeneous graphs. To model heterogeneity, we design node- and edge-type dependent parameters to characterize the attention over each edge, empowering HGT...
Commonsense is defined as the knowledge on which everyone agrees. However, certain types of commonsense are correlated with culture and geographic locations, and they are only shared locally. For example, the scenes of wedding ceremonies vary across regions due to different customs influenced by historical and religious factors. Such regional characteristics, however, are generally omitted in prior work. In this paper, we construct a Geo-Diverse Visual Commonsense Reasoning dataset (GD-VCR) to test vision-and-language models’...
Graph neural networks (GNNs) are emerging for machine learning research on graph-structured data. GNNs achieve state-of-the-art performance on many tasks, but they face scalability challenges when it comes to real-world applications that have large amounts of data and strict latency requirements. Many studies have been conducted on how to accelerate GNNs in an effort to address these challenges. These acceleration techniques touch on various aspects of the GNN pipeline, from smart training and inference algorithms to efficient...
Graph neural networks (GNNs) are shown to be successful in modeling applications with graph structures. However, training an accurate GNN model requires a large collection of labeled data and expressive features, which might be inaccessible for some applications. To tackle this problem, we propose a pre-training framework that captures generic structural information that is transferable across tasks. Our framework can leverage the following three tasks: 1) denoising link reconstruction, 2) centrality score...
Multi-Task Learning (MTL) is a powerful learning paradigm to improve generalization performance via knowledge sharing. However, existing studies find that MTL could sometimes hurt generalization, especially when two tasks are less correlated. One possible reason that MTL hurts generalization is spurious correlation, i.e., some knowledge is spurious and not causally related to task labels, but the model could mistakenly utilize it and thus fail when such correlation changes. In the MTL setup, there exist several unique challenges of spurious correlation. First, the risk of having...
Most of the existing Large Language Model (LLM) benchmarks on scientific problem reasoning focus on problems grounded in high-school subjects and are confined to elementary algebraic operations. To systematically examine the reasoning capabilities required for solving complex scientific problems, we introduce an expansive benchmark suite, SciBench, for LLMs. SciBench contains a carefully curated dataset featuring a range of collegiate-level scientific problems from the mathematics, chemistry, and physics domains. Based on the dataset, we conduct an in-depth benchmarking study...
Pre-training Graph Neural Networks (GNN) via self-supervised contrastive learning has recently drawn lots of attention. However, most existing works focus on node-level contrastive learning, which cannot capture global graph structure. The key challenge to conducting subgraph-level contrastive learning is to sample informative subgraphs that are semantically meaningful. To solve it, we propose to learn graph motifs, which are frequently-occurring subgraph patterns (e.g. functional groups of molecules), for better subgraph sampling. Our framework...
Answering complex open-domain questions requires understanding the latent relations between the entities involved. However, we found that existing QA datasets are extremely imbalanced in some types of relations, which hurts generalization performance over questions with long-tail relations. To remedy this problem, in this paper, we propose a Relation-Guided Pre-Training (RGPT-QA) framework. We first generate a relational QA dataset covering a wide range of relations from both Wikidata triplets and Wikipedia hyperlinks. We then pre-train...
Traffic signal control is critical for traffic efficiency optimization but is usually constrained by traffic detection methods. The emerging V2I (Vehicle to Infrastructure) technology is capable of providing rich information for traffic detection, thus becoming promising for traffic signal control. Based on parallel simulation, this paper presents a new traffic signal control method in a V2I environment. In the proposed method, a predictive problem is formulated, and a cellular automata model is employed as the traffic flow model. By using a genetic algorithm, the problem is solved online to implement...
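To make the idea of a cellular automaton as a traffic flow model concrete, here is a minimal sketch. The abstract does not say which automaton the paper uses, so this example assumes the elementary Rule 184 model on a single-lane ring road; the function name `step_rule184` and the toy road are hypothetical.

```python
def step_rule184(cells):
    """Advance a single-lane ring road by one time step (Rule 184).

    `cells` is a list of 0/1 values: 1 means the cell holds a car.
    All cars update synchronously: a car moves one cell forward if the
    cell ahead is currently empty, otherwise it stays where it is.
    """
    n = len(cells)
    nxt = [0] * n
    for i in range(n):
        if cells[i] == 1:
            ahead = (i + 1) % n  # ring road: the last cell wraps to the first
            if cells[ahead] == 0:
                nxt[ahead] = 1   # free cell ahead: the car advances
            else:
                nxt[i] = 1       # blocked: the car waits
    return nxt

# Toy road with three cars (assumed initial state).
road = [1, 1, 0, 1, 0, 0]
after_one = step_rule184(road)
after_two = step_rule184(after_one)
print(after_one)
print(after_two)
```

A model like this is cheap to roll forward many steps, which is the property an online optimizer would need when it evaluates candidate signal plans by simulation.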
The limited availability of annotations in small molecule datasets presents a challenge to machine learning models. To address this, one common strategy is to collaborate with additional auxiliary datasets. However, having more data does not always guarantee improvements. Negative transfer can occur when the knowledge in the target dataset differs from or contradicts that in the auxiliary datasets. In light of this, identifying the auxiliary datasets that benefit the target dataset when jointly trained remains a critical and unresolved problem. Through an empirical analysis, we observe...
Data continuously emitted from industrial ecosystems such as social or e-commerce platforms are commonly represented as heterogeneous graphs (HG) composed of multiple node/edge types. State-of-the-art graph learning methods for HGs, known as heterogeneous graph neural networks (HGNNs), are applied to learn deep context-informed node representations. However, many HG datasets from industrial applications suffer from label imbalance between node types. As there is no direct way to learn using labels rooted at different node types, HGNNs have been applied to only a few node types with...