Yizhou Sun

ORCID: 0000-0003-1812-6843
Research Areas
  • Advanced Graph Neural Networks
  • Topic Modeling
  • Complex Network Analysis Techniques
  • Natural Language Processing Techniques
  • Recommender Systems and Techniques
  • Graph Theory and Algorithms
  • Text and Document Classification Technologies
  • Bioinformatics and Genomic Networks
  • Data Management and Algorithms
  • Machine Learning in Materials Science
  • Bayesian Modeling and Causal Inference
  • Data Quality and Management
  • Opinion Dynamics and Social Influence
  • Web Data Mining and Analysis
  • Neural Networks and Applications
  • Domain Adaptation and Few-Shot Learning
  • Biomedical Text Mining and Ontologies
  • Advanced Clustering Algorithms Research
  • Semantic Web and Ontologies
  • Human Mobility and Location-Based Analysis
  • Time Series Analysis and Forecasting
  • Computational Drug Discovery Methods
  • Multimodal Machine Learning Applications
  • Advanced Database Systems and Queries
  • Sentiment Analysis and Opinion Mining

University of California, Los Angeles
2016-2025

Shantou University
2022-2025

First Hospital of China Medical University
2024

Key Laboratory of Guangdong Province
2023-2024

China Medical University
2024

Amazon (United States)
2023

Aalborg University
2023

University of California System
2018-2022

Auckland University of Technology
2022

North China Electric Power University
2022

Similarity search is a primitive operation in database and Web search engines. With the advent of large-scale heterogeneous information networks that consist of multi-typed, interconnected objects, such as bibliographic and social media networks, it is important to study similarity search in such networks. Intuitively, two objects are similar if they are linked by many paths in the network. However, most existing similarity measures are defined for homogeneous networks. Different semantic meanings behind paths are not taken into consideration. Thus they cannot be directly...

10.14778/3402707.3402736 article EN Proceedings of the VLDB Endowment 2011-08-01
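
The intuition stated in the abstract above (two objects are similar if many paths link them) can be sketched as a path-count similarity along a single meta-path. This is a minimal sketch, not the paper's implementation: the toy author-venue matrix `W`, the function names, and the normalization by each object's self-path count are assumptions made for illustration, since the truncated abstract does not spell out the measure.

```python
def commuting_matrix(w):
    """Count paths along the symmetric meta-path Author-Venue-Author.

    w[i][k] is the number of papers author i published in venue k
    (hypothetical toy data). The result is M = W * W^T.
    """
    n = len(w)
    return [
        [sum(a * b for a, b in zip(w[i], w[j])) for j in range(n)]
        for i in range(n)
    ]


def normalized_path_similarity(m, x, y):
    """Path count between x and y, normalized by their self-path counts.

    The normalization is an assumption of this sketch; it keeps highly
    active objects from dominating purely through volume.
    """
    denom = m[x][x] + m[y][y]
    return 2 * m[x][y] / denom if denom else 0.0


# Hypothetical counts: rows are authors, columns are venues.
W = [
    [2, 1],    # author 0: modest output in both venues
    [50, 20],  # author 1: very prolific in both venues
    [2, 0],    # author 2: same scale as author 0
]

M = commuting_matrix(W)
sim_to_prolific = normalized_path_similarity(M, 0, 1)
sim_to_peer = normalized_path_similarity(M, 0, 2)
print(sim_to_prolific, sim_to_peer)
```

In this toy data, author 0 shares more raw paths with the prolific author 1, yet the normalized score ranks author 2 higher because their publication profiles are on the same scale. Choosing a different meta-path would change what "similar" means.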

Most real systems consist of a large number of interacting, multi-typed components, while most contemporary research models them as homogeneous information networks, without distinguishing the different types of objects and links in the networks. Recently, more researchers have begun to consider these interconnected, multi-typed data as heterogeneous information networks and to develop structural analysis approaches by leveraging their rich semantic meaning. Compared to the widely studied homogeneous network, the heterogeneous network contains richer structural information, which provides...

10.1109/tkde.2016.2598561 article EN publisher-specific-oa IEEE Transactions on Knowledge and Data Engineering 2016-08-08

Recent years have witnessed the emerging success of graph neural networks (GNNs) for modeling structured data. However, most GNNs are designed for homogeneous graphs, in which all nodes and edges belong to the same types, making it infeasible to represent heterogeneous structures. In this paper, we present the Heterogeneous Graph Transformer (HGT) architecture for modeling Web-scale heterogeneous graphs. To model heterogeneity, we design node- and edge-type dependent parameters to characterize the attention over each edge, empowering HGT to maintain...

10.1145/3366423.3380027 article EN 2020-04-20

Among different hybrid recommendation techniques, network-based entity recommendation methods, which utilize user or item relationship information, are beginning to attract increasing attention. Most of the previous studies in this category only consider a single relationship type, such as friendships in a social network. In many scenarios, the entity recommendation problem exists in a heterogeneous information network environment. Different types of relationships can potentially be used to improve recommendation quality. In this paper, we study the entity recommendation problem in heterogeneous information networks. Specifically,...

10.1145/2556195.2556259 article EN 2014-02-18

Most objects and data in the real world are of multiple types and interconnected, forming complex, heterogeneous but often semi-structured information networks. However, most network science researchers have focused on homogeneous networks, without distinguishing the different types of objects and links in the networks. We view interconnected, multityped data, including typical relational database data, as heterogeneous information networks, study how to leverage the rich semantic meaning of the structural types of objects and links, and develop a structural analysis approach to mining semi-structured, multi-typed heterogeneous information networks. In this article, we summarize a set...

10.1145/2481244.2481248 article EN ACM SIGKDD Explorations Newsletter 2013-04-30

Real-world physical and abstract data objects are interconnected, forming gigantic networks. By structuring these data objects and the interactions between them into multiple types, such networks become semi-structured heterogeneous information networks. Most real-world applications that handle big data, including social media networks, scientific, engineering, or medical information systems, online e-commerce systems, and most database systems, can be structured into heterogeneous information networks. Therefore, effective analysis of large-scale heterogeneous information networks poses an interesting but critical challenge.

10.2200/s00433ed1v01y201207dmk005 article EN Synthesis lectures on data mining and knowledge discovery 2012-07-18

The problem of predicting links or interactions between objects in a network is an important task in network analysis. Along this line, link prediction between co-authors in a co-author network is a frequently studied problem. In most of these studies, authors are considered in a homogeneous network, i.e., only one type of object (author type) and one type of link (co-authorship) exist in the network. However, in a real bibliographic network, there are multiple types of objects (e.g., venues, topics, papers) and multiple types of links among these objects. In this paper, we study the problem of co-author relationship prediction in a heterogeneous bibliographic network, and a new methodology called...

10.1109/asonam.2011.112 article EN 2011-07-01

As information networks become ubiquitous, extracting knowledge from information networks has become an important task. Both ranking and clustering can provide overall views on information network data, and each has been a hot topic by itself. However, ranking objects globally without considering which clusters they belong to often leads to dumb results, e.g., ranking database and computer architecture conferences together may not make much sense. Similarly, clustering a huge number of objects (e.g., thousands of authors) in one cluster without distinction is dull as well.

10.1145/1516360.1516426 article EN 2009-03-24

Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data. However, training GNNs requires abundant task-specific labeled data, which is often arduously expensive to obtain. One effective way to reduce the labeling effort is to pre-train an expressive GNN model on unlabelled data with self-supervision and then transfer the learned model to downstream tasks with only a few labels. In this paper, we present the GPT-GNN framework to initialize GNNs by generative pre-training. GPT-GNN introduces...

10.1145/3394486.3403237 article EN 2020-08-20

Newly emerging location-based and event-based social network services provide us with a new platform to understand users' preferences based on their activity history. A user can only visit a limited number of venues/events, and most of them are within a limited distance range, so the user-item matrix is very sparse, which creates a big challenge for traditional collaborative filtering-based recommender systems. The problem becomes more challenging when people travel to a new city where they have no...

10.1145/2487575.2487608 article EN 2013-08-11

Since real-world objects and their interactions are often multi-modal and multi-typed, heterogeneous networks have been widely used as a more powerful, realistic, and generic superclass of traditional homogeneous networks (graphs). Meanwhile, representation learning (a.k.a. embedding) has recently been intensively studied and shown effective for various network mining and analytical tasks. In this work, we aim to provide a unified framework to deeply summarize and evaluate existing research on heterogeneous network embedding (HNE), which includes but...

10.1109/tkde.2020.3045924 article EN publisher-specific-oa IEEE Transactions on Knowledge and Data Engineering 2020-12-21

Graph similarity search is among the most important graph-based applications, e.g., finding the chemical compounds that are most similar to a query compound. Graph similarity/distance computation, such as Graph Edit Distance (GED) and Maximum Common Subgraph (MCS), is the core operation of graph similarity search and many other applications, but is very costly to compute in practice. Inspired by the recent success of neural network approaches in several graph applications, such as node or graph classification, we propose a novel neural network based approach to address this classic yet challenging graph problem, aiming to alleviate...

10.1145/3289600.3290967 article EN 2019-01-30

Link prediction, i.e., predicting links or interactions between objects in a network, is an important task in network analysis. Although the problem has attracted much attention recently, there are several challenges that have not been addressed so far. First, most existing studies focus only on link prediction in homogeneous networks, where all objects and links belong to the same type. However, in the real world, heterogeneous networks that consist of multi-typed objects and relationships are ubiquitous. Second, current studies only concern whether a link will...

10.1145/2124295.2124373 article EN 2012-02-08

In this paper, we study the problem of author identification under a double-blind review setting, which is to identify potential authors given the information of an anonymized paper. Different from existing approaches that rely heavily on feature engineering, we propose to use a network embedding approach to address the problem, which can automatically represent nodes as lower dimensional vectors. However, there are two major limitations in recent studies on network embedding: (1) they are usually general-purpose methods, independent...

10.1145/3018661.3018735 preprint EN 2017-02-02

Real-world, multiple-typed objects are often interconnected, forming heterogeneous information networks. A major challenge for link-based clustering in such networks is its potential to generate many different results, carrying rather diverse semantic meanings. In order to generate the desired clustering, we propose to use a meta-path, a path that connects object types via a sequence of relations, to control clustering with distinct semantics. Nevertheless, it is easier for a user to provide a few examples ("seeds") than a weighted combination...

10.1145/2339530.2339738 article EN 2012-08-12

A heterogeneous information network (HIN) is a graph model in which objects and edges are annotated with types. Large and complex databases, such as YAGO and DBLP, can be modeled as HINs. A fundamental problem in HINs is the computation of closeness, or relevance, between two HIN objects. Relevance measures can be used in various applications, including entity resolution, recommendation, and information retrieval. Several studies have investigated the use of HIN information for relevance computation; however, most of them only utilize a simple structure, such as a path, to...

10.1145/2939672.2939815 article EN 2016-08-08

Recent advances in neural networks have inspired people to design hybrid recommendation algorithms that can incorporate both (1) user-item interaction information and (2) content information including image, audio, and text. Despite their promising results, neural network-based recommendation algorithms pose extensive computational costs, making them challenging to scale and improve upon. In this paper, we propose a general neural network-based recommendation framework, which subsumes several existing state-of-the-art recommendation algorithms, and address the efficiency issue by investigating sampling...

10.1145/3097983.3098202 article EN 2017-08-04

Many large-scale knowledge bases simultaneously represent two views of knowledge graphs (KGs): an ontology view for abstract and commonsense concepts, and an instance view for specific entities that are instantiated from ontological concepts. Existing KG embedding models, however, merely focus on representing one of the two views alone. In this paper, we propose a novel two-view KG embedding model, JOIE, with the goal to produce better knowledge embeddings and enable new applications that rely on multi-view knowledge. JOIE employs both cross-view and intra-view modeling to learn...

10.1145/3292500.3330838 article EN 2019-07-25

Linked or networked data are ubiquitous in many applications. Examples include web data or hypertext documents connected via hyperlinks, social networks or user profiles connected via friend links, co-authorship and citation information, blog data, movie reviews and so on. In these datasets (called "information networks"), closely related objects that share the same properties or interests form a community. For example, a community in the blogosphere could be users mostly interested in cell phone news. Outlier detection in information networks can...

10.1145/1835804.1835907 article EN 2010-07-25

With the ubiquity of information networks and their broad applications, the issue of similarity computation between entities of an information network arises and draws extensive research interest. However, to effectively and comprehensively measure "how similar two entities are within an information network" is nontrivial, and the problem becomes even more challenging when the information networks to be examined are massive and diverse. In this paper, we propose a new similarity measure, P-Rank (Penetrating Rank), toward computing the structural similarities of entities in real information networks. P-Rank enriches the well-known...

10.1145/1645953.1646025 article EN 2009-11-02

Recent studies suggest that by using additional user or item relationship information when building hybrid recommender systems, the recommendation quality can be largely improved. However, most such studies only consider a single type of relationship, e.g., a social network. Notice that in many applications, the recommendation problem exists in an attribute-rich heterogeneous information network environment. In this paper, we study the entity recommendation problem in heterogeneous information networks. We propose to combine various relationship information from the network with user feedback to provide high quality recommendation results.

10.1145/2507157.2507230 article EN 2013-10-12

Information networks are ubiquitous in many applications, and analysis of such networks has attracted significant attention in the academic communities. One of the most important aspects of information network analysis is to measure similarity between nodes in a network. SimRank is a simple and influential measure of this kind, based on a solid theoretical "random surfer" model. Existing work computes SimRank similarity scores in an iterative mode. We argue that the iterative method can be infeasible and inefficient when, as in many real-world scenarios, the networks change dynamically and frequently. We envision...

10.1145/1739041.1739098 article EN 2010-03-16
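
The iterative mode of computing SimRank mentioned in the abstract above can be sketched with the standard fixed-point iteration: two distinct nodes are similar to the extent that their in-neighbors are similar. This is a minimal sketch of the baseline iteration only, not the method the paper goes on to propose; the function name, the decay value 0.8, the iteration count, and the toy graph are assumptions for illustration.

```python
def simrank(in_neighbors, decay=0.8, iterations=10):
    """Naive iterative SimRank over a dict of node -> list of in-neighbors.

    decay and iterations are illustrative defaults, not values from the
    source. Every node must appear as a key of in_neighbors.
    """
    nodes = list(in_neighbors)
    # Start from the identity: each node is fully similar to itself only.
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        new = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[a][b] = 1.0
                    continue
                ia, ib = in_neighbors[a], in_neighbors[b]
                if not ia or not ib:
                    # A node with no in-neighbors has no evidence of similarity.
                    new[a][b] = 0.0
                    continue
                total = sum(sim[i][j] for i in ia for j in ib)
                new[a][b] = decay * total / (len(ia) * len(ib))
        sim = new
    return sim


# Hypothetical toy graph: node "u" links to both "a" and "b".
graph = {"u": [], "a": ["u"], "b": ["u"]}
scores = simrank(graph)
print(scores["a"]["b"])
```

Each sweep touches every node pair and every pair of their in-neighbors, so the whole computation has to be redone after the graph changes, which is the cost the abstract points to for networks that change frequently.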