Yuexiang Xie

ORCID: 0009-0005-6545-7882
Research Areas
  • Topic Modeling
  • Privacy-Preserving Technologies in Data
  • Advanced Graph Neural Networks
  • Recommender Systems and Techniques
  • Natural Language Processing Techniques
  • Cryptography and Data Security
  • Expert finding and Q&A systems
  • Stochastic Gradient Optimization Techniques
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Advanced Text Analysis Techniques
  • Mobile Crowdsensing and Crowdsourcing
  • Traffic Prediction and Management Techniques
  • Machine Learning and Data Classification
  • Vehicle License Plate Recognition
  • Speech and dialogue systems
  • Fuzzy Logic and Control Systems
  • Catalytic C–H Functionalization Methods
  • Machine Learning and ELM
  • Seismology and Earthquake Studies
  • Multi-Agent Systems and Negotiation
  • Imbalanced Data Classification Techniques
  • Machine Learning in Healthcare
  • Explainable Artificial Intelligence (XAI)
  • Financial Distress and Bankruptcy Prediction

Alibaba Group (China)
2021-2024

Guangxi University
2023

Hunan University of Traditional Chinese Medicine
2020-2023

Alibaba Group (United States)
2023

Chinese University of Hong Kong
2021

Peking University Shenzhen Hospital
2019-2020

Peking University
2019-2020

Although remarkable progress has been made by existing federated learning (FL) platforms to provide infrastructures for development, these platforms may not well tackle the challenges brought by various types of heterogeneity. To fill this gap, in this paper, we propose a novel FL platform, named FederatedScope, which employs an event-driven architecture to provide users with great flexibility to independently describe the behaviors of different participants. Such a design makes it easy for users to describe participants with various local training processes, learning goals...

10.14778/3579075.3579081 article EN Proceedings of the VLDB Endowment 2023-01-01

Large language models (LLMs) have demonstrated great capabilities in various natural language understanding and generation tasks. These pre-trained LLMs can be further improved for specific downstream tasks by fine-tuning. However, the adoption of LLMs in real-world applications can be hindered by privacy concerns and the resource-intensive nature of model training and fine-tuning. When multiple entities have similar tasks of interest but cannot directly share their local data due to privacy regulations, federated learning (FL) is a mainstream solution to leverage...

10.1145/3637528.3671573 article EN Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2024-08-24

Recently, graph neural networks (GNNs) have been successfully applied to recommender systems as an effective collaborative filtering (CF) approach. However, existing GNN-based CF models suffer from noisy user-item interaction data, which seriously affects their effectiveness and robustness in real-world applications. Although there have been several studies on data denoising in recommender systems, they either neglect direct intervention of noisy interactions in the message propagation of GNNs, or fail to preserve the diversity of recommendation when denoising.

10.1145/3477495.3531889 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

The incredible development of federated learning (FL) has benefited various tasks in the domains of computer vision and natural language processing, and existing frameworks such as TFF and FATE have made deployment easy in real-world applications. However, federated graph learning (FGL), even though graph data are prevalent, has not been well supported due to its unique characteristics and requirements. The lack of an FGL-related framework increases the effort required for accomplishing reproducible research and deploying in real-world applications. Motivated by such strong demand, in this paper, we...

10.1145/3534678.3539112 article EN Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022-08-12

Answer selection and knowledge base question answering (KBQA) are two important tasks of question answering (QA) systems. Existing methods solve these two tasks separately, which requires a large amount of repetitive work and neglects the rich correlation information between the tasks. In this paper, we tackle answer selection and KBQA simultaneously via multi-task learning (MTL), motivated by the following observations. First, both answer selection and KBQA can be regarded as a ranking problem, with one at the text level and the other at the knowledge level. Second, these two tasks can benefit each other:...

10.1609/aaai.v33i01.33016318 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Community question answering (CQA) has recently gained increasing popularity in both academia and industry. However, the redundancy and lengthiness of crowdsourced answers limit the performance of answer selection and lead to reading difficulties and misunderstandings for community users. To solve these problems, we tackle the tasks of answer selection and answer summary generation in CQA with a novel joint learning model. Specifically, we design a question-driven pointer-generator network, which exploits the correlation information between...

10.1609/aaai.v34i05.6266 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Although significant progress has been achieved in text summarization, factual inconsistency in generated summaries still severely limits its practical applications. Among the key factors to ensure factual consistency, a reliable automatic evaluation metric is the first and the most crucial one. However, existing metrics either neglect the intrinsic cause of the factual inconsistency or rely on auxiliary tasks, leading to an unsatisfactory correlation with human judgments or increasing the inconvenience of usage in practice. In light of these challenges, we...

10.18653/v1/2021.findings-emnlp.10 preprint EN cc-by 2021-01-01

As people inevitably interact with items across multiple domains or various platforms, cross-domain recommendation (CDR) has gained increasing attention. However, the rising privacy concerns limit the practical applications of existing CDR models, since they assume that full or partial data are accessible among different domains. Recent studies on privacy-aware CDR models neglect the heterogeneity from multiple-domain data and fail to achieve consistent improvements in cross-domain recommendation; thus, it remains a...

10.1145/3653448 article EN ACM Transactions on Information Systems 2024-03-21

Retrieval-augmented generation (RAG) has emerged as a promising technology for addressing hallucination issues in the responses generated by large language models (LLMs). Existing studies on RAG primarily focus on applying semantic-based approaches to retrieve isolated relevant chunks, which ignore their intrinsic relationships. In this paper, we propose a novel Knowledge Graph-Guided Retrieval Augmented Generation (KG$^2$RAG) framework that utilizes knowledge graphs (KGs) to provide fact-level...

10.48550/arxiv.2502.06864 preprint EN arXiv (Cornell University) 2025-02-07

Knowledge-intensive conversations supported by large language models (LLMs) have become one of the most popular and helpful applications that can assist people in different aspects. Many current knowledge-intensive applications are centered on retrieval-augmented generation (RAG) techniques. While many open-source RAG frameworks facilitate the development of RAG-based applications, they often fall short in handling practical scenarios complicated by heterogeneous data in topics and formats, conversational context...

10.48550/arxiv.2502.09596 preprint EN arXiv (Cornell University) 2025-02-13

An intriguing visible-light-induced strategy has been established for the P-H insertion reaction between acylsilanes and H-phosphorus oxides that, upon a subsequent acidic process, delivers a wide variety of α-hydroxyphosphorus oxides in good yields (up to 93% yield). The metal-free protocol represents a unique example of C-P bond formation through the in situ generation of siloxycarbenes. This methodology features the advantages of operational simplicity, mild conditions, broad substrate scope, and column-free gram-scale synthesis.

10.1021/acs.orglett.3c00722 article EN Organic Letters 2023-03-28

Federated Learning (FL) aims to train high-quality models in collaboration with distributed clients while not uploading their local data, which attracts increasing attention in both academia and industry. However, there is still a considerable gap between the flourishing FL research and real-world scenarios, mainly caused by the characteristics of heterogeneous devices and their scales. Most existing works conduct evaluations with homogeneous devices, which are mismatched with the diversity and variability of real-world scenarios. Moreover, it...

10.1145/3580305.3599829 article EN Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023-08-04

Answer selection, which is involved in many natural language processing applications, such as dialog systems and question answering (QA), is an important yet challenging task in practice, since conventional methods typically suffer from the issue of ignoring diverse real-world background knowledge. In this article, we extensively investigate approaches to enhancing the answer selection model with external knowledge from a knowledge graph (KG). First, we present a context-knowledge interaction learning framework,...

10.1145/3457533 article EN ACM Transactions on Information Systems 2021-09-08

The immense evolution in Large Language Models (LLMs) has underscored the importance of massive, heterogeneous, and high-quality data. A data recipe is a mixture of data from different sources for training LLMs, which plays a vital role in LLMs' performance. Existing open-source tools for LLM data processing are mostly tailored for specific data recipes. To continuously uncover the potential of LLMs, incorporate data from new sources, and improve LLMs' performance, we build a new system named Data-Juicer, with which we can efficiently generate diverse data recipes, explore...

10.1145/3626246.3653385 article EN 2024-05-23

LLMs have demonstrated great capabilities in various NLP tasks. Different entities can further improve the performance of those LLMs on their specific downstream tasks by fine-tuning them. When several entities have similar tasks of interest, but their data cannot be shared because of privacy concerns and regulations, federated learning (FL) is a mainstream solution to leverage the data of different entities. However, fine-tuning LLMs in federated learning settings still lacks adequate support from existing FL frameworks because it has to deal with optimizing the consumption of significant...

10.48550/arxiv.2309.00363 preprint EN other-oa arXiv (Cornell University) 2023-01-01

High-order interactive features capture the correlation between different columns and thus are promising to enhance various learning tasks on ubiquitous tabular data. To automate the generation of interactive features, existing works either explicitly traverse the feature space or implicitly express the interactions via intermediate activations of some designed models. These two kinds of methods show that there is essentially a trade-off between feature interpretability and search efficiency. To possess both of their merits, we propose a novel...

10.1145/3447548.3467066 article EN 2021-08-13

We study the community question answering (CQA) problem that emerges with the advent of numerous community forums in the recent past. The task of finding appropriate answers to questions from informative but noisy crowdsourced answers is important yet challenging in practice. We present an Attentive User-engaged Adversarial Neural Network (AUANN), which interactively learns the context information of questions and answers, and enhances user engagement with the CQA task. A novel attentive mechanism is incorporated to model the semantic internal and external relations...

10.1609/aaai.v34i05.6472 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Machine learning methods have been adopted for a wide range of real-world applications, ranging from social networks, online image/video-sharing platforms, and e-commerce to education, healthcare, etc. However, in practice, a large amount of effort is required to tune several components of machine learning methods, including data representation, hyperparameters, and model architecture, in order to achieve good performance. To alleviate the required tuning efforts, Automated Machine Learning (AutoML), which can automate the process of applying...

10.1145/3459637.3483279 article EN 2021-10-26

Federated learning (FL) is a general distributed machine learning paradigm that provides solutions for tasks where data cannot be shared directly. Due to the difficulties in communication management and the heterogeneity of devices, initiating and using an FL algorithm in real-world cross-device scenarios requires significant repetitive effort that may not be transferable to similar projects. To reduce the effort required for developing and deploying FL algorithms, we present FS-Real, an open-source FL platform designed to address the need for efficient...

10.14778/3611540.3611617 article EN Proceedings of the VLDB Endowment 2023-08-01

Although neural networks have achieved great successes in various machine learning tasks, people can hardly know what neural networks learn from data due to their black-box nature. The lack of such explainability is one of the limitations of neural networks when applied in domains, e.g., healthcare and finance, that demand transparency and accountability. Moreover, explainability is beneficial for guiding a neural network to learn the causal patterns that can extrapolate to out-of-distribution (OOD) data, which is critical in real-world applications and has surged as a hot research topic.

10.1145/3485447.3512023 article EN Proceedings of the ACM Web Conference 2022 2022-04-25

Amide is one of the most important molecules in chemistry and biology. Seeking a green and efficient synthesis method for obtaining amide compounds has always been a main research direction. In this paper, it was proposed to use acid and amine as raw materials to synthesize amides under green, fast mechanochemical conditions, and the purification of the product only requires water washing and filtration. This method avoids the use of a large amount of organic solvents and pollutant generation in the reaction process, and the yield is up to 96%. Even when scaled to 50...

10.1016/j.rechem.2023.100882 article EN cc-by-nc-nd Results in Chemistry 2023-01-01