Zhongxiang Sun

ORCID: 0000-0002-6109-4704
Research Areas
  • Recommender Systems and Techniques
  • Topic Modeling
  • Artificial Intelligence in Law
  • Advanced Bandit Algorithms Research
  • Seismology and Earthquake Studies
  • Comparative and International Law Studies
  • Context-Aware Activity Recognition Systems
  • Methane Hydrates and Related Phenomena
  • Digital Marketing and Social Media
  • Legal Education and Practice Innovations
  • Advanced Database Systems and Queries
  • Advanced Graph Neural Networks
  • Speech and Dialogue Systems
  • Schizophrenia Research and Treatment
  • Legal Language and Interpretation
  • Transportation and Mobility Innovations
  • Advanced Data Storage Technologies
  • Parallel Computing and Optimization Techniques
  • Machine Learning in Healthcare
  • Advanced Text Analysis Techniques
  • Multimodal Machine Learning Applications
  • Image Retrieval and Classification Techniques
  • Urban and Freight Transport Logistics
  • Caching and Content Delivery
  • Computational and Text Analysis Methods

Renmin University of China
2022-2024

Beijing Academy of Artificial Intelligence
2024

Inspur (China)
2024

Beijing Jiaotong University
2023

OriginWater (China)
2023

Kuaishou (China)
2023

Huawei Technologies (China)
2022

The debut of ChatGPT has recently attracted significant attention from the natural language processing (NLP) community and beyond. Existing studies have demonstrated that ChatGPT shows improvement in a range of downstream NLP tasks, but its capabilities and limitations in terms of recommendations remain unclear. In this study, we aim to enhance ChatGPT's recommendation capabilities by aligning it with traditional information retrieval (IR) ranking capabilities, including point-wise, pair-wise, and list-wise ranking. To achieve this goal,...

10.1145/3604915.3610646 preprint EN 2023-09-14

Large language models (LLMs) have transformed many fields, including natural language processing, computer vision, and reinforcement learning. These models have also made a significant impact in the field of law, where they are being increasingly utilized to automate various legal tasks, such as legal judgement prediction, legal document analysis, and legal document writing. However, the integration of LLMs into the legal field has also raised several legal problems, including privacy concerns, bias, and explainability. In this survey, we explore the integration of LLMs into the field of law. We discuss the applications of LLMs and examine...

10.48550/arxiv.2303.09136 preprint EN cc-by arXiv (Cornell University) 2023-01-01

As an essential operation of legal retrieval, legal case matching plays a central role in intelligent legal systems. This task has a high demand on the explainability of matching results because of its critical impacts on downstream applications: the matched legal cases may provide supportive evidence for the judgments of target cases and thus influence the fairness and justice of legal decisions. Focusing on this challenging task, we propose a novel and explainable method, namely IOT-Match, with the help of computational optimal transport, which formulates the legal case matching problem...

10.1145/3477495.3531974 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

Retrieval-Augmented Generation (RAG) systems for Large Language Models (LLMs) hold promise in knowledge-intensive tasks but face limitations in complex multi-step reasoning. While recent methods have integrated RAG with chain-of-thought reasoning or test-time search using Process Reward Models (PRMs), these approaches encounter challenges such as a lack of explanations, bias in PRM training data, early-step bias in PRM scores, and insufficient post-training optimization of reasoning potential. To address these issues, we propose...

10.48550/arxiv.2501.07861 preprint EN arXiv (Cornell University) 2025-01-14

In search scenarios, user experience can be hindered by erroneous queries due to typos, voice errors, or knowledge gaps. Therefore, query correction is crucial for search engines. Current correction models, usually small models trained on specific data, often struggle with queries beyond their training scope or those requiring contextual understanding. While the advent of Large Language Models (LLMs) offers a potential solution, they are still limited by their pre-training data and inference cost, particularly for complex queries,...

10.1609/aaai.v39i12.33447 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Modern online service providers such as online shopping platforms often provide both search and recommendation (S&R) services to meet different user needs. Rarely has there been any effective means of incorporating user behavior data from both S&R services. Most existing approaches either simply treat S&R behaviors separately, or jointly optimize them by aggregating data from both services, ignoring the fact that user intents in S&R can be distinctively different. In our paper, we propose a Search-Enhanced framework for Sequential...

10.1145/3539618.3591786 article EN Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023-07-18

In intelligent logistics systems, predicting the Estimated Time of Pick-up Arrival (ETPA) of packages is a crucial task, which aims to predict the courier’s arrival time to all the unpicked-up packages at any time. Accurate prediction of ETPA can help systems alleviate customers’ waiting anxiety and improve their experience. We identify three main challenges of this problem. First, unlike the travel time estimation problem in other fields like ride-hailing, the ETPA task is distinctively a multi-destination and path-free prediction problem. Second, an intuitive idea...

10.1145/3582561 article EN ACM Transactions on Intelligent Systems and Technology 2023-02-08

Ensuring the long-term sustainability of recommender systems (RS) emerges as a crucial issue. Traditional offline evaluation methods for RS typically focus on immediate user feedback, such as clicks, but they often neglect the long-term impact of content creators. On real-world content platforms, creators can strategically produce and upload new items based on user feedback and preference trends. While previous studies have attempted to model creator behavior, they often overlook the role of information asymmetry. This asymmetry arises because...

10.48550/arxiv.2502.07307 preprint EN arXiv (Cornell University) 2025-02-11

Recommender systems are currently widely used in various applications, helping people filter information. Existing models always embed the rich information for recommendation, such as items, users, and contexts, as real-value vectors, and make predictions based on these vectors. In view of causal inference, the associations between representation vectors and user feedback are inevitably a mixture of the causal part that describes why a user prefers an item, and the non-causal part that merely reflects the statistical dependencies, for example, display...

10.1145/3582425 article EN ACM Transactions on Information Systems 2023-02-01

Legal case matching, which automatically constructs a model to estimate the similarities between source and target cases, has played an essential role in intelligent legal systems. Semantic text matching models have been applied to the task, where the source and target legal cases are considered as long-form text documents. These general-purpose matching models make predictions solely based on the texts in the legal cases, overlooking the essential role of law articles in legal case matching. In the real world, the matching results (e.g., relevance labels) are dramatically affected by the law articles because the contents and judgments of a legal case are radically...

10.1145/3539618.3591709 article EN Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023-07-18

The confluence of Search and Recommendation (S&R) services is vital to online services, including e-commerce and video platforms. The integration of S&R modeling is a highly intuitive approach adopted by industry practitioners. However, there is a noticeable lack of research conducted in this area within academia, primarily due to the absence of publicly available datasets. Consequently, a substantial gap has emerged between academia and industry regarding research endeavors in joint optimization using user behavior data from both services. To...

10.1145/3583780.3615123 article EN 2023-10-21

Providing human-understandable explanations for the matching predictions is still challenging for current legal case matching methods. One difficulty is that legal cases are semi-structured text documents with complicated case-case and case-law article correlations. To tackle the issue, we propose a novel graph optimal transport (GOT)-based matching model that is able to provide not only the matching predictions but also plausible and faithful explanations for the prediction. The model, called GEIOT-Match, first constructs a heterogeneous graph to explicitly represent the semi-structured nature of legal cases and their...

10.1109/tkde.2023.3321935 article EN IEEE Transactions on Knowledge and Data Engineering 2023-10-13

In this paper, we address the issue of using logic rules to explain the results from legal case retrieval. The task is critical to legal case retrieval because the users (e.g., lawyers or judges) are highly specialized and require the system to provide logical, faithful, and interpretable explanations before making legal decisions. Recently, research efforts have been made to learn explainable legal case retrieval models. However, these methods usually select rationales (key sentences) from legal cases as explanations, failing to provide faithful and logically correct...

10.48550/arxiv.2403.01457 preprint EN arXiv (Cornell University) 2024-03-03

The retrieval phase is a vital component in recommendation systems, requiring the model to be effective and efficient. Recently, generative retrieval has become an emerging paradigm for document retrieval, showing notable performance. These methods enjoy merits like being end-to-end differentiable, suggesting their viability in recommendation. However, these methods fall short in efficiency and effectiveness for large-scale recommendations. To obtain efficiency and effectiveness, this paper introduces a generative retrieval framework, namely SEATER, which...

10.48550/arxiv.2309.13375 preprint EN cc-by-nc-nd arXiv (Cornell University) 2023-01-01

Recent advancements in Large Language Models (LLMs) have attracted considerable interest among researchers to leverage these models to enhance Recommender Systems (RSs). Existing work predominantly utilizes LLMs to generate knowledge-rich texts or utilizes LLM-derived embeddings as features to improve RSs. Although the extensive world knowledge embedded in LLMs generally benefits RSs, the application can only take a limited number of users and items as inputs, without adequately exploiting collaborative filtering information....

10.48550/arxiv.2403.17688 preprint EN arXiv (Cornell University) 2024-03-26

10.1145/3626772.3657732 article EN Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval 2024-07-10

Recent research on query generation has focused on using Large Language Models (LLMs), which, despite bringing state-of-the-art performance, also introduce issues with hallucinations in the generated queries. In this work, we introduce relevance hallucination and factuality hallucination as a new typology for hallucination problems brought by query generation based on LLMs. We propose an effective way to separate content from form in LLM-generated queries, which preserves the factual knowledge extracted and integrated from the inputs and compiles the syntactic structure, including...

10.48550/arxiv.2410.11366 preprint EN arXiv (Cornell University) 2024-10-15

Retrieval-Augmented Generation (RAG) models are designed to incorporate external knowledge, reducing hallucinations caused by insufficient parametric (internal) knowledge. However, even with accurate and relevant retrieved content, RAG models can still produce hallucinations by generating outputs that conflict with the retrieved information. Detecting such hallucinations requires disentangling how Large Language Models (LLMs) utilize external and parametric knowledge. Current detection methods often focus on one of these mechanisms or without decoupling their intertwined effects,...

10.48550/arxiv.2410.11414 preprint EN arXiv (Cornell University) 2024-10-15

In recommender systems, the retrieval phase is at the first stage and of paramount importance, requiring both effectiveness and very high efficiency. Recently, generative retrieval methods such as DSI and NCI, offering the benefit of end-to-end differentiability, have become an emerging paradigm for document retrieval with notable performance improvement, suggesting their potential applicability in recommendation scenarios. A fundamental limitation of these methods is their approach of generating item identifiers as text inputs, which fails to capture...

10.1145/3673791.3698408 article EN 2024-12-08

In search scenarios, user experience can be hindered by erroneous queries due to typos, voice errors, or knowledge gaps. Therefore, query correction is crucial for search engines. Current correction models, usually small models trained on specific data, often struggle with queries beyond their training scope or those requiring contextual understanding. While the advent of Large Language Models (LLMs) offers a potential solution, they are still limited by their pre-training data and inference cost, particularly for complex queries,...

10.48550/arxiv.2412.12701 preprint EN arXiv (Cornell University) 2024-12-17

The explosion of data over the last decades puts significant strain on the computational capacity of the central processing unit (CPU), challenging online analytical processing (OLAP). While previous studies have shown the potential of using Field Programmable Gate Arrays (FPGAs) in database systems, integrating FPGA-based hardware acceleration with relational databases remains challenging because of the complex nature of relational database operations and the need for specialized FPGA programming skills. Additionally, there are challenges related to optimizing...

10.1016/j.parco.2024.103064 article EN cc-by Parallel Computing 2024-02-01

Incorporating Search and Recommendation (S&R) services within a singular application is prevalent in online platforms, leading to a new task termed open-app motivation prediction, which aims to predict whether users initiate the application with the specific intent of information searching, or to explore recommended content for entertainment. Studies have shown that predicting users' motivation to open an app can help to improve user engagement and enhance performance in various downstream tasks. However, accurately predicting open-app motivation is not trivial, as it...

10.48550/arxiv.2404.03267 preprint EN arXiv (Cornell University) 2024-04-04

The significance of modeling long-term user interests for CTR prediction tasks in large-scale recommendation systems is progressively gaining attention among researchers and practitioners. Existing work, such as SIM and TWIN, typically employs a two-stage approach to model long-term user behavior sequences for efficiency concerns. The first stage rapidly retrieves a subset related to the target item from the long sequence using a search-based mechanism, namely the General Search Unit (GSU), while the second stage calculates the interest scores...

10.1145/3627673.3680030 preprint EN 2024-10-20

Legal case matching, which automatically constructs a model to estimate the similarities between source and target cases, has played an essential role in intelligent legal systems. Semantic text matching models have been applied to the task, where the source and target legal cases are considered as long-form text documents. These general-purpose matching models make predictions solely based on the texts in the legal cases, overlooking the essential role of law articles in legal case matching. In the real world, the matching results (e.g., relevance labels) are dramatically affected by the law articles because the contents and judgments of a legal case are radically...

10.48550/arxiv.2210.11012 preprint EN other-oa arXiv (Cornell University) 2022-01-01