Shuguang Han

ORCID: 0000-0003-1416-6960
Research Areas
  • Information Retrieval and Search Behavior
  • Expert finding and Q&A systems
  • Recommender Systems and Techniques
  • Mobile Crowdsensing and Crowdsourcing
  • Optimization and Search Problems
  • Web Data Mining and Analysis
  • Advanced Bandit Algorithms Research
  • Advanced Text Analysis Techniques
  • Topic Modeling
  • Wikis in Education and Collaboration
  • Supply Chain and Inventory Management
  • Complex Network Analysis Techniques
  • Advanced Manufacturing and Logistics Optimization
  • Human Mobility and Location-Based Analysis
  • Scheduling and Optimization Algorithms
  • Semantic Web and Ontologies
  • Image and Video Quality Assessment
  • Data Stream Mining Techniques
  • Personal Information Management and User Behavior
  • Misinformation and Its Impacts
  • Consumer Market Behavior and Pricing
  • Data Quality and Management
  • Graph Labeling and Dimension Problems
  • Auction Theory and Applications
  • Natural Language Processing Techniques

Zhejiang Sci-Tech University
2007-2025

Alibaba Group (China)
2021-2022

University of Alberta
2020-2022

Alibaba Group (United States)
2021

Google (United States)
2019-2020

University of Pittsburgh
2012-2018

Wuhan University
2008-2011

Zhejiang University
2006-2011

Zhejiang University of Science and Technology
2006

Keyphrases provide highly summative information that can be effectively used for understanding, organizing and retrieving text content. Though previous studies have provided many workable solutions for automated keyphrase extraction, they commonly divided the to-be-summarized content into multiple chunks, then ranked and selected the most meaningful ones. These approaches could neither identify keyphrases that do not appear in the text, nor capture the real semantic meaning behind the text. We propose a generative model...

10.18653/v1/p17-1054 article EN cc-by Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2017-01-01

This paper describes a machine learning algorithm for document (re)ranking, in which queries and documents are first encoded using BERT [1], and on top of that a learning-to-rank (LTR) model constructed with TF-Ranking (TFR) [2] is applied to further optimize the ranking performance. This approach proved to be effective on the public MS MARCO benchmark [3]. Our first two submissions achieve the best performance for the passage re-ranking task [4], and the second best performance for the passage full-ranking task as of April 10, 2020 [5]. To leverage the recent development...

10.48550/arxiv.2004.08476 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Learning to Rank, a central problem in information retrieval, is a class of machine learning algorithms that formulate ranking as an optimization task. The objective is to learn a function that produces an ordering of a set of documents in such a way that the utility of the entire ordered list is maximized. Learning-to-rank methods do so by learning a function that computes a score for each document in the set. A ranked list is then compiled by sorting documents according to their scores. While such a deterministic mapping of scores to permutations makes sense during inference where stability of ranked lists...

10.1145/3336191.3371844 article EN 2020-01-20
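
The score-then-sort step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' model: `rank_by_score` and `toy_score` are hypothetical names, and the scoring function is a hand-written stand-in for a learned one.

```python
from typing import Callable, List, Sequence, TypeVar

D = TypeVar("D")


def rank_by_score(docs: Sequence[D], score_fn: Callable[[D], float]) -> List[D]:
    """Score each document independently, then sort by descending score."""
    scored = [(score_fn(doc), index, doc) for index, doc in enumerate(docs)]
    # Deterministic mapping from scores to a permutation:
    # higher score first, ties broken by original position.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc for _, _, doc in scored]


def toy_score(doc: str) -> float:
    """Hypothetical stand-in for a learned scoring function."""
    return float(doc.lower().split().count("ranking"))


docs = [
    "ranking with ranking losses",
    "a note on cooking",
    "learning ranking functions",
]
ranked = rank_by_score(docs, toy_score)
```

Because each document is scored on its own and the sort is deterministic, the same scores always yield the same list, which is the inference-time stability the abstract refers to.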

Deep learning techniques have been applied widely in industrial recommendation systems. However, far less attention has been paid to the overfitting problem of models in recommendation systems, which, on the contrary, is recognized as a critical issue for deep neural networks. In the context of Click-Through Rate (CTR) prediction, we observe an interesting one-epoch overfitting problem: the model performance exhibits a dramatic degradation at the beginning of the second epoch. Such a phenomenon has been witnessed in real-world applications of CTR models. Thereby, the best...

10.1145/3511808.3557479 article EN Proceedings of the 31st ACM International Conference on Information & Knowledge Management 2022-10-15

The node representation learning capability of Graph Convolutional Networks (GCNs) is fundamentally constrained by dynamic instability during feature propagation, yet existing research lacks systematic theoretical analysis of stability control mechanisms. This paper proposes a Stability-Optimized Graph Convolutional Network (SO-GCN) that enhances training stability and expressiveness in shallow architectures through continuous–discrete dual-domain constraints. By constructing continuous dynamical equations for GCNs...

10.3390/math13050761 article EN cc-by Mathematics 2025-02-26

The two primary tasks in the search recommendation system are relevance matching and click-through rate (CTR) prediction: the former focuses on seeking relevant items for user queries, whereas the latter forecasts which item may better match user interest. Prior research typically develops two models to predict relevance and CTR separately, then ranks candidate items based on the fusion of the two outputs. However, such a divide-and-conquer paradigm creates inconsistency between the different models. Meanwhile, the relevance model mainly concentrates on the degree...

10.48550/arxiv.2503.18395 preprint EN arXiv (Cornell University) 2025-03-24

Mobile devices enable people to look for information at the moment when their information needs are triggered. While experiencing complex information needs that require multiple search sessions, users may utilize desktop computers to fulfill information needs started on mobile devices. Under the context of mobile-to-desktop web search, this article analyzes users’ behavioral patterns and compares them to the patterns in desktop-to-desktop web search. Then, we examine several approaches of using Mobile Touch Interactions (MTIs) to infer relevant content so that such content can be used...

10.1145/2738036 article EN ACM Transactions on Information Systems 2015-04-23

An industrial recommender system generally presents a hybrid list that contains results from multiple subsystems. In practice, each subsystem is optimized with its own feedback data to avoid the disturbance among different subsystems. However, we argue that such data usage may lead to sub-optimal online performance because of the data sparsity. To alleviate this issue, we propose to extract knowledge from the super-domain that contains web-scale and long-time impression data, and further assist the online recommendation task (downstream task). To this end, we propose a novel...

10.1145/3511808.3557106 article EN Proceedings of the 31st ACM International Conference on Information & Knowledge Management 2022-10-16

People search has been an active research topic in recent years. Related work includes expert finding, collaborator recommendation, link prediction and social matching. However, the diverse objectives and exploratory nature of those tasks make it difficult to develop a flexible method for people search that works for every task. In this project, we developed PeopleExplorer, an interactive people search system to support exploratory search tasks when looking for people. In the system, users could specify their task objectives by selecting and adjusting key criteria. Three criteria were...

10.1145/2505515.2505684 article EN 2013-10-27

Investigations of search processes that involve complex interactions, such as collaborative search processes, are important research topics. Previous approaches of directly applying individual search process models to collaborative settings have proven to be problematic. In this paper, we proposed an innovative approach to model collaborative search processes using a Hidden Markov Model (HMM), which is an automatic technique for analyzing temporal sequential data. Obtained through a user study, the data used in this paper consist of two different tasks both...

10.1145/2531602.2531658 article EN 2014-02-07

Conversion rate (CVR) prediction is one of the most critical tasks for digital display advertising. Commercial systems often need to update models in an online learning manner to catch up with the evolving data distribution. However, conversions usually do not happen immediately after user clicks. This may result in inaccurate labeling, which is called the delayed feedback problem. In previous studies, the delayed feedback problem is handled either by waiting for the positive label for a long period of time, or by consuming the negative sample on its...

10.1609/aaai.v35i5.16587 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

As the focus on environmental sustainability sharpens, the significance of low-carbon manufacturing and energy conservation continues to rise. While traditional flexible job shop scheduling strategies are primarily concerned with minimizing completion times, they often overlook the energy consumption of machines. To address this gap, this paper introduces a novel solution utilizing deep reinforcement learning. The study begins by defining the Low-carbon Flexible Job Shop Scheduling problem (LC-FJSP) and constructing...

10.3390/su16114544 article EN Sustainability 2024-05-27

In this paper, we report the results of our participation in the TREC-COVID challenge. To meet the challenge of building a search engine for a rapidly evolving biomedical collection, we propose a simple yet effective weighted hierarchical rank fusion approach that ensembles together 102 runs from (a) lexical and semantic retrieval systems, (b) pre-trained and fine-tuned BERT rankers, and (c) relevance feedback runs. Our ablation studies demonstrate the contributions of each of these systems to the overall ensemble. The submitted...

10.48550/arxiv.2010.00200 preprint EN other-oa arXiv (Cornell University) 2020-01-01
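
The abstract above does not spell out how its weighted hierarchical rank fusion is computed, so the following Python sketch substitutes a well-known technique, weighted reciprocal rank fusion, to illustrate the general idea of combining several ranked runs. The function name `weighted_rrf`, the constant `k = 60`, the run contents and the weights are all assumptions made for illustration, not details from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def weighted_rrf(
    runs: Sequence[List[str]], weights: Sequence[float], k: int = 60
) -> List[str]:
    """Fuse ranked lists of document ids with weighted reciprocal rank fusion."""
    if len(runs) != len(weights):
        raise ValueError("each run needs exactly one weight")
    fused: Dict[str, float] = defaultdict(float)
    for run, weight in zip(runs, weights):
        for rank, doc_id in enumerate(run, start=1):
            # A document earns more credit the higher it is ranked in a run.
            fused[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


# Hypothetical runs, e.g. from a lexical, a neural and a feedback system.
runs = [["d1", "d2", "d3"], ["d2", "d3", "d1"], ["d3", "d2"]]
weights = [1.0, 1.0, 0.5]
fused_ranking = weighted_rrf(runs, weights)
```

One plausible reading of "hierarchical" is to fuse the runs within each system family first and then fuse the resulting lists with the same function; that reading is an assumption, since the truncated abstract does not describe the hierarchy.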

Past analysis has considered query reformulation primarily from the perspective of individual Web searches. Findings from a recent study suggest ways that collaboration during the search process influences how users generate new terms for query reformulation.

10.1109/mc.2014.62 article EN Computer 2014-03-01

Although the volume of online educational resources has dramatically increased in recent years, many of these resources are isolated and distributed in diverse websites and databases. This hinders the discovery and overall usage of online educational resources. By using linking between related subsections of online textbooks as a testbed, this paper explores multiple knowledge-based content linking algorithms for connecting online educational resources. We focus on examining semantic-based methods for identifying important knowledge components and their usefulness in linking book subsections. To overcome...

10.1109/wi.2016.0014 article EN IEEE/WIC/ACM International Conference on Web Intelligence (WI) 2016-10-01

Open user modeling has been perceived as an important mechanism to enhance the effectiveness of personalization. However, several studies have reported that open and editable user models can harm the performance of personalized search systems. This paper re-examines the value of open and editable user models in the context of personalized search. We implemented a personalized search system with a 2D manipulatable visualization of concept-based user model components. A user study result suggests that the proposed visualization-based open user modeling approach can be beneficial for adaptive search.

10.1145/2678025.2701410 article EN 2015-03-18

Parsing the semantic structure of a web page is a key component of web information extraction. Successful extraction algorithms usually require large-scale training and evaluation datasets, which are difficult to acquire. Recently, crowdsourcing has proven to be an effective method of collecting large-scale training data in domains that do not require much domain knowledge. For more complex domains, researchers have proposed sophisticated quality control mechanisms to replicate tasks in parallel or sequential ways and then aggregate responses...

10.1145/2870649 article EN ACM Transactions on Intelligent Systems and Technology 2016-04-25