NFDI4DS | UHH-SEMS - Publication Details

Wei Chu

ORCID: 0000-0002-4595-388X

About

Contact & Profiles

OpenAlex ID: A5101906116

Research Areas

Pleistocene-Era Hominins and Archaeology
Ancient and Medieval Archaeology Studies
Geology and Paleoclimatology Research
Speech Recognition and Synthesis
Archaeology and ancient environmental studies
Music and Audio Processing
Advanced Bandit Algorithms Research
Image Processing and 3D Reconstruction
Speech and Audio Processing
Recommender Systems and Techniques
Marine and environmental studies
Natural Language Processing Techniques
Geological Formations and Processes Exploration
Information Retrieval and Search Behavior
Web Data Mining and Analysis
Machine Learning and Algorithms
Topic Modeling
Protein Structure and Dynamics
Expert finding and Q&A systems
Optimization and Search Problems
Machine Learning in Bioinformatics
Speech and dialogue systems
Text and Document Classification Technologies
Forensic Anthropology and Bioarchaeology Studies
Multimodal Machine Learning Applications

Leiden University
2009-2024

University of Cologne
2014-2021

National Yang Ming Chiao Tung University
2021

Deutsches Archäologisches Institut, Zentrale
2021

Snap (United States)
2019

Zhejiang Financial College
2018

Alibaba Group (China)
2017-2018

Alibaba Group (United States)
2017

Microsoft (United States)
2011-2015

University of Reading
2012-2013

A contextual-bandit approach to personalized news article recommendation

OPENALEX - Publications

Lihong Li Wei Chu John Langford Robert E. Schapire

Personalized web services strive to adapt their services (advertisements, news articles, etc.) to individual users by making use of both content and user information. Despite a few recent advances, this problem remains challenging for at least two reasons. First, web service is featured with dynamically changing pools of content, rendering traditional collaborative filtering methods inapplicable. Second, the scale of most web services of practical interest calls for solutions that are both fast in learning and computation. In this work, we model...

10.1145/1772690.1772758 preprint EN 2010-04-26

Application of the topographic position index to heterogeneous landscapes

OPENALEX - Publications

Jeroen De Reu Jean Bourgeois Machteld Bats Ann Zwertvaegher Vanessa Gelorini and 8 more

10.1016/j.geomorph.2012.12.015 article EN Geomorphology 2012-12-22

Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms

OPENALEX - Publications

Lihong Li Wei Chu John Langford Xuanhui Wang

Contextual bandit algorithms have become popular for online recommendation systems such as Digg, Yahoo! Buzz, and news recommendation in general. Offline evaluation of the effectiveness of new algorithms in these applications is critical for protecting online user experiences but very challenging due to their "partial-label" nature. Common practice is to create a simulator which simulates the online environment for the problem at hand and then run an algorithm against this simulator. However, creating the simulator itself is often difficult and modeling bias is usually...

10.1145/1935826.1935878 preprint EN 2011-02-01

Pairwise preference regression for cold-start recommendation

OPENALEX - Publications

Seung-Taek Park Wei Chu

Recommender systems are widely used in online e-commerce applications to improve user engagement and then to increase revenue. A key challenge for recommender systems is providing high quality recommendation to users in "cold-start" situations. We consider three types of cold-start problems: 1) recommendation on existing items for new users; 2) recommendation on new items for existing users; 3) recommendation on new items for new users. We propose predictive feature-based regression models that leverage all available information of users and items, such as user demographic information and item content features, to tackle cold-start problems. The resulting...

10.1145/1639714.1639720 article EN 2009-10-23

Modeling the impact of short- and long-term behavior on search personalization

OPENALEX - Publications

Paul N. Bennett Ryen W. White Wei Chu Susan Dumais Peter Bailey and 2 more

User behavior provides many cues to improve the relevance of search results through personalization. One aspect of user behavior that provides especially strong signals for delivering better relevance is an individual's history of queries and clicked documents. Previous studies have explored how short-term behavior or long-term behavior can be predictive of relevance. Ours is the first study to assess how short-term (session) behavior and long-term (historic) behavior interact, and how each may be used in isolation or in combination to optimally contribute to gains in relevance through search personalization. Our key findings include: historic behavior provides substantial benefits at...

10.1145/2348283.2348312 article EN Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval 2012-08-12

AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine

OPENALEX - Publications

Minghui Qiu Feng-Lin Li Siyu Wang Xing Gao Yan Chen and 4 more

Minghui Qiu, Feng-Lin Li, Siyu Wang, Xing Gao, Yan Chen, Weipeng Zhao, Haiqing Chen, Jun Huang, Wei Chu. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2017.

10.18653/v1/p17-2079 article EN cc-by 2017-01-01

Personalized recommendation on dynamic content using predictive bilinear models

OPENALEX - Publications

Wei Chu Seung-Taek Park

In Web-based services of dynamic content (such as news articles), recommender systems face the difficulty of timely identifying new items of high quality and providing recommendations for new users. We propose a feature-based machine learning approach to personalized recommendation that is capable of handling the cold-start issue effectively. We maintain profiles of content of interest, in which temporal characteristics of the content, e.g. popularity and freshness, are updated in a real-time manner. We also maintain profiles of users including demographic...

10.1145/1526709.1526802 article EN 2009-04-20

Enhancing personalized search by mining and modeling task behavior

OPENALEX - Publications

Ryen W. White Wei Chu Ahmed H. Yousef Xiaodong He Yang Song and 1 more

Personalized search systems tailor search results to the current user intent using historic search interactions. This relies on being able to find pertinent information in that user's search history, which can be challenging for unseen queries and for new search scenarios. Building richer models of users' current and historic search tasks can help improve the likelihood of finding relevant content and enhance the relevance and coverage of personalization methods. The task-based approach can be applied to the current user's search history, or as we focus on here, to all users' search histories as so-called "groupization" (a variant of personalization whereby other...

10.1145/2488388.2488511 article EN 2013-05-13

A Short-Term Rainfall Prediction Model Using Multi-task Convolutional Neural Networks

OPENALEX - Publications

Minghui Qiu Peilin Zhao Ke Zhang Jun Huang Xing Shi and 2 more

Precipitation prediction, such as short-term rainfall prediction, is a very important problem in the field of meteorological service. In practice, most recent studies focus on leveraging radar data or satellite images to make predictions. However, there is another scenario where a set of weather features are collected by various sensors at multiple observation sites. The observations of a site are sometimes incomplete but provide important clues for weather prediction at nearby sites, which are not fully exploited in existing work yet. To solve...

10.1109/icdm.2017.49 article EN 2017 IEEE International Conference on Data Mining (ICDM) 2017-11-01

Modelling Domain Relationships for Transfer Learning on Retrieval-based Question Answering Systems in E-commerce

OPENALEX - Publications

Jianfei Yu Minghui Qiu Jing Jiang Jun Huang Shuangyong Song and 2 more

Nowadays, it is a heated topic for many industries to build automatic question-answering (QA) systems. A key solution to these QA systems is to retrieve from a QA knowledge base the most similar question of a given question, which can be reformulated as a paraphrase identification (PI) or a natural language inference (NLI) problem. However, most existing models for PI and NLI have at least two problems: they rely on a large amount of labeled data, which is not always available in real scenarios, and they may not be efficient for industrial...

10.1145/3159652.3159685 article EN 2018-02-02

ESCM²: Entire Space Counterfactual Multi-Task Model for Post-Click Conversion Rate Estimation

OPENALEX - Publications

Hao Wang Tai-Wei Chang Tianqiao Liu Jianmin Huang Zhichao Chen and 3 more

Accurate estimation of post-click conversion rate is critical for building recommender systems, which has long been confronted with sample selection bias and data sparsity issues. Methods in the Entire Space Multi-task Model (ESMM) family leverage the sequential pattern of user actions, i.e. impression → click → conversion, to address the data sparsity issue. However, they still fail to ensure the unbiasedness of CVR estimates. In this paper, we theoretically demonstrate that ESMM suffers from the following...

10.1145/3477495.3531972 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

Learning to extract cross-session search tasks

OPENALEX - Publications

Hongning Wang Yang Song Ming‐Wei Chang Xiaodong He Ryen W. White and 1 more

Search tasks, comprising a series of search queries serving the same information need, have recently been recognized as an accurate atomic unit for modeling user search intent. Most prior research in this area has focused on short-term search tasks within a single search session, and heavily depend on human annotations for supervised classification model learning. In this work, we target the identification of long-term, or cross-session, search tasks (transcending session boundaries) by investigating inter-query dependencies learned from users'...

10.1145/2488388.2488507 article EN 2013-05-13

Personalized ranking model adaptation for web search

OPENALEX - Publications

Hongning Wang Xiaodong He Ming‐Wei Chang Yang Song Ryen W. White and 1 more

Search engines train and apply a single generic ranking model across all users, but searchers' information needs are diverse and cover a broad range of topics. Hence, a single user-independent ranking model is insufficient to satisfy different users' result preferences. Conventional personalization methods learn separate models of user interests and use those to re-rank the results from the generic model. Those methods require significant user history to learn user preferences, have low coverage in the case of memory-based methods that learn direct associations between query-URL pairs,...

10.1145/2484028.2484068 article EN Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval 2013-07-28

Generating Informative Conversational Response using Recurrent Knowledge-Interaction and Knowledge-Copy

OPENALEX - Publications

Xiexiong Lin Weiyu Jian Jianshan He Taifeng Wang Wei Chu

Knowledge-driven conversation approaches have achieved remarkable research attention recently. However, generating an informative response with multiple relevant knowledge without losing fluency and coherence is still one of the main challenges. To address this issue, this paper proposes a method that uses recurrent knowledge interaction among response decoding steps to incorporate appropriate knowledge. Furthermore, we introduce a knowledge copy mechanism using a knowledge-aware pointer network to copy words from external knowledge according...

10.18653/v1/2020.acl-main.6 article EN 2020-01-01

Dynamics of Learning in Neanderthals and Modern Humans

OPENALEX - Publications

Wei Chu

10.1179/0197726114z.00000000045 article EN Lithic Technology 2014-10-09

The Danube Corridor Hypothesis and the Carpathian Basin: Geological, Environmental and Archaeological Approaches to Characterizing Aurignacian Dynamics

OPENALEX - Publications

Wei Chu

Early Upper Paleolithic sites in the Danube catchment have been put forward as evidence that the river was an important conduit for modern humans during their initial settlement of Europe. Central to this model is the Carpathian Basin, a region covering most of the Middle Danube. As the archaeological record of this region is still poorly understood, this paper aims to provide a contextual assessment of the Carpathian Basin's geological and paleoenvironmental archives, starting with the late Pleistocene. Subsequently, it compiles early Upper Paleolithic data from the region to provide a synchronic...

10.1007/s10963-018-9115-1 article EN cc-by Journal of World Prehistory 2018-05-29

The Crvenka loess-paleosol sequence: A record of continuous grassland domination in the southern Carpathian Basin during the Late Pleistocene

OPENALEX - Publications

Slobodan B. Marković Pál Sümegi Thomas Stevens Randall J. Schaetzl Igor Obreht and 9 more

10.1016/j.palaeo.2018.03.019 article EN Palaeogeography Palaeoclimatology Palaeoecology 2018-03-21

An Attentive Dual-Encoder Framework Leveraging Multimodal Visual and Semantic Information for Automatic OSAHS Diagnosis

OPENALEX - Publications

Yingchen Wei Xihe Qiu Xiaoyu Tan Jingjing Huang Wei Chu and 2 more

10.1109/icassp49660.2025.10888243 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

A case study of behavior-driven conjoint analysis on Yahoo!

OPENALEX - Publications

Wei Chu Seung-Taek Park Todd Beaupre Nitin Motgi Amit Phadke and 2 more

Conjoint analysis is one of the most popular market research methodologies for assessing how customers with heterogeneous preferences appraise various objective characteristics in products or services, which provides critical inputs for many marketing decisions, e.g. optimal design of new products and target market selection. Nowadays it becomes practical in e-commercial applications to collect millions of samples quickly. However, the large-scale data sets make traditional conjoint analysis coupled with sophisticated Monte Carlo...

10.1145/1557019.1557138 article EN 2009-06-28

Hybrid CTC-Attention based End-to-End Speech Recognition using Subword Units

OPENALEX - Publications

Zhangyu Xiao Zhijian Ou Wei Chu Hui Lin

In this paper, we present an end-to-end automatic speech recognition system, which successfully employs subword units in a hybrid CTC-Attention based system. The subword units are obtained by the byte-pair encoding (BPE) compression algorithm. Compared to using words as modeling units, using characters or subword units does not suffer from the out-of-vocabulary (OOV) problem. Furthermore, using subword units further offers a capability in modeling longer context than using characters. We evaluate different systems over the LibriSpeech 1000h dataset. The subword-based system...

10.1109/iscslp.2018.8706675 article EN 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP) 2018-11-01

CASS-NAT: CTC Alignment-Based Single Step Non-Autoregressive Transformer for Speech Recognition

OPENALEX - Publications

Ruchao Fan Wei Chu Peng Chang Jing Xiao

We propose a CTC alignment-based single step non-autoregressive transformer (CASS-NAT) for speech recognition. Specifically, the CTC alignment contains the information of (a) the number of tokens for decoder input, and (b) the time span of acoustics for each token. The information is used to extract an acoustic representation for each token in parallel, referred to as token-level acoustic embedding, which substitutes the word embedding in autoregressive transformer (AT) to achieve parallel generation in the decoder. During inference, an error-based alignment sampling method is proposed to be applied to the CTC output...

10.1109/icassp39728.2021.9413429 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

Singing Voice Conversion with Non-parallel Data

OPENALEX - Publications

Xin Chen Wei Chu Jinxi Guo Ning Xu

Singing voice conversion is a task to convert a song sung by a source singer to the voice of a target singer. In this paper, we propose using a parallel data free, many-to-one voice conversion technique on singing voices. A phonetic posterior feature is first generated by decoding singing voices through a robust Automatic Speech Recognition Engine (ASR). Then, a trained Recurrent Neural Network (RNN) with a Deep Bidirectional Long Short Term Memory (DBLSTM) structure is used to model the mapping from person-independent content to the acoustic features of the target person...

10.1109/mipr.2019.00059 preprint EN 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR) 2019-03-01

Learning to Recommend Related Entities to Search Users

OPENALEX - Publications

Bin Bi Hao Ma Bo-June Hsu Wei Chu Kuansan Wang and 1 more

Over the past few years, major web search engines have introduced knowledge bases to offer popular facts about people, places, and things on the entity pane next to regular search results. In addition to information about the entity searched by the user, the entity pane often provides a ranked list of related entities. To keep users engaged, it is important to develop a recommendation model that tailors the related entities to individual user interests. We propose a probabilistic Three-way Entity Model (TEM) that provides personalized recommendation of related entities using three data sources: knowledge base, search click log,...

10.1145/2684822.2685304 article EN 2015-01-28
