Wei‐Ying Ma

ORCID: 0000-0002-7384-0735
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Image Retrieval and Classification Techniques
  • Web Data Mining and Analysis
  • Video Analysis and Summarization
  • Text and Document Classification Technologies
  • Data Management and Algorithms
  • Caching and Content Delivery
  • Complex Network Analysis Techniques
  • Topic Modeling
  • Face and Expression Recognition
  • Information Retrieval and Search Behavior
  • Multimedia Communication and Technology
  • Multimodal Machine Learning Applications
  • Geographic Information Systems Studies
  • Advanced Text Analysis Techniques
  • Advanced Clustering Algorithms Research
  • Recommender Systems and Techniques
  • Web Visibility and Informetrics
  • Image and Video Quality Assessment
  • Algorithms and Data Compression
  • Human Mobility and Location-Based Analysis
  • Computational Drug Discovery Methods
  • Music and Audio Processing
  • Visual Attention and Saliency Detection
  • Spam and Phishing Detection

Tsinghua University
2005-2024

Southwest Minzu University
2024

State Ethnic Affairs Commission
2024

Microsoft Research Asia (China)
2006-2017

Microsoft (United States)
2006-2017

Xian Yang Central Hospital
2016

Microsoft Research (United Kingdom)
2003-2016

Kashi University
2016

Shenyang University of Technology
2015

Japan External Trade Organization
2008

The increasing availability of GPS-enabled devices is changing the way people interact with the Web, and brings us a large amount of GPS trajectories representing people's location histories. In this paper, based on multiple users' GPS trajectories, we aim to mine interesting locations and classical travel sequences in a given geospatial region. Here, interesting locations mean the culturally important places, such as Tiananmen Square in Beijing, and frequented public areas, like shopping malls and restaurants, etc. Such information can help...

10.1145/1526709.1526816 article EN 2009-04-20

Among different recommendation techniques, collaborative filtering usually suffers from limited performance due to the sparsity of user-item interactions. To address these issues, auxiliary information is used to boost performance. Due to the rapid collection of information on the web, the knowledge base provides heterogeneous information, including both structured and unstructured data with semantics, which can be consumed by various applications. In this paper, we investigate how to leverage the heterogeneous information in a knowledge base to improve the quality of recommender systems. First,...

10.1145/2939672.2939673 article EN 2016-08-08

Both recognizing human behavior and understanding a user's mobility from sensor data are critical issues in ubiquitous computing systems. As a kind of user behavior, the transportation modes, such as walking, driving, etc., that a user takes can enrich the user's mobility with informative knowledge and provide pervasive computing systems with more context information. In this paper, we propose an approach based on supervised learning to infer people's motion modes from their GPS logs. The contribution of this work lies in the following two aspects. On one...

10.1145/1409635.1409677 article EN 2008-09-21

The pervasiveness of location-acquisition technologies (GPS, GSM networks, etc.) enables people to conveniently log the location histories they visited with spatio-temporal data. The increasing availability of large amounts of spatio-temporal data pertaining to an individual's trajectories has given rise to a variety of geographic information systems, and also brings us opportunities and challenges to automatically discover valuable knowledge from these trajectories. In this paper, we move towards this direction and aim to geographically mine...

10.1145/1463434.1463477 article EN 2008-11-05

While neural machine translation (NMT) has made good progress in the past two years, tens of millions of bilingual sentence pairs are needed for its training. However, human labeling is very costly. To tackle this training data bottleneck, we develop a dual-learning mechanism, which can enable an NMT system to automatically learn from unlabeled data through a dual-learning game. This mechanism is inspired by the following observation: any machine translation task has a dual task, e.g., English-to-French translation (primal) versus French-to-English translation (dual);...

10.48550/arxiv.1611.00179 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Organizing Web search results into clusters facilitates users' quick browsing through search results. Traditional clustering techniques are inadequate since they don't generate clusters with highly readable names. In this paper, we reformalize the clustering problem as a salient phrase ranking problem. Given a query and the ranked list of documents (typically a list of titles and snippets) returned by a certain Web search engine, our method first extracts and ranks salient phrases as candidate cluster names, based on a regression model learned from human labeled...

10.1145/1008992.1009030 article EN 2004-07-25

The increasing availability of location-acquisition technologies (GPS, GSM networks, etc.) enables people to log their location histories with spatio-temporal data. Such real-world location histories imply, to some extent, users' interests in places, and bring us opportunities to understand the correlation between users and locations. In this article, we move towards this direction and report on a personalized friend and location recommender for geographical information systems (GIS) on the Web. First, in this recommender system, a particular individual's visits on a geospatial...

10.1145/1921591.1921596 article EN ACM Transactions on the Web 2011-02-01

User mobility has given rise to a variety of Web applications, in which the global positioning system (GPS) plays many important roles in bridging between these applications and end users. As a kind of human behavior, transportation modes, such as walking and driving, can provide pervasive computing systems with more contextual information and enrich a user's mobility with informative knowledge. In this article, we report on an approach based on supervised learning to automatically infer users' transportation modes, including walking, taking a bus...

10.1145/1658373.1658374 article EN ACM Transactions on the Web 2010-01-01

Query expansion has long been suggested as an effective way to resolve the short query and word mismatching problems. A number of query expansion methods have been proposed in traditional information retrieval. However, these previous methods do not take into account the specific characteristics of web searching; in particular, the availability of a large amount of user interaction information recorded in web query logs. In this study, we propose a new method for query expansion based on query logs. The central idea is to extract probabilistic correlations between query terms and document terms by analyzing...

10.1145/511446.511489 article EN 2002-05-07

We consider incorporating topic information into the sequence-to-sequence framework to generate informative and interesting responses for chatbots. To this end, we propose a topic aware sequence-to-sequence (TA-Seq2Seq) model. The model utilizes topics to simulate prior knowledge of humans that guides them to form informative and interesting responses in conversation, and leverages the topic information in generation by a joint attention mechanism and a biased generation probability. The joint attention mechanism summarizes the hidden vectors of an input message as context vectors by message attention, synthesizes topic vectors by topic attention from the topic words of the message obtained from a pre-trained LDA model, and let...

10.48550/arxiv.1606.08340 preprint EN other-oa arXiv (Cornell University) 2016-01-01

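We consider incorporating topic information into a sequence-to-sequence framework to generate informative and interesting responses for chatbots. To this end, we propose a topic aware sequence-to-sequence (TA-Seq2Seq) model. The model utilizes topics to simulate prior human knowledge that guides them to form informative and interesting responses in conversation, and leverages topic information in generation by a joint attention mechanism and a biased generation probability. The joint attention mechanism summarizes the hidden vectors of an input message as context vectors by message attention and synthesizes topic vectors by topic attention from the topic words of the message obtained from a pre-trained LDA model, with these vectors jointly...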

10.1609/aaai.v31i1.10981 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2017-02-12

Relevant Component Analysis (RCA) has been proposed for learning distance metrics with contextual constraints for image retrieval. However, RCA has two important disadvantages. One is the lack of exploiting negative constraints, which can also be informative, and the other is its incapability of capturing complex nonlinear relationships between data instances with the contextual information. In this paper, we propose two algorithms to overcome these two disadvantages, i.e., Discriminative Component Analysis (DCA) and Kernel DCA. Compared with other complicated methods for distance metric...

10.1109/cvpr.2006.167 article EN 2006-07-10

This paper introduces the Attribute-Decomposed GAN, a novel generative model for controllable person image synthesis, which can produce realistic person images with desired human attributes (e.g., pose, head, upper clothes and pants) provided in various source inputs. The core idea of the proposed model is to embed human attributes into the latent space as independent codes and thus achieve flexible and continuous control of attributes via mixing and interpolation operations in explicit style representations. Specifically, a new architecture consisting of two...

10.1109/cvpr42600.2020.00513 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

A novel boundary detection scheme based on "edge flow" is proposed in this paper. This scheme utilizes a predictive coding model to identify the direction of change in color and texture at each image location at a given scale, and constructs an edge flow vector. By propagating the edge flow vectors, the boundaries can be detected at image locations which encounter two opposite directions of flow in the stable state. A user defined image scale is the only significant control parameter that is needed by the algorithm. The scheme facilitates integration of color and texture into a single framework for...

10.1109/83.855433 article EN IEEE Transactions on Image Processing 2000-01-01

We consider the problem of clustering Web image search results. Generally, the image search results returned by an image search engine contain multiple topics. Organizing the results into different semantic clusters facilitates users' browsing. In this paper, we propose a hierarchical clustering method using visual, textual and link analysis. By using a vision-based page segmentation algorithm, a web page is partitioned into blocks, and the textual and link information of an image can be accurately extracted from the block containing that image. By using block-level link analysis techniques, an image graph can be constructed. We then...

10.1145/1027527.1027747 article EN 2004-10-10

Queries to search engines on the Web are usually short. They do not provide sufficient information for an effective selection of relevant documents. Previous research has proposed the utilization of query expansion to deal with this problem. However, expansion terms are determined by term co-occurrences within documents. In this study, we propose a new method for query expansion based on user interactions recorded in user logs. The central idea is to extract correlations between query terms and document terms by analyzing user logs. These correlations are then used to select high-quality expansion terms for new queries. Compared...

10.1109/tkde.2003.1209002 article EN IEEE Transactions on Knowledge and Data Engineering 2003-07-01

Although it has been studied for several years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we present AnnoSearch, a novel way to annotate images using search and data mining technologies. Leveraging Web-scale images, we solve this problem in two steps: 1) searching for semantically and visually similar images on the Web, and 2) mining annotations from them. Firstly, at least one accurate keyword is required to enable text-based search for a set of semantically similar images. Then content-based search is performed...

10.1109/cvpr.2006.58 article EN 2006-07-10

Previous work shows that a web page can be partitioned into multiple segments or blocks, and often the importance of those blocks in a page is not equivalent. Also, it has been proven that differentiating noisy or unimportant blocks from pages can facilitate web mining, search and accessibility. However, no uniform approach and model has been presented to measure the importance of different portions of web pages. Through a user study, we found that people do have a consistent view about the importance of blocks in web pages. In this paper, we investigate how to find a model to automatically assign importance values to blocks in a web page. We define the block...

10.1145/988672.988700 article EN 2004-05-17