- Complex Network Analysis Techniques
- Web Data Mining and Analysis
- Advanced Database Systems and Queries
- Data Management and Algorithms
- Topic Modeling
- Data Mining Algorithms and Applications
- Semantic Web and Ontologies
- Recommender Systems and Techniques
- Spam and Phishing Detection
- Advanced Graph Neural Networks
- Opinion Dynamics and Social Influence
- Text and Document Classification Technologies
- Human Mobility and Location-Based Analysis
- Advanced Text Analysis Techniques
- Natural Language Processing Techniques
- Geographic Information Systems Studies
- Service-Oriented Architecture and Web Services
- Caching and Content Delivery
- Data Quality and Management
- Social Media and Politics
- Sentiment Analysis and Opinion Mining
- Access Control and Trust
- Digital Marketing and Social Media
- Open Education and E-Learning
- Software Engineering Research
Singapore Management University
2016-2025
Osaka University
2022
Nanjing University
2022
Southeast University
2022
Southwest Jiaotong University
2022
Monash University
2014-2022
University of Minnesota
1993-2022
Arizona State University
2022
Korea Advanced Institute of Science and Technology
2022
Maebashi Institute of Technology
2022
This paper focuses on the problem of identifying influential users of micro-blogging services. Twitter, one of the most notable micro-blogging services, employs a social-networking model called "following", in which each user can choose who she wants to "follow" to receive tweets from without requiring the latter to give permission first. In a dataset prepared for this study, it is observed that (1) 72.4% of the users in Twitter follow more than 80% of their followers, and (2) 80.5% of the users have 80% of the users they are following follow them back. Our study reveals that the presence...
This paper aims to detect users generating spam reviews or review spammers. We identify several characteristic behaviors of review spammers and model these behaviors so as to detect the spammers. In particular, we seek to model the following behaviors. First, spammers may target specific products or product groups in order to maximize their impact. Second, they tend to deviate from the other reviewers in their ratings of products. We propose scoring methods to measure the degree of spam for each reviewer and apply them on an Amazon review dataset. We then select a subset of highly suspicious reviewers for further scrutiny...
In recent years, opinion mining has attracted a great deal of research attention. However, limited work has been done on detecting opinion spam (or fake reviews). The problem is analogous to spam in Web search [1, 9 11]. However, review spam is harder to detect because it is very hard, if not impossible, to recognize fake reviews by manually reading them [2]. This paper deals with a restricted problem, i.e., identifying unusual review patterns which can represent suspicious behaviors of reviewers. We formulate the problem as finding unexpected rules....
Twitter has become one of the largest microblogging platforms for users around the world to share anything happening around them with friends and beyond. A bursty topic in Twitter is one that triggers a surge of relevant tweets within a short period of time, which often reflects important events of mass interest. How to leverage Twitter for early detection of bursty topics has therefore become an important research problem with immense practical value. Despite the wealth of research work on topic modelling and analysis in Twitter, it remains a challenge to detect bursty topics in real-time. As existing methods can hardly...
Lei Wang, Wanyu Xu, Yihuai Lan, Zhiqiang Hu, Yunshi Lan, Roy Ka-Wei Lee, Ee-Peng Lim. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.
The success of large language models (LLMs), like GPT-4 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives that are created by finetuning open-access LLMs with task-specific data (e.g., ChatDoctor) or instruction data (e.g., Alpaca). Among the various fine-tuning methods, adapter-based parameter-efficient fine-tuning (PEFT) is undoubtedly one of the most attractive topics, as it only requires fine-tuning a few external parameters instead of the entire LLMs while achieving comparable or even better performance. To...
Advances in wireless technology increase the number of mobile device users and give pace to the rapid development of e-commerce using these devices. The new type of e-commerce, conducting transactions via mobile terminals, is called mobile commerce. Due to its inherent characteristics such as ubiquity, personalization, flexibility, and dissemination, mobile commerce promises businesses unprecedented market potential, great productivity, and high profitability. This paper presents an overview of mobile commerce by examining the enabling technologies,...
Hierarchical classification refers to the assignment of one or more suitable categories from a hierarchical category space to a document. While previous work in hierarchical classification focused on virtual category trees where documents are assigned only to the leaf categories, we propose a top-down level-based classification method that can classify documents to both leaf and internal categories. As the standard performance measures assume independence between categories, they have not considered the documents incorrectly classified into categories that are similar or not far from the correct ones in the category tree. We therefore propose the category-similarity...
Wikipedia has grown to be the world's largest and busiest free encyclopedia, in which articles are collaboratively written and maintained by volunteers online. Despite its success as a means of knowledge sharing and collaboration, the public has never stopped criticizing the quality of Wikipedia articles edited by non-experts and inexperienced contributors. In this paper, we investigate the problem of assessing the quality of articles in the collaborative authoring of Wikipedia. We propose three article quality measurement models that make use of the interaction data between articles and their contributors...
In web classification, web pages from one or more web sites are assigned to pre-defined categories according to their content. Since web pages are more than just plain text documents, web classification methods have to consider using other context features of web pages, such as hyperlinks and HTML tags. In this paper, we propose the use of Support Vector Machine (SVM) classifiers to classify web pages using both their text and context feature sets. We experimented with our SVM classification method on the WebKB data set. Compared with the earlier Foil-Pilfs method on the same data set, our method has been shown to perform very well. We have also shown that...
We consider the problem of analyzing word trajectories in both time and frequency domains, with the specific goal of identifying important and less-reported, periodic and aperiodic words. A set of words with identical trends can be grouped together to reconstruct an event in a completely un-supervised manner. The document frequency of each word across time is treated like a time series, where each element is the document frequency - inverse document frequency (DFIDF) score at one time point. In this paper, we 1) first applied spectral analysis to categorize features for different event characteristics:...
Business intelligence and analytics (BIA) is about the development of technologies, systems, practices, and applications to analyze critical business data so as to gain new insights about business and markets. The new insights can be used for improving products and services, achieving better operational efficiency, and fostering customer relationships. In this article, we will categorize BIA research activities into three broad research directions: (a) big data analytics, (b) text analytics, and (c) network analytics. The article aims to review the state-of-the-art techniques...
This paper studies the dynamic web service selection problem in a failure-prone environment, which aims to determine a subset of Web services to be invoked at run-time so as to successfully orchestrate a composite web service. We observe that both the composite and constituent web services often constrain the sequences of invoking their operations and therefore propose to use a finite state machine to model the permitted invocation sequences of Web service operations. We assign each state of execution an aggregated reliability to measure the probability that the given state will lead to successful execution in the context where each web service may...
Trust between a pair of users is an important piece of information for users in an online community (such as electronic commerce websites and product review websites) where users may rely on trust information to make decisions. In this paper, we address the problem of predicting whether a user trusts another user. Most prior work infers unknown trust ratings from known trust ratings. The effectiveness of this approach depends on the connectivity of the known web of trust and can be quite poor when the connectivity is very sparse, which is often the case in an online community. We therefore propose a classification...
User identity linkage across social platforms is an important problem of great research challenge and practical value. In real applications, the task often assumes an extra degree of difficulty by requiring linkage across multiple platforms. While pair-wise user linkage between two platforms, which has been the focus of most existing solutions, provides reasonably convincing linkage, the result depends by nature on the order of platform pairs in execution with no theoretical guarantee on its stability. In this paper, we explore a new concept...
This study focuses on the uses of Twitter during the elections, examining whether the messages posted online are reflective of the climate of public opinion. Using Twitter data obtained during the official campaign period of the 2011 Singapore General Election, we test the predictive power of tweets in forecasting the election results. In line with some previous studies, we find that during the elections the Twitter sphere represents a rich source of data for gauging public opinion and that the frequency of tweets mentioning names of political parties, political candidates and contested constituencies could be used to...
Web query recommendation has long been considered a key feature of search engines. Building a good Web query recommendation system, however, is very difficult due to the fundamental challenge of predicting users' search intent, especially given the limited user context information. In this paper, we propose a novel "sequential query prediction" approach that tries to grasp a user's search intent based on his/her past query sequence and its resemblance to historical query sequence models mined from massive search engine logs. Different query sequence models were examined, including the naive variable length...
Food computing is playing an increasingly important role in human daily life, and has found tremendous applications in guiding human behavior towards smart food consumption and a healthy lifestyle. An important task under the food-computing umbrella is retrieval, which is particularly helpful for health related applications, where we are interested in retrieving important information about food (e.g., ingredients, nutrition, etc.). In this paper, we investigate an open research task of cross-modal retrieval between cooking recipes and food images, and propose a...
Online social networks have provided the infrastructure for a number of emerging applications in recent years, e.g., for the recommendation of service providers or the recommendation of files as services. In these applications, trust is one of the most important factors in decision making by a service consumer, requiring the evaluation of the trustworthiness of a service provider along the social trust paths from a service consumer to the service provider. However, there are usually many social trust paths between two participants who are unknown to one another. In addition, some social information, such as social relationships and roles...