Kian‐Lee Tan

ORCID: 0000-0001-9315-4057
Research Areas
  • Data Management and Algorithms
  • Advanced Database Systems and Queries
  • Peer-to-Peer Network Technologies
  • Caching and Content Delivery
  • Advanced Data Storage Technologies
  • Data Mining Algorithms and Applications
  • Distributed systems and fault tolerance
  • Advanced Image and Video Retrieval Techniques
  • Cloud Computing and Resource Management
  • Algorithms and Data Compression
  • Graph Theory and Algorithms
  • Human Mobility and Location-Based Analysis
  • Distributed and Parallel Computing Systems
  • Cryptography and Data Security
  • Privacy-Preserving Technologies in Data
  • Image Retrieval and Classification Techniques
  • Mobile Agent-Based Network Management
  • Geographic Information Systems Studies
  • Complex Network Analysis Techniques
  • Optimization and Search Problems
  • Web Data Mining and Analysis
  • Recommender Systems and Techniques
  • Semantic Web and Ontologies
  • Internet Traffic Analysis and Secure E-voting
  • Data Stream Mining Techniques

University of Manchester
2023-2025

National University of Singapore
2015-2024

Singapore Management University
2018

Duke-NUS Medical School
2016

Universiti Tunku Abdul Rahman
2016

UNSW Sydney
2012

University of California, Santa Barbara
2011

Singapore-MIT Alliance for Research and Technology
2005-2011

Singapore General Hospital
1987-2009

University of Michigan
2006-2007

Mobile devices equipped with positioning capabilities (e.g., GPS) can ask location-dependent queries to Location Based Services (LBS). To protect privacy, the user location must not be disclosed. Existing solutions utilize a trusted anonymizer between the users and the LBS. This approach has several drawbacks: (i) All users must trust the third party anonymizer, which is a single point of attack. (ii) A large number of cooperating, trustworthy users is needed. (iii) Privacy is guaranteed only for snapshot locations; users are not protected...

10.1145/1376616.1376631 article EN 2008-06-09

Blockchain technologies are taking the world by storm. Public blockchains, such as Bitcoin and Ethereum, enable secure peer-to-peer applications like crypto-currency or smart contracts. Their security and performance are well studied. This paper concerns recent private blockchain systems designed with stronger security (trust) assumptions and performance requirements. These systems target and aim to disrupt applications which have so far been implemented on top of database systems, for example banking, finance, and trading applications. Multiple platforms...

10.1145/3035918.3064033 article EN 2017-05-09

In this article, we present an efficient B+-tree based indexing method, called iDistance, for K-nearest neighbor (KNN) search in a high-dimensional metric space. iDistance partitions the data based on a space- or data-partitioning strategy, and selects a reference point for each partition. The data points in each partition are transformed into a single dimensional value based on their similarity with respect to the reference point. This allows the points to be indexed using a B+-tree structure and KNN search to be performed using one-dimensional range search. The choice of partition and reference points adapts the index...

10.1145/1071610.1071612 article EN ACM Transactions on Database Systems 2005-06-01
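
The mapping described in the iDistance abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each point is assigned to its nearest reference point, the one-dimensional key is taken to be `i * c + dist(p, O_i)` with a constant `c` larger than any distance in the data, a sorted Python list stands in for the B+-tree, and a fixed-radius range query is shown instead of a full KNN search. The names `build_index` and `range_search` are hypothetical.

```python
import bisect
import math


def build_index(points, refs, c):
    """Map every point to a 1-D key and return the entries sorted by key.

    Assumption: a point belongs to the partition of its nearest reference
    point, and c exceeds every point-to-reference distance so that the key
    ranges of different partitions never overlap.
    """
    entries = []
    for p in points:
        i = min(range(len(refs)), key=lambda j: math.dist(p, refs[j]))
        entries.append((i * c + math.dist(p, refs[i]), p))
    entries.sort()  # sorted list used here as a stand-in for a B+-tree
    return entries


def range_search(entries, refs, c, q, r):
    """Return all indexed points within distance r of query point q."""
    keys = [k for k, _ in entries]
    result = []
    for i, ref in enumerate(refs):
        d = math.dist(q, ref)
        # Triangle inequality: a point p with dist(p, q) <= r must satisfy
        # d - r <= dist(p, ref) <= d + r, which is a 1-D key range.
        lo = i * c + max(0.0, d - r)
        hi = i * c + d + r
        start = bisect.bisect_left(keys, lo)
        end = bisect.bisect_right(keys, hi)
        # Never read past the end of partition i's key range.
        end = min(end, bisect.bisect_left(keys, (i + 1) * c))
        for _, p in entries[start:end]:
            if math.dist(p, q) <= r:  # discard false positives
                result.append(p)
    return result


points = [(0.0, 0.0), (1.0, 1.0), (9.0, 9.0), (10.0, 10.0), (5.0, 5.0)]
refs = [(0.0, 0.0), (10.0, 10.0)]
index = build_index(points, refs, c=100.0)
print(range_search(index, refs, 100.0, q=(1.0, 0.0), r=1.5))
```

A KNN search could be layered on top of this sketch by repeating the range query with a growing radius until k points are found, which is in the spirit of the one-dimensional range search the abstract mentions.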

Influence Maximization (IM), which selects a set of k users (called seed set) from a social network to maximize the expected number of influenced users (called influence spread), is a key algorithmic problem in social influence analysis. Due to its immense application potential and enormous technical challenges, IM has been extensively studied in the past decade. In this paper, we survey and synthesize a wide spectrum of existing studies on IM from an algorithmic perspective, with a special focus on the following aspects: (1) a review of well-accepted diffusion models that capture...

10.1109/tkde.2018.2807843 article EN IEEE Transactions on Knowledge and Data Engineering 2018-02-22

Given a d-dimensional data set, a point p dominates another point q if it is better than or equal to q in all dimensions and better than q in at least one dimension. A point is a skyline point if there does not exist any point that can dominate it. Skyline queries, which return the skyline points, are useful in many decision making applications. Unfortunately, as the number of dimensions increases, the chance of one point dominating another point is very low. As such, the skyline points become too numerous to offer interesting insights. To find more important and meaningful skyline points in high dimensional space, we propose a new concept,...

10.1145/1142473.1142530 article EN 2006-06-27
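
The dominance test and skyline definition in the abstract above can be illustrated with a brute-force sketch. It assumes that smaller values are better in every dimension; the function names `dominates` and `skyline` are illustrative, and the quadratic scan is only meant to make the definition concrete, not to reproduce the paper's method.

```python
def dominates(p, q):
    """True if p is no worse than q in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points):
    """Return the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


points = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
print(skyline(points))  # [(1, 5), (2, 2), (5, 1)]
```

Here (3, 3) and (4, 4) are dropped because (2, 2) dominates both, while the remaining three points each win in at least one dimension against every other point.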

Growing main memory capacity has fueled the development of in-memory big data management and processing. By eliminating the disk I/O bottleneck, it is now possible to support interactive data analytics. However, in-memory systems are much more sensitive to other sources of overhead that do not matter in traditional I/O-bounded disk-based systems. Some issues such as fault-tolerance and consistency are also more challenging to handle in an in-memory environment. We are witnessing a revolution in the design of database systems that exploit main memory as their data storage layer. Many of these...

10.1109/tkde.2015.2427795 article EN IEEE Transactions on Knowledge and Data Engineering 2015-04-29

Influence maximization, whose objective is to select k users (called seeds) from a social network such that the number of users influenced by the seeds (called influence spread) is maximized, has attracted significant attention due to its widespread applications, such as viral marketing and rumor control. However, in real-world social networks, users have their own interests (which can be represented as topics) and are more likely to be influenced by their friends (or friends' friends) with similar topics. We can increase the influence spread by taking topics into consideration. To address...

10.14778/2735703.2735706 article EN Proceedings of the VLDB Endowment 2015-02-01

Blockchain technologies are taking the world by storm. Public blockchains, such as Bitcoin and Ethereum, enable secure peer-to-peer applications like crypto-currency or smart contracts. Their security and performance are well studied. This paper concerns recent private blockchain systems designed with stronger security (trust) assumptions and performance requirements. These systems target and aim to disrupt applications which have so far been implemented on top of database systems, for example banking and finance applications. Multiple platforms...

10.48550/arxiv.1703.04057 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Crowdsourcing is widely accepted as a means for resolving tasks that machines are not good at. Unfortunately, crowdsourcing may yield relatively low-quality results if there is no proper quality control. Although previous studies attempt to eliminate "bad" workers by using qualification tests, the accuracies estimated from qualifications may not be accurate, because workers have diverse accuracies across tasks. Thus, the quality of the results could be further improved by selectively assigning tasks to the workers who are well acquainted with the tasks. To this end, we propose an adaptive...

10.1145/2723372.2750550 article EN 2015-05-27

We present the design and evaluation of PeerDB, a peer-to-peer (P2P) distributed data sharing system. PeerDB distinguishes itself from existing P2P systems in several ways. First, it is a full-fledged data management system that supports fine-grain content-based searching. Second, it facilitates sharing of data without a shared schema. Third, it combines the power of mobile agents into P2P systems to perform operations at peers' sites. Fourth, the PeerDB network is self-configurable, i.e., a node can dynamically optimize the set of peers that it can communicate directly with...

10.1109/icde.2003.1260827 article EN 2004-05-06

In data publishing, the owner delegates the role of satisfying user queries to a third-party publisher. As the publisher may be untrusted or susceptible to attacks, it could produce incorrect query results. In this paper, we introduce a scheme for users to verify that their query results are complete (i.e., no qualifying tuples are omitted) and authentic (i.e., all the result values originated from the owner). The scheme supports range selection on key and non-key attributes, project as well as join queries on relational databases. Moreover, the proposed scheme complies...

10.1145/1066157.1066204 article EN 2005-06-14

Edge computing pushes application logic and the underlying data to the edge of the network, with the aim of improving availability and scalability. As the edge servers are not necessarily secure, there must be provisions for validating their outputs. This paper proposes a mechanism that creates a verification object (VO) for checking the integrity of each query result produced by an edge server: that values in the result tuples are not tampered with, and that no spurious tuples are introduced. The primary advantages of our proposed mechanism are that the VO is independent of the database size, and that relational...

10.1109/icde.2004.1320027 article EN 2004-09-28

Although influence maximization, which selects a set of users in a social network to maximize the expected number of users influenced by the selected users (called influence spread), has been extensively studied, existing works neglected the fact that location information can play an important role in influence maximization. Many real-world applications such as location-aware word-of-mouth marketing have a location-aware requirement. In this paper we study the location-aware influence maximization problem. One big challenge is to develop an efficient scheme that offers wide influence spread. To address this challenge, we propose two greedy algorithms...

10.1145/2588555.2588561 article EN 2014-06-18

Advertising in social networks has become a multi-billion-dollar industry. A main challenge is to identify key influencers who can effectively contribute to the dissemination of information. Although the influence maximization problem, which finds a seed set of k most influential users based on certain propagation models, has been well studied, it is not target-aware and cannot be directly applied to online advertising. In this paper, we propose a new problem, named Keyword-Based Targeted Influence Maximization (KB-TIM),...

10.14778/2794367.2794376 article EN Proceedings of the VLDB Endowment 2015-06-01

Information networks, such as social media and email networks, often contain sensitive information. Releasing such network data could seriously jeopardize individual privacy. Therefore, we need to sanitize network data before the release. In this paper, we present a novel data sanitization solution that infers a network's structure in a differentially private manner. We observe that, by estimating the connection probabilities between vertices instead of considering the observed edges directly, the noise scale enforced by differential privacy can...

10.1145/2623330.2623642 article EN 2014-08-22

In recent decades, we have witnessed the rapidly growing popularity of location-based systems. Three types of location-based queries on road networks, namely the single-pair shortest path query, the k nearest neighbor (kNN) query, and the keyword-based kNN query, are widely used in location-based systems. Inspired by the R-tree, we propose a height-balanced and scalable index, namely G-tree, to efficiently support these queries. The space complexity of G-tree is O(|V|log|V|), where |V| is the number of vertices in the road network. Unlike previous works that support these queries separately, G-tree supports all these queries within one...

10.1109/tkde.2015.2399306 article EN IEEE Transactions on Knowledge and Data Engineering 2015-02-03

Massive amounts of data that are geo-tagged and associated with text information are being generated at an unprecedented scale. These geo-textual data cover a wide range of topics. Users are interested in receiving up-to-date tweets such that their locations are close to a user specified location and their texts are interesting to the users. For example, a user may want to be updated with tweets near her home on the topic "food poisoning vomiting." We consider the Temporal Spatial-Keyword Top-k Subscription (TaSK) query. Given a TaSK query, we continuously maintain...

10.1109/icde.2015.7113289 article EN 2015-04-01

Deep learning has recently become very popular on account of its incredible success in many complex data-driven applications, including image classification and speech recognition. The database community has worked on data-driven applications for many years, and therefore should be playing a lead role in supporting this new wave. However, databases and deep learning are different in terms of both techniques and applications. In this paper, we discuss research problems at the intersection of the two fields. In particular, we discuss possible improvements...

10.1145/3003665.3003669 article EN ACM SIGMOD Record 2016-09-28

In this big data era, huge amounts of spatial documents have been generated every day through various location based services. Top-k spatial keyword search is an important approach to exploring useful information from a spatial database. It retrieves k documents based on a ranking function that takes into account both textual relevance (similarity between the query and document keywords) and spatial relevance (distance between the query and document locations). Various hybrid indexes have been proposed in recent years which mainly combine the R-tree and the inverted index so that pruning can be executed...

10.1145/2452376.2452419 article EN 2013-03-18

Deep learning has shown outstanding performance in various machine learning tasks. However, the deep complex model structure and massive training data make it expensive to train. In this paper, we present a distributed deep learning system, called SINGA, for training big models over large datasets. An intuitive programming model based on the layer abstraction is provided, which supports a variety of popular deep learning models. The SINGA architecture supports both synchronous and asynchronous training frameworks. Hybrid training frameworks can also be customized to achieve good...

10.1145/2733373.2807410 article EN 2015-10-13

We introduce ChronoStream, a distributed system specifically designed for elastic stateful stream computation in the cloud. ChronoStream treats internal state as a first-class citizen and aims at providing flexible elastic support in both vertical and horizontal dimensions to cope with workload fluctuation and dynamic resource reclamation. With a clear separation between application-level computation parallelism and OS-level execution concurrency, ChronoStream enables transparent dynamic scaling and failure recovery by eliminating any network I/O...

10.1109/icde.2015.7113328 article EN 2015-04-01