- Data Management and Algorithms
- Data Mining Algorithms and Applications
- Caching and Content Delivery
- Complex Network Analysis Techniques
- Distributed systems and fault tolerance
- Peer-to-Peer Network Technologies
- Advanced Database Systems and Queries
- Rough Sets and Fuzzy Logic
- Advanced Data Storage Technologies
- Optimization and Search Problems
- Advanced Clustering Algorithms Research
- Advanced Image and Video Retrieval Techniques
- Algorithms and Data Compression
- Opportunistic and Delay-Tolerant Networks
- Recommender Systems and Techniques
- Distributed and Parallel Computing Systems
- Time Series Analysis and Forecasting
- Parallel Computing and Optimization Techniques
- Privacy-Preserving Technologies in Data
- Wireless Networks and Protocols
- Cooperative Communication and Network Coding
- Cryptography and Data Security
- Interconnection Networks and Systems
- Image Retrieval and Classification Techniques
- Web Data Mining and Analysis
National Taiwan University
2016-2025
Research Center for Information Technology Innovation, Academia Sinica
2012-2023
Adrian College
2023
Directorate of Medicinal and Aromatic Plants Research
2023
Academia Sinica
2009-2022
Ruhr University Bochum
2021
Institute of Information Science, Academia Sinica
2004-2021
Weatherford College
2021
National Cheng Kung University
2006-2015
Center for Information Technology
2011-2015
Mining information and knowledge from large databases has been recognized by many researchers as a key research topic in database systems and machine learning, and by many industrial companies as an important area with an opportunity of major revenues. Researchers in many different fields have shown great interest in data mining. Several emerging applications in information-providing services, such as data warehousing and online services over the Internet, also call for various data mining techniques to better understand user behavior, improve...
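In this paper, we examine the issue of mining association rules among items in a large database of sales transactions. The mining of association rules can be mapped into the problem of discovering large itemsets, where a large itemset is a group of items which appear in a sufficient number of transactions. The problem of discovering large itemsets can be solved by constructing a candidate set of itemsets first and then identifying, within this candidate set, those itemsets that meet the large itemset requirement. Generally this is done iteratively for each large k-itemset in increasing order of k, where a large k-itemset is a large itemset with k items. To determine large itemsets from a huge number of candidate itemsets in early iterations is usually the dominating factor for the overall data...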
In this paper, we explore a new data mining capability that involves mining path traversal patterns in a distributed information-providing environment where documents or objects are linked together to facilitate interactive access. Our solution procedure consists of two steps. First, we derive an algorithm to convert the original sequence of log data into a set of maximal forward references. By doing so, we can filter out the effect of some backward references, which are mainly made for ease of traveling, and concentrate on mining meaningful user...
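In this paper, we examine the issue of mining association rules among items in a large database of sales transactions. The mining of association rules can be mapped into the problem of discovering large itemsets, where a large itemset is a group of items which appear in a sufficient number of transactions. The problem of discovering large itemsets can be solved by constructing a candidate set of itemsets first and then identifying, within this candidate set, those itemsets that meet the large itemset requirement. Generally this is done iteratively for each large k-itemset in increasing order of k, where a large k-itemset is a large itemset with k items. To determine large itemsets from a huge number of candidate itemsets in early iterations is usually the dominating factor for the overall data mining performance....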
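We examine the issue of mining association rules among items in a large database of sales transactions. Mining association rules means that, given a database of sales transactions, to discover all associations among items such that the presence of some items in a transaction will imply the presence of other items in the same transaction. The mining of association rules can be mapped into the problem of discovering large itemsets, where a large itemset is a group of items that appear in a sufficient number of transactions. The problem of discovering large itemsets can be solved by constructing a candidate set of itemsets first, and then identifying, within this candidate set, those itemsets that meet the large itemset requirement. Generally, this is done iteratively for each large k-itemset...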
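The iterative candidate-then-verify procedure described in these abstracts can be sketched as follows. This is a minimal, plain level-wise sketch under stated assumptions, not the authors' hash-based method: the function name `large_itemsets`, the `min_support` parameter (an absolute transaction count), and the sample transactions are all hypothetical.

```python
from itertools import combinations


def large_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions.

    Assumption: min_support is an absolute number of transactions.
    """
    transactions = [frozenset(t) for t in transactions]

    # Iteration 1: count single items and keep the large 1-itemsets.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Candidate k-itemsets: unions of two large (k-1)-itemsets whose
        # (k-1)-subsets are all large.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in current for s in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Scan the database to count each candidate, then keep the large ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result


if __name__ == "__main__":
    db = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
    for itemset, count in sorted(
        large_itemsets(db, 2).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
    ):
        print(sorted(itemset), count)
```

The candidate set is largest in the early iterations (here, pairs of items), which is the cost the abstracts identify as the dominating factor.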
In this paper, we explore a new data mining capability which involves mining path traversal patterns in a distributed information providing environment like the world-wide-web. First, we convert the original sequence of log data into a set of maximal forward references and filter out the effect of some backward references which are mainly made for ease of traveling. Second, we derive algorithms to determine the frequent traversal patterns, i.e., large reference sequences, from the maximal forward references obtained. Two algorithms are devised for determining large reference sequences: one is based on hashing and pruning...
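The first step, converting a traversal log into maximal forward references, can be sketched as below. This is an illustrative reading of the description above, assuming that revisiting a page already on the current path counts as a backward reference; the function name and the sample click sequence are hypothetical.

```python
def maximal_forward_references(path):
    """Split one user's page sequence into maximal forward references.

    Assumption: visiting a page that is already on the current path is a
    backward reference, which ends the current forward run.
    """
    current = []            # pages on the current forward path
    moving_forward = False  # True if the previous move was a forward move
    result = []
    for page in path:
        if page in current:
            # Backward move: emit the path only if we were moving forward.
            if moving_forward:
                result.append(list(current))
            # Back up to the revisited page.
            del current[current.index(page) + 1:]
            moving_forward = False
        else:
            current.append(page)
            moving_forward = True
    if moving_forward:
        result.append(list(current))
    return result


if __name__ == "__main__":
    clicks = list("ABCDCBEGHGWAOUOV")
    print(["".join(r) for r in maximal_forward_references(clicks)])
    # prints ['ABCD', 'ABEGH', 'ABEGW', 'AOU', 'AOV']
```

Consecutive backward moves (such as D to C to B in the sample) emit nothing, which is how the effect of backward references is filtered out.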
A dynamic time-wavelength division multiaccess protocol (DT-WDMA) is proposed for metropolitan-sized multichannel optical networks employing fixed wavelength transmitters and tunable receivers. Control information is sent over a dedicated signaling channel and data are sent over channels owned by the transmitters. Time is divided into slots on each channel, and the slots on the control channel are further split into mini-slots. Fixed time-division multiaccess (TDM) is used within each slot on the control channel. Transmitters indicate their intention to transmit a packet by transmitting...
The processor allocation problem in an n-dimensional hypercube (or n-cube) multiprocessor is similar to the conventional memory allocation problem. The main objective in both problems is to maximize the utilization of available resources as well as to minimize the inherent system fragmentation. A processor allocation strategy using the buddy system, called the buddy strategy, is discussed first, and then a new allocation strategy using a Gray code (GC), called the GC strategy, is proposed. When processor relinquishment is not considered (i.e., static allocation), these strategies are proved to be optimal in the sense that each incoming...
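As background for the Gray code (GC) strategy mentioned above, the sketch below generates the binary-reflected Gray code and checks the property that makes it useful in a hypercube: consecutive codes differ in exactly one bit, so they name adjacent nodes. It illustrates only the ordering, not the allocation strategy itself; the function names are hypothetical.

```python
def gray_code(i):
    """Return the i-th binary-reflected Gray code."""
    return i ^ (i >> 1)


def gray_sequence(n):
    """Return all 2**n node addresses of an n-cube in Gray code order."""
    return [gray_code(i) for i in range(2 ** n)]


def differ_in_one_bit(a, b):
    """True if a and b are neighbors in a hypercube (one differing bit)."""
    return bin(a ^ b).count("1") == 1


if __name__ == "__main__":
    seq = gray_sequence(3)
    print([format(g, "03b") for g in seq])
    # prints ['000', '001', '011', '010', '110', '111', '101', '100']
    print(all(differ_in_one_bit(a, b) for a, b in zip(seq, seq[1:])))
    # prints True
```

Any aligned block of 2**k consecutive positions in this order shares its upper n-k bits, so it forms a k-dimensional subcube.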
A family of six-regular graphs, called hexagonal meshes or H-meshes, is considered as a multiprocessor interconnection network. Processing nodes on the periphery of an H-mesh are first wrapped around to achieve regularity and homogeneity. The diameter of an H-mesh is shown to be O(p^(1/2)), where p is the number of nodes in the H-mesh. An elegant, distributed routing scheme is developed for H-meshes so that each node can compute shortest paths from itself to any other node with a straightforward algorithm of O(1) using addresses...
Efficient parallel data mining for association rules. Authors: Jong Soo Park (IBM Thomas J. Watson Research Center, Yorktown Heights, New York), Ming-Syan Chen, Philip S. Yu. CIKM '95: Proceedings of the Fourth International Conference on Information and Knowledge Management, December 1995, pages 31–36. Published 02 December 1995. https://doi.org/10.1145/221270.221320
The support vector machine (SVM) is a widely used tool in classification problems. The SVM trains a classifier by solving an optimization problem to decide which instances of the training data set are support vectors, which are the necessarily informative instances to form the SVM classifier. Since support vectors are intact tuples taken from the training data set, releasing the SVM classifier for public use or shipping the SVM classifier to clients will disclose the private content of support vectors. This violates the privacy-preserving requirements for some legal or commercial reasons. The problem is that the classifier learned by the SVM inherently violates the privacy. This privacy...
Video event detection allows intelligent indexing of video content based on events. Traditional approaches extract features from video frames or shots, then quantize and pool the features to form a single vector representation for the entire video. Though simple and efficient, the final pooling step may lead to loss of temporally local information, which is important in indicating which part in a long video signifies the presence of the event. In this work, we propose a novel instance-based video event detection approach. We represent each video as multiple 'instances', defined...
Challenges faced in organizing impromptu activities are the requirements of making timely invitations in accordance with the locations of candidate attendees and the social relationship among them. It is desirable to find a group of attendees close to a rally point and ensure that the selected attendees have a good social relationship to create a good atmosphere in the activity. Therefore, this paper proposes Socio-Spatial Group Query (SSGQ) to select a group of nearby attendees with tight social relation. Efficient processing of SSGQ is very challenging due to the tradeoff in the spatial and social domains. We show that the problem is NP-hard via a proof...
We explore in this paper an effective sliding-window filtering (abbreviated as SWF) algorithm for incremental mining of association rules. In essence, by partitioning a transaction database into several partitions, algorithm SWF employs a filtering threshold in each partition to deal with the candidate itemset generation. Under SWF, the cumulative information of previous partitions is selectively carried over toward the generation of candidate itemsets for the subsequent partitions. Algorithm SWF not only significantly reduces I/O and CPU cost...
Using depth-first search, the authors develop and analyze the performance of a routing scheme for hypercube multicomputers in the presence of an arbitrary number of faulty components. They derive an exact expression for the probability of routing messages by way of optimal paths (of length equal to the Hamming distance between the corresponding pair of nodes) from the source node to an obstructed node. The obstructed node is defined as the first node encountered by the message that finds no optimal path to the destination node. It is noted that the probability of routing messages over an optimal path between any two nodes is a special case of the present results and can be obtained...
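To make the terms above concrete, the sketch below computes the Hamming distance between two hypercube nodes and runs a depth-first search for a route that avoids faulty nodes, trying neighbors closer to the destination first. It is a simplified illustration under stated assumptions, not the authors' scheme: it uses global knowledge of faults, models only faulty nodes (not links), and all names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of address bits in which nodes a and b differ."""
    return bin(a ^ b).count("1")


def dfs_route(src, dst, n, faulty=frozenset()):
    """Depth-first search for a path from src to dst in an n-cube.

    Assumptions: nodes are integers in [0, 2**n), `faulty` is a set of
    unusable nodes, and src itself is not faulty. Returns a list of nodes
    or None if dst is unreachable.
    """
    visited = {src}
    path = [src]

    def visit(node):
        if node == dst:
            return True
        # Neighbors differ from node in exactly one dimension; prefer those
        # that reduce the Hamming distance to the destination.
        neighbors = sorted(
            (node ^ (1 << d) for d in range(n)),
            key=lambda v: hamming_distance(v, dst),
        )
        for v in neighbors:
            if v in visited or v in faulty:
                continue
            visited.add(v)
            path.append(v)
            if visit(v):
                return True
            path.pop()  # dead end: backtrack
        return False

    return path if visit(src) else None


if __name__ == "__main__":
    print(dfs_route(0b000, 0b111, 3))
    print(dfs_route(0b000, 0b111, 3, faulty={0b001, 0b010}))
```

With no faults, the search never backtracks and the route length equals the Hamming distance, which is the optimal-path case the abstract refers to.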
Caching can reduce the bandwidth requirement in a mobile computing environment. However, due to battery power limitations, a wireless mobile computer may often be forced to operate in a doze (or even totally disconnected) mode. As a result, the mobile computer may miss some cache invalidation reports broadcast by a server, forcing it to discard the entire cache contents after waking up. In this paper, we present an energy-efficient cache invalidation method, called GCORE (Grouping with COld update-set REtention), that allows a mobile computer to operate in a disconnected mode to save the battery while still...
A connected hypercube with faulty links and/or nodes is called an injured hypercube. A distributed adaptive fault-tolerant routing scheme is proposed for an injured hypercube in which each node is required to know only the condition of its own links. Despite its simplicity, this scheme is shown to be capable of routing messages successfully in an injured n-dimensional hypercube as long as the number of faulty components is less than n. Moreover, it is proved that this scheme routes messages via shortest paths with a rather high probability, and the expected length of a resulting path is very close to that of a shortest path. Since the assumption that the number of faulty components is less than n might...
Web caching and Web prefetching are two important techniques used to reduce the noticeable response time perceived by users. Note that by integrating Web caching and Web prefetching, these two techniques can complement each other, since the Web caching technique exploits temporal locality, whereas the Web prefetching technique utilizes the spatial locality of Web objects. However, without circumspect design, the integration of these two techniques might cause significant performance degradation to each other. In view of this, we propose in this paper an innovative cache replacement algorithm, which not only considers the effect...
Due to the rich information in graph data, the technique for privacy protection in published social networks is still in its infancy, as compared to the protection in relational databases. In this paper we identify a new type of attack called a friendship attack. In a friendship attack, an adversary utilizes the degrees of two vertices connected by an edge to re-identify related victims in a published social network data set. To protect against such attacks, we introduce the concept of k²-degree anonymity, which limits the probability of a vertex being re-identified to 1/k. For anonymization...
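The degree-pair idea behind the friendship attack can be illustrated with a small checker. The definition used here is an assumption, since the abstract does not spell it out: a graph passes for a given k if every (own degree, neighbor degree) pair seen on some edge is shared by at least k distinct vertices. The function names and the sample graphs are hypothetical.

```python
from collections import defaultdict


def degree_pair_groups(edges):
    """Map each (own_degree, neighbor_degree) pair to the vertices that have it."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}

    groups = defaultdict(set)
    for u, nbrs in adjacency.items():
        for v in nbrs:
            groups[(degree[u], degree[v])].add(u)
    return groups


def is_degree_pair_anonymous(edges, k):
    """Assumed check: every observed degree pair is shared by >= k vertices."""
    return all(len(vs) >= k for vs in degree_pair_groups(edges).values())


if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]  # every vertex has degree 2
    star = [(0, 1), (0, 2), (0, 3)]           # the center is unique
    print(is_degree_pair_anonymous(cycle, 4))
    print(is_degree_pair_anonymous(star, 2))
```

In the star graph, an adversary who knows a victim has degree 3 and a friend of degree 1 can single out the center, which is why that graph fails for any k above 1.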