- Data Mining Algorithms and Applications
- Face and Expression Recognition
- Machine Learning and Data Classification
- Text and Document Classification Technologies
- Complex Network Analysis Techniques
- Rough Sets and Fuzzy Logic
- Advanced Clustering Algorithms Research
- Imbalanced Data Classification Techniques
- Data Management and Algorithms
- Network Security and Intrusion Detection
- Web Data Mining and Analysis
- Data Stream Mining Techniques
- Advanced Graph Neural Networks
- Software Reliability and Analysis Research
- Software System Performance and Reliability
- Opinion Dynamics and Social Influence
- Business Process Modeling and Analysis
- Algorithms and Data Compression
- Service-Oriented Architecture and Web Services
- Video Analysis and Summarization
- Gene Expression and Cancer Classification
- Advanced Image and Video Retrieval Techniques
- Grey System Theory Applications
- Evaluation and Optimization Models
- Fuzzy Logic and Control Systems
Xi'an Jiaotong University
2009-2018
Wuhan University
2016
State Key Laboratory of Software Engineering
2016
The University of Texas at Dallas
2013
Northwest University
2009
Xi'an University of Science and Technology
2007
École Centrale de Lyon
2003
Feature selection involves identifying a subset of the most useful features that produces results compatible with those of the original entire set of features. A feature selection algorithm may be evaluated from both the efficiency and effectiveness points of view. While efficiency concerns the time required to find a subset of features, effectiveness is related to the quality of that subset. Based on these criteria, a fast clustering-based feature selection algorithm (FAST) is proposed and experimentally evaluated in this paper. The FAST algorithm works in two steps. In the first step, features are divided into clusters by using graph-theoretic clustering...
The investigation of community structure in networks has aroused great interest in multiple disciplines. One of the challenges is to find local communities from a starting vertex in a network without global information about the entire network. Many existing methods tend to be accurate depending on a priori assumptions about network properties and predefined parameters. In this paper, we introduce a new quality function and present a fast expansion algorithm for uncovering local communities in large-scale networks. The proposed algorithm can detect multiresolution...
Clustering is one of the research hotspots in the field of data mining and has extensive applications in practice. Recently, Rodriguez and Laio [1] published a clustering algorithm in Science that identifies cluster centers in an intuitive way and clusters objects efficiently and effectively. However, the algorithm is sensitive to a preassigned parameter and suffers from the problem of identifying the "ideal" number of clusters. To overcome these shortcomings, this paper proposes a new clustering algorithm that can detect the cluster centers automatically via statistical testing. Specifically, the proposed algorithm first...
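The center-identification idea behind the Rodriguez and Laio algorithm mentioned above can be sketched roughly as follows. This is a minimal illustration of the baseline density-peaks quantities only, not the statistical-testing variant the abstract proposes. The function names `density_peaks` and `pick_centers`, the cutoff parameter `d_c`, and the use of the product of density and distance to rank candidates are assumptions made for the sketch; note that both `d_c` and the number of centers `k` are supplied by hand here, which is exactly the sensitivity the abstract describes.

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta) for each point.

    rho[i]   = number of other points closer than the cutoff d_c (local density).
    delta[i] = distance to the nearest point with strictly higher density;
               for points with no denser neighbour, the distance to the
               farthest point is used instead.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(denser) if denser else max(dist[i]))
    return rho, delta


def pick_centers(rho, delta, k):
    """Rank points by rho * delta and return the indices of the top k.

    k is a hand-picked parameter in this sketch (hypothetical helper).
    """
    order = sorted(range(len(rho)), key=lambda i: rho[i] * delta[i], reverse=True)
    return order[:k]
```

Points that combine a high local density with a large distance to any denser point stand out as candidate centers; the remaining points would then be assigned to the cluster of their nearest denser neighbour.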
Clustering is an important technique for mining the intrinsic community structures in networks. The density-based network clustering method is able not only to detect communities of arbitrary size and shape, but also to identify hubs and outliers. However, it requires manual parameter specification to define clusters, and it is sensitive to the density threshold, which is difficult to determine. Furthermore, many real-world networks exhibit a hierarchical structure with communities embedded within other communities. Therefore, the result...
Many feature subset selection (FSS) algorithms have been proposed, but not all of them are appropriate for a given problem. At the same time, so far there is rarely a good way to choose an appropriate FSS algorithm for the problem at hand. Thus, automatic FSS algorithm recommendation is very important and practically useful. In this paper, a meta-learning-based FSS algorithm recommendation method is presented. The proposed method first identifies the data sets that are most similar to the one at hand by the k-nearest neighbor classification algorithm, and the distances among these data sets are calculated based on...
Unsupervised feature selection is an important problem, especially for high‐dimensional data. However, until now, it has been scarcely studied and the existing algorithms cannot provide satisfying performance. Thus, in this paper, we propose a new unsupervised feature selection algorithm using similarity‐based feature clustering, Feature Selection‐based Feature Clustering (FSFC). FSFC removes redundant features according to the results of feature clustering based on feature similarity. First, it clusters the features according to their similarity. A new feature clustering algorithm is proposed, which overcomes...
Concept drift in data streams poses many challenges and difficulties in mining this non-traditional kind of database. In this paper, we focus on detecting concept drift in an evolving data stream. We propose a novel method to detect concept drift using entropy...
Concept drifts usually originate from many causes instead of only one, which results in two types of concept drifts: abrupt drifts and gradual drifts. From the point of view of speed, both pose strong challenges for data stream mining. In this paper, we propose a selective detector ensemble to detect both abrupt and gradual drifts. We first present our construction method, then introduce how to use the ensemble with the proposed early-find-early-report rule. To evaluate the performance, we compare it with four drift detection methods on eight publicly available data sets...
Intelligent data analysis techniques are useful for better exploring real-world data sets. However, real-world data sets are always accompanied by missing data, which is one major factor affecting data quality. At the same time, good intelligent data exploration requires quality data. Fortunately, Missing Data Imputation Techniques (MDITs) can be used to improve data quality. However, no single MDIT can be used in all conditions; each has its own context. In this paper, we introduce MDITs to the KDD and machine learning communities by presenting the basic idea and highlighting...
The problem of mobile sequential recommendation is to suggest a route connecting a set of pick-up points for a taxi driver so that he/she is more likely to get passengers with less travel cost. Essentially, a key challenge of this problem is its high computational complexity. In this paper, we propose a novel dynamic programming based method to solve the problem, consisting of two separate stages: an offline pre-processing stage and an online search stage. The offline stage pre-computes potential candidate sequences from a set of pick-up points. A backward incremental sequence...
Community detection is an important methodology for understanding the intrinsic structure and function of complex networks. Because overlapping community structure is one of the characteristics of real‐world networks and should be considered in community detection, in this article, we propose an algorithm, called link‐based label propagation algorithm (LinkLPA), to detect overlapping communities. Because link partition is conceptually natural for this problem, LinkLPA first transforms the node partition problem into a link partition problem and then employs a new label propagation algorithm with preference on links instead of nodes to detect communities due...
Community detection is an important methodology for understanding the intrinsic structure and function of a real-world network. In this paper, we propose an effective and efficient algorithm, called Dominant Label Propagation Algorithm (abbreviated as DLPA), to detect communities in complex networks. The algorithm simulates a special voting process to detect overlapping and non-overlapping community structure in complex networks simultaneously. Our algorithm is very efficient, since its computational complexity is almost linear in the number of edges...
Choosing an appropriate kernel is very important and critical when classifying a new problem with a Support Vector Machine. So far, more attention has been paid to constructing new kernels and choosing suitable parameter values for a specific kernel function, but less to kernel selection. Furthermore, most current kernel selection methods focus on seeking the best kernel with the highest classification accuracy via cross-validation; they are time consuming and ignore the differences among the number of support vectors and the CPU time of SVM with different kernels. Considering...
As more and more classification algorithms continue to be developed, recommending appropriate algorithms for a given classification problem is increasingly important. This article first distinguishes the algorithm recommendation methods by two dimensions: (1) meta-features, which are a set of measures used to characterize the learning problems, and (2) meta-target, which represents the relative performance of the classification algorithms on the learning problem. In contrast to the existing algorithm recommendation methods, whose meta-target is usually in the form of either a ranking of candidate algorithms or a single algorithm, this article proposes a new and natural...
k-Nearest Neighbor (k-NN) is one of the most widely used classification algorithms. When classifying a new instance, k-NN first finds its k nearest neighbors, and then classifies it by voting for the categories of the nearest neighbors. Therefore...
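The two steps described above (find the k nearest neighbors, then vote on their categories) can be sketched as below. This is a minimal illustration of plain k-NN with unweighted majority voting, not the refinement the abstract goes on to propose. The function name `knn_classify`, the Euclidean distance, and the `(features, label)` data layout are assumptions chosen for the sketch.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training instances.

    train: list of (features, label) pairs, where features is a numeric tuple.
    query: numeric tuple with the same length as the training features.
    """
    # Step 1: find the k training instances closest to the query (Euclidean).
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Step 2: each neighbor casts one vote for its own category.
    votes = Counter(label for _, label in neighbors)
    # On a tie, most_common keeps first-seen order, so the category of the
    # nearer neighbor wins.
    return votes.most_common(1)[0][0]
```

Every neighbor carries the same weight in this version, so the result can be sensitive to the choice of k and to uneven class sizes.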
We present a method of using local linear smoothing to construct simultaneous confidence bands for the mean function of densely spaced functional data. Our approach works well under mild conditions. In addition, the estimator and its accompanying confidence band enjoy semiparametric efficiency in the sense that they are asymptotically equivalent to the counterparts obtained from random trajectories entirely observed without errors. We illustrate the performance of the proposed method through a simulation study. Furthermore, an...