- Time Series Analysis and Forecasting
- Anomaly Detection Techniques and Applications
- Data Stream Mining Techniques
- Traffic Prediction and Management Techniques
- Data Management and Algorithms
- Automated Road and Building Extraction
- Data Visualization and Analytics
- Natural Language Processing Techniques
- Domain Adaptation and Few-Shot Learning
- Currency Recognition and Detection
- Multimodal Machine Learning Applications
- Topic Modeling
- Stock Market Forecasting Methods
- Network Security and Intrusion Detection
- Imbalanced Data Classification Techniques
- Machine Learning and Algorithms
- Forecasting Techniques and Applications
- Fault Detection and Control Systems
- Mobile Crowdsensing and Crowdsourcing
- Data Mining Algorithms and Applications
- Semantic Web and Ontologies
- Explainable Artificial Intelligence (XAI)
- Machine Learning and ELM
- Sparse and Compressive Sensing Techniques
- Transportation Planning and Optimization
Aalborg University
2018-2024
RMIT Vietnam
2023-2024
Vietnam National University Ho Chi Minh City
2017-2020
Ho Chi Minh City University of Science
2017
We propose two solutions to outlier detection in time series based on recurrent autoencoder ensembles. The exploit autoencoders built using sparsely-connected neural networks (S-RNNs). Such make it possible generate multiple with different network connection structures. are ensemble frameworks, specifically an independent framework and a shared framework, both of which combine S-RNN enable detection. This ensemble-based approach aims reduce the effects some being overfitted outliers, this...
Due to the continued digitization of industrial and societal processes, including deployment networked sensors, we are witnessing a rapid proliferation time-ordered observations, known as time series. For example, behavior drivers can be captured by GPS or accelerometer series speeds, directions, accelerations. We propose framework for outlier detection in that, used identifying dangerous driving hazardous road locations. Specifically, first method that generates statistical features enrich...
A variety of real-world applications rely on far future information to make decisions, thus calling for efficient and accurate long sequence multivariate time series forecasting. While recent attention-based forecasting models show strong abilities in capturing long-term dependencies, they still suffer from two key limitations. First, canonical self attention has a quadratic complexity w.r.t. the input length, falling short efficiency. Second, different variables’ often have distinct...
Traffic time series forecasting is challenging due to complex spatio-temporal dynamics-time from different locations often have distinct patterns; and for the same series, patterns may vary across time, where, example, there exist certain periods a day showing stronger temporal correlations. Although recent models, in particular deep learning based show promising results, they suf-fer being agnostic. Such agnostic models employ shared parameter space irrespective of assume that are similar...
Time series data occurs widely, and outlier detection is a fundamental problem in mining, which has numerous applications. Existing autoencoder-based approaches deliver state-of-the-art performance on challenging real-world but are vulnerable to outliers exhibit low explainability. To address these two limitations, we propose robust explainable unsupervised auto encoder frameworks that decompose an input time into clean using autoencoders. Improved explainability achieved because better...
Electric energy consumption forecasting is an interesting, challenging, and important issue in management equipment efficiency improvement. Existing approaches are predictive models that have the ability to predict for a specific profile, i.e., time series of whole building or individual household smart building. In practice, there many profiles each building, which leads time-consuming expensive system resources. Therefore, this study develops robust framework Multiple Energy Consumption...
Correlated time series forecasting plays an essential role in many cyber-physical systems, where entities interact with each other over time. To enable accurate forecasting, it is to capture both the temporal dynamics and correlations among different entities. former, two popular types of models, recurrent neural networks (RNNs) convolution (TCNs), are employed. latter, a graph constructed reflect certain relationships then (GC) applied upon The state-of-the-art accuracy achieved by models...
We propose variational quasi-recurrent autoencoders (VQRAEs) to enable robust and efficient anomaly detection in time series unsupervised settings. The proposed VQRAEs employs a judiciously designed objective function based on divergences, including a, ß, and, -divergence, making it possible separate anomalies from normal data without the reliance labels, thus achieving robustness fully training. To better capture temporal dependencies data, are built upon neural networks, which employ...
Due to the sweeping digitalization of processes, increasingly vast amounts time series data are being produced. Accurate classification such facilitates decision making in multiple domains. State-of-the-art accuracy is often achieved by ensemble learning where results synthesized from base models. This characteristic implies that needs substantial computing resources, preventing their use resource-limited environments, as edge devices. To extend applicability learning, we propose LightTS...
We consider a scenario that occurs often in the auto insurance industry. are given large collection of trajectories stem from many different drivers. Only small number labeled with driver identifiers, and only some drivers used labels. The problem is to label correctly unlabeled identifiers. This important detect possible fraud identify in, e.g., pay-as-you-drive settings when vehicle has been involved an incident. To solve problem, we first propose Trajectory-to-Image( T2I) encoding scheme...
With the sweeping digitalization of societal, medical, industrial, and scientific processes, sensing technologies are being deployed that produce increasing volumes time series data, thus fueling a plethora new or improved applications. In this setting, outlier detection is frequently important, while solutions based on neural networks exist, they leave room for improvement in terms both accuracy efficiency. objective achieving such improvements, we propose diversity-driven, convolutional...
Crowdsourcing aims to enable the assignment of available resources completion tasks at scale. The continued digitization societal processes translates into increased opportunities for crowdsourcing. For example, crowdsourcing enables computational humans, called workers, that are notoriously hard computers. In settings faced with malicious actors, detection such actors holds potential increase robustness platform. We propose a framework Outlier Detection Streaming Task Assignment improve by...
We are witnessing an increasing availability of streaming data that may contain valuable information on the underlying processes. It is thus attractive to be able deploy machine learning models, e.g., for classification, edge devices near sensors such decisions can made instantaneously, rather than first having transmit incoming servers. To enable deployment with limited storage and computational capabilities, full-precision parameters in standard models quantized use fewer bits. The...
A variety of real-world applications rely on far future information to make decisions, thus calling for efficient and accurate long sequence multivariate time series forecasting. While recent attention-based forecasting models show strong abilities in capturing long-term dependencies, they still suffer from two key limitations. First, canonical self attention has a quadratic complexity w.r.t. the input length, falling short efficiency. Second, different variables' often have distinct...
The availability of massive vehicle trajectory data enables the modeling road-network constrained movement as travel-cost distributions rather than just single-valued costs, thereby capturing inherent uncertainty and enabling improved routing quality. Thus, stochastic has been studied extensively in edge-centric model, where such costs are assigned to edges a graph representation road network. However, this model still disregards important information trajectories fails capture dependencies...
The continued digitization of societal processes translates into a proliferation time series data that cover applications such as fraud detection, intrusion and energy management, where anomaly detection is often essential to enable reliability safety. Many recent studies target for data. Indeed, area characterized by diverse data, methods, evaluation strategies, comparisons in existing consider only part this diversity, which makes it difficult select the best method particular problem...
Text-video retrieval, a prominent sub-field within the domain of multimodal information has witnessed remarkable growth in recent years. However, existing methods assume video scenes are consistent with unbiased descriptions. These limitations fail to align real-world scenarios since descriptions can be influenced by annotator biases, diverse writing styles, and varying textual perspectives. To overcome aforementioned problems, we introduce WAVER, cross-domain knowledge distillation...
The availability of massive vehicle trajectory data enables the modeling road-network constrained movement as travel-cost distributions rather than just single-valued costs, thereby capturing inherent uncertainty and enabling improved routing quality. Thus, stochastic has been studied extensively in edge-centric model, where such costs are assigned to edges a graph representation road network. However, this model still disregards important information trajectories fails capture dependencies...
Due to the global trend towards urbanization, people increasingly move and live in cities that then continue grow. Traffic forecasting plays an important role intelligent transportation systems of as well spatio-temporal data mining. State-of-the-art is achieved by deep-learning approaches due their ability contend with complex dynamics. However, existing methods assume input fixed-topology road networks static traffic time series. These assumptions fail align where series are collected...
We are witnessing an increasing availability of streaming data that may contain valuable information on the underlying processes. It is thus attractive to be able deploy machine learning models edge devices near sensors such decisions can made instantaneously, rather than first having transmit incoming servers. To enable deployment with limited storage and computational capabilities, full-precision parameters in standard quantized use fewer bits. The resulting then calibrated using...
Due to the global trend towards urbanization, people increasingly move and live in cities that then continue grow. Traffic forecasting plays an important role intelligent transportation systems of as well spatio-temporal data mining. State-of-the-art is achieved by deep-learning approaches due their ability contend with complex dynamics. However, existing methods assume input fixed-topology road networks static traffic time series. These assumptions fail align where series are collected...
Due to the sweeping digitalization of processes, increasingly vast amounts time series data are being produced. Accurate classification such facilitates decision making in multiple domains. State-of-the-art accuracy is often achieved by ensemble learning where results synthesized from base models. This characteristic implies that needs substantial computing resources, preventing their use resource-limited environments, as edge devices. To extend applicability learning, we propose LightTS...
Traffic time series forecasting is challenging due to complex spatio-temporal dynamics from different locations often have distinct patterns; and for the same series, patterns may vary across time, where, example, there exist certain periods a day showing stronger temporal correlations. Although recent models, in particular deep learning based show promising results, they suffer being agnostic. Such agnostic models employ shared parameter space irrespective of assume that are similar do not...