- Anomaly Detection Techniques and Applications
- Data Management and Algorithms
- Data-Driven Disease Surveillance
- Complex Network Analysis Techniques
- Human Mobility and Location-Based Analysis
- Traffic Prediction and Management Techniques
- Geographic Information Systems Studies
- Topic Modeling
- Advanced Statistical Methods and Models
- Advanced Database Systems and Queries
- Data Mining Algorithms and Applications
- Advanced Text Analysis Techniques
- Advanced Graph Neural Networks
- Network Security and Intrusion Detection
- Natural Language Processing Techniques
- Time Series Analysis and Forecasting
- Spam and Phishing Detection
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Misinformation and Its Impacts
- Music and Audio Processing
- Web Data Mining and Analysis
- Algorithms and Data Compression
- Sentiment Analysis and Opinion Mining
- Data Stream Mining Techniques
Virginia Tech
2016-2025
Institute for Forecasting of the Slovak Academy of Sciences
2025
China South Industries Group (China)
2024
Stevens Institute of Technology
2024
Boston Children's Museum
2024
Boston Children's Hospital
2024
Kuwait University
2023
Microsoft (United States)
2023
University of Virginia
2013-2014
University of Houston - Victoria
2014
Due to the dramatic increase of fraud, which results in the loss of billions of dollars worldwide each year, several modern techniques for detecting fraud are continually developed and applied to many business fields. Fraud detection involves monitoring the behavior of populations of users in order to estimate, detect, or avoid undesirable behavior. Undesirable behavior is a broad term including delinquency, fraud, intrusion, and account defaulting. This paper presents a survey of current techniques used in credit card fraud detection, telecommunication fraud detection, and computer...
With the advance of sensor technologies, the Multivariate Time Series classification (MTSC) problem, perhaps one of the most essential problems in the time series data mining domain, has continuously received a significant amount of attention in recent decades. Traditional approaches based on Bag-of-Patterns or Shapelet have difficulty dealing with the huge amounts of feature candidates generated in high-dimensional multivariate data but have promising performance even when the training set is small. In contrast, deep learning methods...
We describe the design, implementation, and evaluation of EMBERS, an automated, 24x7 continuous system for forecasting civil unrest across 10 countries of Latin America using open source indicators such as tweets, news sources, blogs, economic indicators, and other data sources. Unlike retrospective studies, EMBERS has been making forecasts into the future since Nov 2012 which have been (and continue to be) evaluated by an independent T&E team (MITRE). Of note, EMBERS has successfully forecast the June 2013 protests in...
A quantitative analysis of tweets during the Ebola crisis reveals that lies, half-truths, and rumors can spread just like true news.
Spatial event forecasting from social media is an important problem but encounters critical challenges, such as dynamic patterns of features (keywords) and geographic heterogeneity (e.g., spatial correlations, imbalanced samples, and different populations in different locations). Most existing approaches (e.g., LASSO regression, query expansion, and burst detection) are designed to address some of these challenges, but not all of them. This paper proposes a novel multi-task learning framework which aims to concurrently address the challenges....
Identification of travelers' transportation modes is a fundamental step for various problems that arise in the domain of transportation, such as travel demand analysis, transport planning, and traffic management. In this paper, we aim to identify travelers' transportation modes purely based on their GPS trajectories. First, a segmentation process is developed to partition a user's trip into segments with only one transportation mode. A majority of studies have proposed mode inference models based on hand-crafted features, which might be vulnerable to environmental conditions....
Social media is often viewed as a sensor into various societal events such as disease outbreaks, protests, and elections. We describe the use of social media as a crowdsourced sensor to gain insight into ongoing cyber-attacks. Our approach detects a broad range of cyber-attacks (e.g., distributed denial of service (DDoS) attacks, data breaches, and account hijacking) in a weakly supervised manner using just a small set of seed event triggers and requires no training or labeled samples. A new query expansion strategy based on convolution...
Amidst the COVID-19 pandemic, cyberbullying has become an even more serious threat. Our work aims to investigate the viability of an automatic multiclass cyberbullying detection model that is able to classify whether a cyberbully is targeting a victim's age, ethnicity, gender, religion, or other quality. Previous literature has not yet explored making fine-grained cyberbullying classifications of such magnitude, and existing cyberbullying datasets suffer from quite severe class imbalances. To combat these challenges, we establish a framework for...
Statement of problem: AI technology presents a variety of benefits and challenges for educators. Purpose: To investigate whether ChatGPT and Bard are valuable resources for generating multiple-choice questions for educators of dental caries. Material and methods: A book on dental caries was used. Sixteen paragraphs were extracted by an expert consultant based on applicability and potential for developing multiple-choice questions. The ChatGPT and Bard language models were used to produce multiple-choice questions based on this input, and 64 questions were generated. Three specialists assessed the relevance, accuracy, and complexity...
Spatial databases, addressing the growing data management and analysis needs of spatial applications such as geographic information systems, have been an active area of research for more than two decades. This research has produced a taxonomy of models for space, spatial data types and operators, spatial query languages and processing strategies, as well as spatial indexes and clustering techniques. However, more research is needed to improve support for network and field data, as well as query processing (e.g., cost models, bulk load). Another important need is to apply the spatial data management accomplishments to newer applications,...
Identification of outliers can lead to the discovery of unexpected, interesting, and useful knowledge. Existing methods are designed for detecting spatial outliers in multidimensional geometric data sets, where a distance metric is available. In this paper, we focus on detecting spatial outliers in graph structured data sets. We define statistical tests, analyze the statistical foundation underlying our approach, design several fast algorithms to detect spatial outliers, and provide a cost model for outlier detection procedures. In addition, we provide experimental results from...
A spatial outlier is a spatially referenced object whose non-spatial attribute values are significantly different from the values of its neighborhood. Identification of spatial outliers can lead to the discovery of unexpected, interesting, and useful spatial patterns for further analysis. One drawback of existing methods is that normal objects tend to be falsely detected as spatial outliers when their neighborhood contains true spatial outliers. We propose a suite of spatial outlier detection algorithms to overcome this disadvantage. We formulate the spatial outlier detection problem in a general way and design algorithms which...
Spatial outliers are the spatial objects with distinct features from their surrounding neighbors. Detection of spatial outliers helps reveal valuable information from large spatial data sets. In many real applications, spatial objects cannot be simply abstracted as isolated points. They have different boundary, size, volume, and location. These spatial properties affect the impact of a spatial object on its neighbors and should be taken into consideration. In this paper, we propose two spatial outlier detection methods which integrate the impact of spatial properties into the outlierness measurement. Experimental...
This study explores the factors that significantly impact the acceptance of Wireless Internet via Mobile Technology (WIMT) in China. The results indicate that the acceptance of WIMT is related to perceived usefulness, ease of use, social influences, wireless trust environment, and facilitating conditions. It provides diagnostic insight into how different factors influence user intention to accept WIMT in China, and thus helps businesses build a solid strategy to promote m-commerce there.
Crowd sourcing is based on a simple but powerful concept: virtually anyone has the potential to plug in valuable information. The concept revolves around large groups of people or a community handling tasks that have traditionally been associated with a specialist or a small group of experts. With the advent of smart devices, many mobile applications are already tapping into crowd sourcing to report community issues and traffic problems, but more can be done. While most of these applications work well for the average user, they neglect the information needs...
Infectious disease epidemics such as influenza and Ebola pose a serious threat to global public health. It is crucial to characterize the evolution of an ongoing epidemic efficiently and accurately. Computational epidemiology can model the disease progress and the underlying contact network, but suffers from the lack of real-time and fine-grained surveillance data. Social media, on the other hand, provides timely and detailed disease surveillance, but is insensible to the underlying contact network and disease model. This paper proposes a novel semi-supervised deep learning framework that...
Traffic prediction is critical for the success of intelligent transportation systems (ITS). However, most spatio-temporal models suffer from high mathematical complexity and low tune-up flexibility. This article presents a novel spatio-temporal random effects (STRE) model that has reduced computational complexity due to mathematical dimension reduction, with additional tune-up flexibility provided by a basis function capable of taking traffic patterns into account. Bellevue, WA, was selected as the model test site due to its widespread deployment of loop...
EMBERS is an anticipatory intelligence system forecasting population-level events in multiple countries of Latin America. A deployed system from 2012, EMBERS has been generating alerts 24x7 by ingesting a broad range of data sources including news, blogs, tweets, machine coded events, currency rates, and food prices. In this paper, we describe our experiences operating EMBERS continuously for nearly 4 years, with specific attention to the discoveries it has enabled, correct as well as missed forecasts, and lessons learnt...
Event forecasting in Twitter is an important and challenging problem. Most existing approaches focus on forecasting temporal events (such as elections and sports) and do not consider spatial features and their underlying correlations. In this paper, we propose a generative model for spatiotemporal event forecasting in Twitter. Our model characterizes the underlying development of future events by jointly modeling the structural contexts and spatiotemporal burstiness. An effective inference algorithm is developed to train the model parameters. Utilizing the trained model, the alignment likelihood...