- Mobile Crowdsensing and Crowdsourcing
- Anomaly Detection Techniques and Applications
- Adversarial Robustness in Machine Learning
- Human Mobility and Location-Based Analysis
- Misinformation and Its Impacts
- Topic Modeling
- Privacy-Preserving Technologies in Data
- Hate Speech and Cyberbullying Detection
- Advanced Malware Detection Techniques
- Complex Network Analysis Techniques
- Traffic Prediction and Management Techniques
- Indoor and Outdoor Localization Technologies
- Domain Adaptation and Few-Shot Learning
- Network Security and Intrusion Detection
- Data Stream Mining Techniques
- Advanced Graph Neural Networks
- Advanced Computational Techniques and Applications
- Explainable Artificial Intelligence (XAI)
- COVID-19 diagnosis using AI
- Geographic Information Systems Studies
- Context-Aware Activity Recognition Systems
- Video Surveillance and Tracking Methods
- Spam and Phishing Detection
- Water Systems and Optimization
- Data-Driven Disease Surveillance
University of Illinois Urbana-Champaign
2022-2025
Boston University
2024
Bohai University
2024
Shaoyang University
2024
Chongqing Medical University
2024
China National Petroleum Corporation (China)
2024
Helmholtz Center for Information Security
2018-2023
University of Notre Dame
2017-2023
Beijing University of Chemical Technology
2023
North China University of Technology
2022-2023
Machine learning (ML) has been widely adopted in various privacy-critical applications, e.g., face recognition and medical image analysis. However, recent research has shown that ML models are vulnerable to attacks against their training data. Membership inference is one major attack in this domain: given a data sample and a model, an adversary aims to determine whether the sample is part of the model's training set. Existing membership inference attacks leverage the confidence scores returned by the model as inputs (score-based attacks). However, these attacks can be...
The promise of smart environments and the Internet of Things (IoT) relies on robust sensing of diverse environmental facets. Traditional approaches rely on direct or distributed sensing, most often by measuring one particular aspect of an environment with a special purpose sensor. This approach can be costly to deploy, hard to maintain, and aesthetically and socially obtrusive. In this work, we explore the notion of general purpose sensing, wherein a single enhanced sensor can indirectly monitor a large context, without direct instrumentation of objects....
The development of positioning technologies has resulted in an increasing amount of mobility data being available. While bringing a lot of convenience to people's life, such availability also raises serious concerns about privacy. In this paper, we concentrate on one of the most sensitive information that can be inferred from mobility data, namely social relationships. We propose a novel social relation inference attack that relies on an advanced feature learning technique to automatically summarize users' mobility features. Compared...
Machine learning (ML) has progressed rapidly during the past decade, and the major factor that drives such development is the unprecedented large-scale data. As data generation is a continuous process, this leads to ML model owners updating their models frequently with newly-collected data in an online learning scenario. In consequence, if an ML model is queried with the same set of data samples at two different points in time, it will provide different results. In this paper, we investigate whether the change in the output of a black-box ML model before and after being updated can leak...
Deep learning has achieved overwhelming success, spanning from discriminative models to generative models. In particular, deep generative models have facilitated a new level of performance in a myriad of areas, ranging from media manipulation to sanitized dataset generation. Despite the great success, the potential risks of privacy breach caused by generative models have not been analyzed systematically. In this paper, we focus on membership inference attack against deep generative models that reveals information about the training data used for victim models. Specifically, we present the first taxonomy...
Identifying trustworthy information in the presence of noisy data contributed by numerous unvetted sources from online social media (e.g., Twitter, Facebook, and Instagram) has been a crucial task in the era of big data. This task, referred to as truth discovery, targets at identifying the reliability of the sources and the truthfulness of the claims they make without knowing either a priori. In this work, we identified three important challenges that have not been well addressed in the current truth discovery literature. The first one is “misinformation...
The misuse of large language models (LLMs) has garnered significant attention from the general public and LLM vendors. In response, efforts have been made to align LLMs with human values and intent use. However, a particular type of adversarial prompts, known as jailbreak prompt, has emerged and continuously evolved to bypass the safeguards and elicit harmful content from LLMs. In this paper, we conduct the first measurement study on jailbreak prompts in the wild, with 6,387 prompts collected from four platforms over six months. Leveraging natural language processing...
Outlier detection in wireless sensor networks is essential to ensure data quality, secure monitoring and reliable detection of interesting and critical events. A key challenge for outlier detection in wireless sensor networks is to adaptively identify outliers in an online manner with a high accuracy while maintaining the resource consumption of the network to a minimum. In this paper, we propose one-class support vector machine-based outlier detection techniques that sequentially update the model representing normal behavior of the sensed data and take advantage of spatial and temporal correlations that exist...
Artificial Intelligence (AI) has been widely adopted in many important application domains such as speech recognition, computer vision, autonomous driving, and AI for social good. In this paper, we focus on the AI-based damage assessment applications where deep neural network approaches are used to automatically identify the damage severity of impacted areas from imagery reports in the aftermath of a disaster (e.g., earthquake, hurricane, landslides). While AI algorithms often significantly reduce the detection time...
This paper studies an emerging and important problem of identifying misleading COVID-19 short videos where the misleading content is jointly expressed in the visual, audio, and textual content of videos. Existing solutions for misleading video detection mainly focus on the authenticity of videos or audios against AI algorithms (e.g., deepfake) or video manipulation, and are insufficient to address our problem where most videos are user-generated and intentionally edited. Two critical challenges exist in solving our problem: i) how to effectively extract information from the distractive and manipulated...
Many real-world data come in the form of graphs. Graph neural networks (GNNs), a new family of machine learning (ML) models, have been proposed to fully leverage graph data to build powerful applications. In particular, the inductive GNNs, which can generalize to unseen data, become mainstream in this direction. Machine learning models have shown great potential in various tasks and have been deployed in many real-world scenarios. To train a good model, a large amount of data as well as computational resources are needed, leading to valuable intellectual property....
The proliferation of social media has promoted the spread of misinformation that raises many concerns in our society. This paper focuses on a critical problem of explainable COVID-19 misinformation detection that aims to accurately identify and explain misleading COVID-19 claims on social media. Motivated by the lack of COVID-19 relevant knowledge in existing solutions, we construct a novel crowdsource knowledge graph based approach to incorporate COVID-19 knowledge facts by leveraging the collaborative efforts of expert and non-expert crowd workers. Two important challenges exist in developing our solution:...
This paper focuses on a critical problem of explainable multimodal COVID-19 misinformation detection where the goal is to accurately detect misleading information in multimodal COVID-19 news articles and provide the reason or evidence that can explain the detection results. Our work is motivated by the lack of judicious study of the association between different modalities (e.g., text and image) of the COVID-19 news content in current solutions. In this paper, we present a generative approach to detect multimodal COVID-19 misinformation by investigating the cross-modal association between the visual and textual content that is deeply embedded in the multimodal news content. Two challenges...
With the ever-increasing number of road traffic accidents worldwide, road traffic safety has become a critical problem in intelligent transportation systems. A key step towards improving road traffic safety is to identify the locations where severe traffic accidents happen with a high probability so the precautions can be applied effectively. We refer to this problem as risky traffic location identification. While previous efforts have been made to address similar problems, two important limitations exist: i) data availability: many cities (especially in developing countries)...
Social media sensing has emerged as a new big data application paradigm to collect observations and claims about the measured variables in the physical environment from common citizens. A fundamental problem in social media sensing applications lies in estimating the evolving truth of the measured variables and the reliability of data sources without knowing either of them a priori, which is referred to as dynamic truth discovery. We identified two critical challenges that are not fully addressed by the solutions in current literature. The first challenge is "physical...
Forecasting traffic accidents at a fine-grained spatial scale is essential to provide effective precautions and improve traffic safety in smart urban sensing applications. Current solutions primarily rely on complete historical traffic accident records and/or accurate real-time traffic sensor data for traffic risk prediction. These solutions are prone to various limitations (e.g., facility availability, privacy and legal constraints). In this paper, we address those limitations by exploring two types of widely available and complementary data sources: social...
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against...
We aimed to develop a machine-learning based predictive model to identify 30-day readmission risk in acute heart failure (AHF) patients. In this study, 2232 patients hospitalized with AHF were included. The variance inflation factor value and 5-fold cross-validation were used to select vital clinical variables. Five machine learning algorithms with good performance were applied to develop models, and the discrimination ability was comprehensively evaluated by sensitivity, specificity, and area under the ROC curve (AUC). Prediction...