Yi Liu

ORCID: 0000-0002-0811-6150
Research Areas
  • Privacy-Preserving Technologies in Data
  • Topic Modeling
  • Natural Language Processing Techniques
  • Cryptography and Data Security
  • Traffic Prediction and Management Techniques
  • Speech Recognition and Synthesis
  • Blockchain Technology Applications and Security
  • Geological and Geochemical Analysis
  • Landslides and related hazards
  • Adversarial Robustness in Machine Learning
  • Geological and Geophysical Studies
  • Geochemistry and Geochronology of Asian Mineral Deposits
  • Advanced Computational Techniques and Applications
  • Caching and Content Delivery
  • IoT and Edge/Fog Computing
  • Internet Traffic Analysis and Secure E-voting
  • Hydrocarbon exploration and reservoir analysis
  • Advanced Algorithms and Applications
  • Web Data Mining and Analysis
  • Network Security and Intrusion Detection
  • Geochemistry and Geologic Mapping
  • Simulation and Modeling Applications
  • Mobile Crowdsensing and Crowdsourcing
  • Synthetic Aperture Radar (SAR) Applications and Techniques
  • Parallel Computing and Optimization Techniques

University of Alberta
2019-2025

Beijing Institute of Fashion Technology
2014-2025

China Electric Power Research Institute
2021-2024

First Automotive Works (China)
2024

City University of Hong Kong
2021-2024

North China Electric Power University
2005-2024

National University of Defense Technology
2022-2024

Wuhan Polytechnic University
2024

Yan'an University
2008-2024

Beijing University of Posts and Telecommunications
2013-2023

Because edge device failures (i.e., anomalies) seriously affect the production of industrial products in the Industrial IoT (IIoT), detecting anomalies accurately and in a timely manner is becoming increasingly important. Furthermore, data collected by edge devices may contain users' private data, which challenges current detection approaches as user privacy has drawn growing public concern in recent years. With this focus, this paper proposes a new communication-efficient on-device federated learning (FL)-based deep anomaly detection framework...

10.1109/jiot.2020.3011726 article EN IEEE Internet of Things Journal 2020-07-24

As 5G communication networks are being widely deployed worldwide, both industry and academia have started to move beyond 5G and explore 6G communications. It is generally believed that 6G will be established on ubiquitous Artificial Intelligence (AI) to achieve data-driven Machine Learning (ML) solutions in heterogeneous and massive-scale networks. However, traditional ML techniques require centralized data collection and processing by a central server, which is becoming the bottleneck of large-scale implementation...

10.23919/jcc.2020.09.009 article EN China Communications 2020-09-01

Traffic speed prediction, as one of the most important topics in Intelligent Transport Systems (ITS), has been investigated thoroughly in the literature. Nonetheless, traditional methods show their limitations in coping with the complexity and high nonlinearity of traffic data as well as in learning spatial-temporal dependencies. In particular, they often neglect the dynamics happening to the traffic network. Attention-based models have witnessed extensive development in recent years and have shown their efficacy in a host of fields, which inspires us...

10.1109/access.2019.2953888 article EN cc-by IEEE Access 2019-01-01

Federated learning has recently emerged as a paradigm promising the benefits of harnessing rich data from diverse sources to train high-quality models, with the salient feature that training datasets never leave local devices. Only model updates are locally computed and shared for aggregation to produce a global model. While federated learning greatly alleviates privacy concerns as opposed to learning with centralized data, sharing model updates still poses privacy risks. In this paper, we present a system design which offers efficient protection...

10.1109/tdsc.2022.3146448 article EN IEEE Transactions on Dependable and Secure Computing 2022-01-27

Conventional machine learning approaches aggregate all training data in a central server, which causes massive communication overhead for data transmission and is also vulnerable to privacy leakage. Therefore, blockchain-based federated learning has emerged to protect Artificial Intelligence of Things (AIoT) devices from exposing their private data through the Federated Learning (FL) framework, which enables decentralized model training without the vulnerability of a central server. However, existing FL systems still suffer from (i) limited scalability single...

10.1109/tnse.2022.3178970 article EN IEEE Transactions on Network Science and Engineering 2022-05-30

In Machine Learning, the emergence of "the right to be forgotten" gave birth to a paradigm named "machine unlearning", which enables data holders to proactively erase their data from a trained model. Existing machine unlearning techniques focus on centralized training, where access to all holders' training data is a must for the server to conduct the unlearning process. It remains largely underexplored how to achieve unlearning when full access to the training data becomes unavailable. One noteworthy example is Federated Learning (FL), where each participating...

10.1109/infocom48880.2022.9796721 article EN IEEE INFOCOM 2022 - IEEE Conference on Computer Communications 2022-05-02

Large Language Models (LLMs), like ChatGPT, have demonstrated vast potential but also introduce challenges related to content constraints and misuse. Our study investigates three key research questions: (1) the number of different prompt types that can jailbreak LLMs, (2) the effectiveness of jailbreak prompts in circumventing LLM constraints, and (3) the resilience of ChatGPT against these prompts. Initially, we develop a classification model to analyze the distribution of existing prompts, identifying ten distinct patterns...

10.48550/arxiv.2305.13860 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Existing traffic flow forecasting approaches based on deep learning models achieve excellent success thanks to a large volume of datasets gathered by governments and organizations. However, these datasets may contain lots of users' private data, which challenges current prediction approaches as user privacy has drawn growing public concern in recent years. Therefore, how to develop accurate traffic prediction while preserving privacy is a significant problem to be solved, and there is a trade-off between these two objectives. To address this challenge, we introduce...

10.1109/jiot.2020.2991401 article EN IEEE Internet of Things Journal 2020-04-30

GPS trajectories serve as a significant data source for travel mode identification along with the development of various GPS-enabled smart devices. However, such data directly integrate users' private information, thus hindering users from sharing data with third parties. On the other hand, existing identification methods heavily depend on respective manual annotations, whose production is economically inefficient and error-prone. In this paper, we propose a Semi-supervised Federated Learning (SSFL) framework that can accurately...

10.1109/tits.2021.3092015 article EN IEEE Transactions on Intelligent Transportation Systems 2021-08-16

Large Language Models (LLMs), renowned for their superior proficiency in language comprehension and generation, stimulate a vibrant ecosystem of applications around them. However, their extensive assimilation into various services introduces significant security risks. This study deconstructs the complexities and implications of prompt injection attacks on actual LLM-integrated applications. Initially, we conduct an exploratory analysis of ten commercial applications, highlighting the constraints of current attack...

10.48550/arxiv.2306.05499 preprint EN cc-by arXiv (Cornell University) 2023-01-01

The Generative Adversarial Network (GAN) and its variants serve as a perfect representation of the data generation model, providing researchers with a large amount of high-quality generated data. They illustrate a promising direction for research with limited data availability. When a GAN learns a semantic-rich data distribution from a dataset, the density of the learned distribution tends to concentrate on the training data. Because the gradient parameters of the deep neural network contain the data distribution of the training samples, they can easily remember the training samples. When a GAN is applied to private or sensitive data,...

10.1109/icpads47876.2019.00150 preprint EN 2019-12-01

In the explosive growth of time-series data (TSD), the scale of TSD suggests that the scale and capability of many Internet of Things (IoT)-based applications have already been exceeded. Moreover, redundancy persists in TSD due to the correlation between information acquired via different sources. In this article, we propose a cohort of dominant set selection algorithms for electricity consumption TSD, with a focus on discriminating the dominant set that is small but capable of representing the kernel information carried by TSD with an arbitrary error rate less than ...

10.1109/jiot.2019.2946753 article EN IEEE Internet of Things Journal 2019-10-10

The rapid development of the Internet of Things (IoT) accumulates a large number of communication records, which are utilized for anomaly detection in IoT communication. However, only a small part of these records can be labeled, which increases the difficulty of anomaly detection. This article proposes a semisupervised hierarchical stacking temporal convolutional network (HS-TCN), which is the first such model for IoT communication, and it can be trained on unlabeled data based on labeled data. Furthermore, HS-TCN fully considers the features of streaming data to weed out...

10.1109/jiot.2020.3000771 article EN IEEE Internet of Things Journal 2020-06-09

Federated learning (FL) has recently been proposed as an emerging paradigm to build machine learning models using distributed training datasets that are locally stored and maintained on different devices in 5G networks while providing privacy preservation for participants. In FL, the central aggregator accumulates local updates uploaded by participants to update a global model. However, there are two critical security threats: poisoning and membership inference attacks. These attacks may be carried out...

10.1109/mwc.01.1900525 article EN IEEE Wireless Communications 2020-08-01

Because air quality significantly affects human health, it is becoming increasingly important to predict the Air Quality Index (AQI) accurately and in a timely manner. To this end, this paper proposes a new federated learning-based aerial-ground sensing framework for fine-grained 3D air quality monitoring and forecasting. Specifically, in the air, the framework leverages a light-weight Dense-MobileNet model to achieve energy-efficient end-to-end learning from haze features of images taken by Unmanned Aerial Vehicles (UAVs) for predicting the AQI scale...

10.1109/jiot.2020.3021006 article EN IEEE Internet of Things Journal 2020-09-01

Real-world integrated personalized recommendation systems usually deal with millions of heterogeneous items. It is extremely challenging to conduct full corpus retrieval with complicated models due to the tremendous computation costs. Hence, most large-scale recommendation systems consist of two modules: a multi-channel matching module to efficiently retrieve a small subset of candidates, and a ranking module for precise recommendation. However, the matching module suffers from cold-start problems when adding new channels or data sources. To solve this issue,...

10.24963/ijcai.2020/379 article EN 2020-07-01

Long queries often suffer from low recall in Web search due to conjunctive term matching. The chances of matching words in relevant documents can be increased by rewriting query terms into new terms with similar statistical properties. We present a comparison of approaches that deploy user query logs to learn rewrites of query terms into terms from the document space. We show that the best results are achieved by adopting the perspective of bridging the “lexical chasm” between queries and documents by translating from a source language of queries into a target language of documents. We train a state-of-the-art machine translation...

10.1162/coli_a_00010 article EN Computational Linguistics 2010-07-27

Federated Edge Learning (FEL) allows edge nodes to train a global deep learning model collaboratively for edge computing in the Industrial Internet of Things (IIoT), which significantly promotes the development of Industry 4.0. However, FEL faces two critical challenges: communication overhead and data privacy. FEL suffers from expensive communication overhead when training large-scale multi-node models. Furthermore, due to its vulnerability to gradient leakage and label-flipping attacks, the training process is easily compromised by adversaries. To address these...

10.1145/3453169 article EN ACM Transactions on Internet Technology 2021-12-06

Large Language Models (LLMs) have become increasingly popular for their advanced text generation capabilities across various domains. However, like any software, they face security challenges, including the risk of 'jailbreak' attacks that manipulate LLMs to produce prohibited content. A particularly underexplored area is the Multilingual Jailbreak attack, where malicious questions are translated into various languages to evade safety filters. Currently, there is a lack of comprehensive empirical studies...

10.48550/arxiv.2401.16765 preprint EN arXiv (Cornell University) 2024-01-30