Muchao Ye

ORCID: 0009-0006-9112-8895
Research Areas
  • Topic Modeling
  • Machine Learning in Healthcare
  • Adversarial Robustness in Machine Learning
  • Multimodal Machine Learning Applications
  • Artificial Intelligence in Healthcare
  • Natural Language Processing Techniques
  • Traffic Prediction and Management Techniques
  • Domain Adaptation and Few-Shot Learning
  • COVID-19 diagnosis using AI
  • Privacy-Preserving Technologies in Data
  • Anomaly Detection Techniques and Applications
  • Internet Traffic Analysis and Secure E-voting
  • Interpreting and Communication in Healthcare
  • Text Readability and Simplification
  • Security and Verification in Computing
  • Physical Unclonable Functions (PUFs) and Hardware Security
  • Cardiac Arrest and Resuscitation
  • Human Pose and Action Recognition
  • Advanced Malware Detection Techniques
  • Digital Media Forensic Detection
  • Biomedical Text Mining and Ontologies
  • Video Analysis and Summarization

Pennsylvania State University
2020-2024

Amazon (United States)
2024

UC San Diego Health System
2021

Microsoft (Germany)
2021

University of Electronic Science and Technology of China
2020

Purdue University West Lafayette
2020

Georgia State University
2020

Deep learning methods, especially recurrent neural network based models, have demonstrated early success in disease risk prediction on longitudinal patient data. Existing works implicitly assume stationary progression during each time period and thus decay information from previous steps in a homogeneous way for all patients. However, in reality, disease progression is non-stationary. Besides, key target vary among To leverage more reasonable way, we propose a new hierarchical...

10.1145/3394486.3403107 article EN 2020-08-20

The broad adoption of electronic health records (EHR) data and the availability of biomedical knowledge graphs (KGs) on the web have provided clinicians and researchers with unprecedented resources and opportunities for conducting risk predictions to improve healthcare quality and medical resource allocation. Existing methods have focused on improving EHR feature representations using attention mechanisms, time-aware models, or external knowledge. However, they ignore the importance of personalized information to make predictions....

10.1145/3442381.3449860 article EN 2021-04-19

This paper focuses on a newly challenging setting in hard-label adversarial attacks on text data by taking the budget information into account. Although existing approaches can successfully generate adversarial examples in this setting, they follow an ideal assumption that the victim model does not restrict the number of queries. However, in real-world applications the query budget is usually tight or limited. Moreover, existing attack techniques use a genetic algorithm to optimize discrete text data by maintaining candidates during optimization, which lead...

10.1609/aaai.v36i4.20303 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

Risk prediction using electronic health records (EHR) is a challenging data mining task due to the two-level hierarchical structure of EHR data. EHR data consist of a set of time-ordered visits, and within each visit, there are unordered diagnosis codes. Existing approaches focus on modeling temporal visits with deep neural network (DNN) techniques. However, they ignore the importance of codes, and a lot of task-unrelated information usually leads to unsatisfactory performance of existing approaches. To minimize the effect caused by noise...

10.1145/3340531.3411864 article EN 2020-10-19

The development of electronic health records (EHR) systems has enabled the collection of a vast amount of digitized patient data. However, utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics. With advancements in machine learning techniques, deep learning has demonstrated superiority in various applications, including healthcare. This survey systematically reviews recent advances in deep learning-based models using EHR data. Specifically, we introduce the background and provide...

10.24963/ijcai.2024/914 article EN 2024-07-26

Xingyi Yang, Muchao Ye, Quanzeng You, Fenglong Ma. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.

10.18653/v1/2021.acl-long.387 article EN cc-by 2021-01-01

Generating text adversarial examples in the hard-label setting is a more realistic and challenging black-box attack problem, whose challenge comes from the fact that the gradient cannot be directly calculated from discrete word replacements. Consequently, the effectiveness of gradient-based methods for this problem still awaits improvement. In this paper, we propose an optimization method named LeapAttack to craft high-quality adversarial examples in this setting. Specifically, LeapAttack employs the embedding space to characterize the semantic deviation between two...

10.1145/3534678.3539357 article EN Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022-08-12

The broad adoption of electronic health record (EHR) systems and the advances of deep learning technology have motivated the development of risk prediction models, which mainly depend on the expressiveness and temporal modeling capacity of deep neural networks (DNNs) to improve performance. Some are further augmented by using external knowledge; however, a great deal of EHR information is inevitably lost during knowledge mapping. In addition, the prediction made by existing models usually lacks reliable interpretation, which undermines their...

10.1145/3459637.3482273 article EN 2021-10-26

Despite a plethora of prior explorations, conducting text adversarial attacks in practical settings is still challenging with the following constraints: black box -- the inner structure of the victim model is unknown; hard label -- the attacker only has access to the top-1 prediction results; and semantic preservation -- the perturbation needs to preserve the original semantics. In this paper, we present PAT, a novel attack method employed under all these constraints. Specifically, PAT explicitly models non-adversarial prototypes...

10.1145/3580305.3599461 article EN Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023-08-04

Thanks to the explosion of heterogeneous healthcare data and advanced machine learning and data mining techniques, specifically deep learning methods, we now have an opportunity to make a difference in healthcare. In this tutorial, we will present state-of-the-art deep learning methods and their real-world applications, focusing on exploring the unique characteristics of different types of healthcare data. The first half will be spent introducing recent advances on structured data, including computational phenotyping, disease early detection/risk prediction...

10.1145/3447548.3470789 article EN 2021-08-13

Federated learning (FL) has emerged as an effective technique for co-training machine learning models without actually sharing data and leaking privacy. However, most existing FL methods focus on the supervised setting and ignore the utilization of unlabeled data. Although there are a few studies trying to incorporate unlabeled data into FL, they all fail to maintain performance guarantees or generalization ability in various real-world settings. In this paper, we focus on designing a general framework, FedSiam, to tackle different scenarios...

10.48550/arxiv.2012.03292 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Audio video scene-aware dialog (AVSD) is a new but more challenging visual question answering (VQA) task because of the higher complexity of feature extraction and fusion brought by the additional modalities. Although recent methods have achieved early success in improving techniques for AVSD, it still needs further investigation. In this paper, inspired by the self-attention mechanism and the importance of understanding questions for VQA answering, we propose a question-guided self-attentive multi-modal network (QUALIFIER)...

10.1109/wacv51458.2022.00256 article EN 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022-01-01

Visual Question Answering (VQA) is a fundamental task in the computer vision and natural language processing fields. Although the “pre-training & finetuning” learning paradigm significantly improves VQA performance, the adversarial robustness of such a paradigm has not been explored. In this paper, we delve into a new problem: using a pre-trained multimodal source model to create image-text pairs and then transferring them to attack target models. Correspondingly, we propose a novel VQATTACK model, which can iteratively...

10.1609/aaai.v38i7.28499 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

Vision-Language (VL) pre-trained models have shown their superiority on many multimodal tasks. However, the adversarial robustness of such models has not been fully explored. Existing approaches mainly focus on exploring it under the white-box setting, which is unrealistic. In this paper, we aim to investigate a new yet practical task: crafting image and text perturbations using VL models to attack black-box fine-tuned models on different downstream tasks. Towards this end, we propose VLATTACK to generate samples by fusing images and texts from both...

10.48550/arxiv.2310.04655 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Researchers have conducted adversarial attacks against deep neural networks (DNNs) for health risk prediction in the white/gray-box setting to evaluate their robustness. However, since most real-world solutions are trained on private data and released as black-box services on the cloud, we should investigate their robustness in the black-box setting. Unfortunately, existing work neglects to consider the uniqueness of electronic health records (EHRs). To fill this gap, we propose the first black-box attack method against health risk prediction models, named MedAttacker, to investigate their vulnerability....

10.1109/bibm55620.2022.9994898 article EN 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2022-12-06

Health risk prediction is a challenging task that aims to predict whether patients would suffer from a certain disease/condition in the near future based on their historical EHR data. Although existing approaches can achieve better performance, none of them deal with noise in the data explicitly. In this paper, we hypothesize that automatically removing noise should help models further improve performance. Correspondingly, we propose a novel model named MedSkim, which is able to rule out irrelevant visits and codes by...

10.1109/icdm54844.2022.00018 article EN 2022 IEEE International Conference on Data Mining (ICDM) 2022-11-01

10.1109/cvprw63382.2024.00299 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2024-06-17

The rapid advancement of vision-language models (VLMs) has established a new paradigm in video anomaly detection (VAD): leveraging VLMs to simultaneously detect anomalies and provide comprehendible explanations for the decisions. Existing work in this direction often assumes that the complex reasoning required for VAD exceeds the capabilities of pretrained VLMs. Consequently, these approaches either incorporate specialized modules during inference or rely on instruction tuning datasets through additional training...

10.48550/arxiv.2412.01095 preprint EN arXiv (Cornell University) 2024-12-01

The development of electronic health records (EHR) systems has enabled the collection of a vast amount of digitized patient data. However, utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics. With advancements in machine learning techniques, deep learning has demonstrated superiority in various applications, including healthcare. This survey systematically reviews recent advances in deep learning-based models using EHR data. Specifically, we begin by introducing the background and...

10.48550/arxiv.2402.01077 preprint EN arXiv (Cornell University) 2024-02-01

Visual Question Answering (VQA) is a fundamental task in the computer vision and natural language processing fields. Although the “pre-training & finetuning” learning paradigm significantly improves VQA performance, the adversarial robustness of such a paradigm has not been explored. In this paper, we delve into a new problem: using a pre-trained multimodal source model to create image-text pairs and then transferring them to attack target models. Correspondingly, we propose a novel VQAttack model, which can iteratively...

10.48550/arxiv.2402.11083 preprint EN arXiv (Cornell University) 2024-02-16

Patients with low health literacy usually have difficulty understanding medical jargon and the complex structure of professional language. Although some studies have been proposed to automatically translate expert language into layperson-understandable language, only a few of them focus on both accuracy and readability aspects simultaneously in the clinical domain. Thus, simplification is still a challenging task, but unfortunately, it is not yet fully addressed in previous work. To benchmark this task, we construct a new...

10.48550/arxiv.2012.02420 preprint EN cc-by arXiv (Cornell University) 2020-01-01

Medical report generation is one of the most challenging tasks in medical image analysis. Although existing approaches have achieved promising results, they either require a predefined template database in order to retrieve sentences or ignore the hierarchical nature of report generation. To address these issues, we propose MedWriter, which incorporates a novel hierarchical retrieval mechanism to automatically extract both report- and sentence-level templates for clinically accurate report generation. MedWriter first employs the Visual-Language Retrieval (VLR) module...

10.48550/arxiv.2106.06471 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Deep neural networks (DNNs) have been broadly adopted in health risk prediction to provide healthcare diagnoses and treatments. To evaluate their robustness, existing research conducts adversarial attacks in the white/gray-box setting, where model parameters are accessible. However, a more realistic black-box attack is ignored, even though most real-world models are trained with private data and released as black-box services on the cloud. To fill this gap, we propose the first black-box attack method against health risk prediction models, named MedAttacker, to investigate...

10.48550/arxiv.2112.06063 preprint EN cc-by arXiv (Cornell University) 2021-01-01