Paul Pu Liang

ORCID: 0000-0001-7768-3610
Research Areas
  • Topic Modeling
  • Multimodal Machine Learning Applications
  • Natural Language Processing Techniques
  • Speech and Dialogue Systems
  • Speech Recognition and Synthesis
  • Sentiment Analysis and Opinion Mining
  • Music and Audio Processing
  • Emotion and Mood Recognition
  • Domain Adaptation and Few-Shot Learning
  • Speech and Audio Processing
  • Explainable Artificial Intelligence (XAI)
  • Privacy-Preserving Technologies in Data
  • Machine Learning and Algorithms
  • Mental Health via Writing
  • Neural Networks and Applications
  • Text Readability and Simplification
  • Human Pose and Action Recognition
  • Machine Learning in Healthcare
  • Machine Learning and Data Classification
  • Anomaly Detection Techniques and Applications
  • Face and Expression Recognition
  • Hate Speech and Cyberbullying Detection
  • Computational and Text Analysis Methods
  • Multi-Agent Systems and Negotiation
  • Online Learning and Analytics

Affiliations

Carnegie Mellon University
2017-2024

RIKEN Center for Advanced Intelligence Project
2023

Mongolia International University
2023

The University of Tokyo
2019

Machine Science
2018

East Stroudsburg University
2018

University of Notre Dame
2018

University of Edinburgh
2018

Tata Consultancy Services (India)
2018

Publications

Yao-Hung Hubert Tsai, Shaojie Bai, Paul Pu Liang, J. Zico Kolter, Louis-Philippe Morency, Ruslan Salakhutdinov. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.

10.18653/v1/p19-1656 article EN cc-by 2019-01-01

AmirAli Bagher Zadeh, Paul Pu Liang, Soujanya Poria, Erik Cambria, Louis-Philippe Morency. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2018.

10.18653/v1/p18-1208 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01

Zhun Liu, Ying Shen, Varun Bharadhwaj Lakshminarasimhan, Paul Pu Liang, AmirAli Bagher Zadeh, Louis-Philippe Morency. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2018.

10.18653/v1/p18-1209 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01

Multi-view sequential learning is a fundamental problem in machine learning dealing with multi-view sequences. In a multi-view sequence, there exist two forms of interactions between different views: view-specific interactions and cross-view interactions. In this paper, we present a new neural architecture for multi-view sequential learning called the Memory Fusion Network (MFN) that explicitly accounts for both forms of interactions and continuously models them through time. The first component of the MFN is the System of LSTMs, where view-specific interactions are learned in isolation by assigning an LSTM function to each view. Cross-view interactions are then...

10.1609/aaai.v32i1.12021 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-27
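
To make the System-of-LSTMs idea above concrete, here is a minimal PyTorch sketch (not the paper's implementation): one LSTM per view learns view-specific interactions, while a gated memory summarizes cross-view interactions through time. The gating rule and all dimensions are illustrative assumptions rather than the paper's Delta-memory Attention Network.

```python
# Minimal sketch: one LSTMCell per view, plus a gated memory that fuses
# the concatenated hidden states at every time step.
import torch
import torch.nn as nn

class GatedMultiViewFusion(nn.Module):
    def __init__(self, view_dims, hidden=32, mem=64):
        super().__init__()
        # View-specific interactions: an independent LSTMCell per view.
        self.cells = nn.ModuleList([nn.LSTMCell(d, hidden) for d in view_dims])
        cat = hidden * len(view_dims)
        self.gate = nn.Linear(cat, mem)      # how much of the memory to overwrite
        self.update = nn.Linear(cat, mem)    # proposed cross-view summary

    def forward(self, views):                # views: list of (B, T, d_v) tensors
        B, T, _ = views[0].shape
        hc = [(torch.zeros(B, self.cells[0].hidden_size),) * 2 for _ in views]
        memory = torch.zeros(B, self.gate.out_features)
        for t in range(T):
            hc = [cell(v[:, t], state) for cell, v, state in zip(self.cells, views, hc)]
            h_cat = torch.cat([h for h, _ in hc], dim=-1)
            g = torch.sigmoid(self.gate(h_cat))
            memory = (1 - g) * memory + g * torch.tanh(self.update(h_cat))
        return memory                         # fused multi-view summary

# Example: language/vision/acoustic views of a 20-step sequence.
fused = GatedMultiViewFusion([300, 35, 74])([torch.randn(4, 20, d) for d in (300, 35, 74)])
```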

Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand this communication; however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape human communication. In this paper, we present a novel neural architecture for understanding human communication called...

10.1609/aaai.v32i1.12024 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-27

Humans convey their intentions through the usage of both verbal and nonverbal behaviors during face-to-face communication. Speaker intentions often vary dynamically depending on different nonverbal contexts, such as vocal patterns and facial expressions. As a result, when modeling human language, it is essential to not only consider the literal meaning of the words but also the nonverbal contexts in which these words appear. To better model human language, we first learn expressive representations by analyzing the fine-grained visual and acoustic patterns that occur during word segments. In...

10.1609/aaai.v33i01.33017216 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17
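
A toy sketch of the word-shifting idea described above, assuming PyTorch: a gate computed from the visual and acoustic features scales a displacement vector that shifts the word embedding. The module name and feature sizes are hypothetical, not the paper's exact formulation.

```python
# Illustrative gated shift of a word embedding by nonverbal context.
import torch
import torch.nn as nn

class NonverbalShift(nn.Module):
    def __init__(self, d_word=300, d_visual=35, d_acoustic=74):
        super().__init__()
        d_nv = d_visual + d_acoustic
        self.gate = nn.Linear(d_word + d_nv, 1)       # how much to shift
        self.displacement = nn.Linear(d_nv, d_word)   # direction of the shift

    def forward(self, word, visual, acoustic):
        nonverbal = torch.cat([visual, acoustic], dim=-1)
        g = torch.sigmoid(self.gate(torch.cat([word, nonverbal], dim=-1)))
        return word + g * self.displacement(nonverbal)

shift = NonverbalShift()
shifted = shift(torch.randn(8, 300), torch.randn(8, 35), torch.randn(8, 74))
```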

Multimodal sentiment analysis is a core research area that studies speaker sentiment expressed from the language, visual, and acoustic modalities. The central challenge in multimodal learning involves inferring joint representations that can process and relate information across these modalities. However, existing work learns joint representations by requiring all modalities as input and, as a result, the learned representations may be sensitive to noisy or missing modalities at test time. With the recent success of sequence to sequence (Seq2Seq) models in machine translation, there is an opportunity to explore new...

10.1609/aaai.v33i01.33016892 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17
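
A minimal sketch of the modality-translation idea, under the assumption of a GRU-based Seq2Seq model in PyTorch: translate a source modality into a target modality during training, and keep the encoder state as a joint representation that needs only the source modality at test time. Sizes and the teacher-forcing setup are placeholders.

```python
# Encode one modality, decode another; the encoder state is the joint
# representation and survives a missing target modality at test time.
import torch
import torch.nn as nn

class ModalityTranslator(nn.Module):
    def __init__(self, d_src, d_tgt, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(d_src, hidden, batch_first=True)
        self.decoder = nn.GRU(d_tgt, hidden, batch_first=True)
        self.out = nn.Linear(hidden, d_tgt)

    def forward(self, src, tgt_shifted):
        _, h = self.encoder(src)               # h: (1, B, hidden) joint repr.
        dec, _ = self.decoder(tgt_shifted, h)  # teacher forcing during training
        return self.out(dec), h.squeeze(0)

model = ModalityTranslator(d_src=300, d_tgt=74)
src, tgt = torch.randn(4, 20, 300), torch.randn(4, 20, 74)
pred, joint = model(src, torch.roll(tgt, 1, dims=1))
loss = nn.functional.mse_loss(pred, tgt)       # translation objective
```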

With the increasing popularity of video sharing websites such as YouTube and Facebook, multimodal sentiment analysis has received increasing attention from the scientific community. Contrary to previous works which focus on holistic information in speech segments such as bag-of-words representations and average facial expression intensity, we propose a novel deep architecture for multimodal sentiment analysis that is able to perform modality fusion at the word level. In this paper, the Gated Multimodal Embedding LSTM with Temporal Attention (GME-LSTM(A)) model...

10.1145/3136755.3136801 preprint EN 2017-11-03

Federated learning is a method of training models on private data distributed over multiple devices. To keep device data private, the global model is trained by only communicating parameters and updates, which poses scalability challenges for large models. To this end, we propose a new federated learning algorithm that jointly learns compact local representations on each device and a global model across all devices. As a result, the global model can be smaller since it only operates on local representations, reducing the number of communicated parameters. Theoretically, we provide...

10.48550/arxiv.2001.01523 preprint EN other-oa arXiv (Cornell University) 2020-01-01
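
A rough PyTorch sketch of the local-representation idea: each device keeps a private local encoder, and only the smaller global head is downloaded, updated, uploaded, and averaged FedAvg-style. The sizes, optimizer, and data below are placeholders, not the paper's setup.

```python
# Only the global head's parameters cross the network; encoders stay local.
import copy
import torch
import torch.nn as nn

local_encoders = [nn.Linear(512, 32) for _ in range(3)]   # stay on device
global_head = nn.Linear(32, 10)                           # communicated

def train_round(global_head, local_encoders):
    updated = []
    for enc in local_encoders:
        head = copy.deepcopy(global_head)                 # download global head
        opt = torch.optim.SGD(list(enc.parameters()) + list(head.parameters()), lr=0.1)
        x, y = torch.randn(16, 512), torch.randint(0, 10, (16,))  # device data
        loss = nn.functional.cross_entropy(head(enc(x)), y)
        opt.zero_grad(); loss.backward(); opt.step()
        updated.append(head.state_dict())                 # upload head only
    # Server: average only the global head's parameters.
    avg = {k: torch.stack([sd[k] for sd in updated]).mean(0) for k in updated[0]}
    global_head.load_state_dict(avg)

train_round(global_head, local_encoders)
```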

Computational modeling of human multimodal language is an emerging research area in natural language processing spanning the language, visual and acoustic modalities. Comprehending multimodal language requires modeling not only the interactions within each modality (intra-modal interactions) but more importantly the interactions between modalities (cross-modal interactions). In this paper, we propose the Recurrent Multistage Fusion Network (RMFN) which decomposes the fusion problem into multiple stages, each of them focused on a subset of multimodal signals for specialized,...

10.18653/v1/d18-1014 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

Learning multimodal representations is a fundamentally complex research problem due to the presence of multiple heterogeneous sources of information. Although the presence of multiple modalities provides additional valuable information, there are two key challenges to address when learning from multimodal data: 1) models must learn the intra-modal and cross-modal interactions for prediction and 2) models must be robust to unexpected missing or noisy modalities during testing. In this paper, we propose to optimize a joint generative-discriminative objective across multimodal data...

10.48550/arxiv.1806.06176 preprint EN other-oa arXiv (Cornell University) 2018-01-01
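
The joint objective can be illustrated in a few lines, assuming PyTorch: a shared encoder feeds both a decoder that reconstructs the multimodal inputs (generative term) and a classifier that predicts the label (discriminative term), with a weight trading the two off. This is a sketch of the general recipe, not the paper's exact model.

```python
# One encoder, two heads: reconstruction keeps the representation faithful
# to the inputs, classification keeps it predictive of the label.
import torch
import torch.nn as nn

encoder = nn.Linear(300 + 74, 64)
decoder = nn.Linear(64, 300 + 74)     # generative: reconstruct both modalities
classifier = nn.Linear(64, 2)         # discriminative: predict the label

x = torch.randn(8, 300 + 74)          # concatenated multimodal features
y = torch.randint(0, 2, (8,))
z = torch.tanh(encoder(x))
loss = nn.functional.mse_loss(decoder(z), x) \
     + 0.5 * nn.functional.cross_entropy(classifier(z), y)
loss.backward()
```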

Federated learning is an emerging research paradigm enabling collaborative training of machine learning models among different organizations while keeping data private at each institution. Despite recent progress, there remain fundamental challenges such as the lack of convergence and the potential for catastrophic forgetting across real-world heterogeneous devices. In this paper, we demonstrate that self-attention-based architectures (e.g., Transformers) are more robust to distribution shifts and hence improve...

10.1109/cvpr52688.2022.00982 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

As natural language processing methods are increasingly deployed in real-world scenarios such as healthcare, legal systems, and social science, it becomes necessary to recognize the role they potentially play in shaping social biases and stereotypes. Previous work has revealed the presence of social biases in widely used word embeddings involving gender, race, religion, and other social constructs. While some methods were proposed to debias these word-level embeddings, there is a need to perform debiasing at the sentence level given the recent shift towards...

10.18653/v1/2020.acl-main.488 article EN cc-by 2020-01-01
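
One common recipe for such debiasing, sketched below in NumPy under the assumption of a hard-debiasing approach: estimate a bias subspace from the principal directions of embedding differences between contrasting sentence pairs, then remove each sentence embedding's projection onto that subspace. All data here is random stand-in.

```python
# Hard-debiasing sketch: project out the estimated bias subspace.
import numpy as np

def debias(embeddings, pair_diffs, k=1):
    # Principal directions of the pair differences span the bias subspace.
    _, _, vt = np.linalg.svd(pair_diffs - pair_diffs.mean(0), full_matrices=False)
    bias_basis = vt[:k]                              # (k, d)
    proj = embeddings @ bias_basis.T @ bias_basis    # component in bias subspace
    return embeddings - proj

emb = np.random.randn(100, 768)        # e.g. sentence encoder outputs
diffs = np.random.randn(40, 768)       # embedding("he ...") - embedding("she ...")
clean = debias(emb, diffs)
```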

Recently, classifying the modulation schemes of signals using deep neural networks has received much attention. In this paper, we introduce a general model of deep neural network (DNN)-based classifiers for single-input single-output (SISO) systems. Its feasibility is analyzed using the maximum a posteriori probability (MAP) criterion and its robustness to uncertain noise conditions is compared to that of conventional maximum likelihood (ML)-based classifiers. To reduce the design and training cost of DNN classifiers, a simple but effective pre-processing...

10.1109/tvt.2019.2951594 article EN IEEE Transactions on Vehicular Technology 2019-11-05

As intelligent systems increasingly blend into our everyday life, artificial social intelligence becomes a prominent area of research. Intelligent systems must be socially intelligent in order to comprehend human intents and maintain a rich level of interaction with humans. Human language offers a unique unconstrained approach to probe social intelligence through questions and reason about answers regarding social situations. This extends previous attempts to model social intelligence with numeric supervision (e.g. sentiment and emotion labels). In this paper, we introduce Social-IQ,...

10.1109/cvpr.2019.00901 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design computer agents capable of understanding, reasoning, and learning through integrating multiple communicative modalities, including linguistic, acoustic, visual, tactile, and physiological messages. With the recent interest in video understanding, embodied autonomous agents, text-to-image generation, and multisensor fusion in healthcare and robotics, multimodality has brought unique computational and theoretical challenges to the community given...

10.1145/3610661.3617602 article EN 2023-10-09

There has been an increased interest in multimodal language processing including dialog, question answering, sentiment analysis, and speech recognition. However, naturally occurring multimodal data is often imperfect as a result of imperfect modalities, missing entries or noise corruption. To address these concerns, we present a regularization method based on tensor rank minimization. Our method is based on the observation that high-dimensional multimodal time series data often exhibit correlations across time and modalities which leads to low-rank...

10.18653/v1/p19-1152 article EN cc-by 2019-01-01
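
Since minimizing exact tensor rank is intractable, a standard convex surrogate is the nuclear norm (the sum of singular values). Below is a minimal PyTorch sketch of adding such a penalty to a task loss, with placeholder tensors standing in for the real features and objective; it is not the paper's exact regularizer.

```python
# Nuclear-norm penalty as a differentiable low-rank surrogate.
import torch

features = torch.randn(20, 409, requires_grad=True)  # T steps x fused features
task_loss = features.pow(2).mean()                    # stand-in for the real loss
nuclear_norm = torch.linalg.svdvals(features).sum()   # convex rank surrogate
(task_loss + 1e-3 * nuclear_norm).backward()
```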

Multimodal machine learning is a core research area spanning the language, visual and acoustic modalities. The central challenge in multimodal learning involves learning representations that can process and relate information from multiple modalities. In this paper, we propose two methods for unsupervised learning of joint multimodal representations using sequence to sequence (Seq2Seq) methods: a Seq2Seq Modality Translation Model and a Hierarchical Seq2Seq Modality Translation Model. We also explore different variations on the inputs and outputs of these seq2seq models. Our experiments on multimodal sentiment analysis using the CMU-MOSI...

10.18653/v1/w18-3308 article EN cc-by 2018-01-01

Modulation classification using deep neural networks has recently received increasing attention due to its capability in learning rich features of data. In this paper, we propose a low-complexity blind data-driven modulation classifier. Our classifier operates robustly over Rayleigh fading channels under uncertain noise conditions modeled as a mixture of three types of noise, namely, white Gaussian noise, non-Gaussian noise and correlated non-Gaussian noise. The proposed classifier consists of several layers of recurrent neural networks (RNN) which is...

10.1109/glocom.2018.8647582 article EN 2018 IEEE Global Communications Conference (GLOBECOM) 2018-12-01
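
A toy PyTorch sketch of an RNN classifier over I/Q sample sequences, with training batches corrupted by a mixture of Gaussian and sparse impulsive noise to mimic the uncertain-noise setting. The architecture, symbol model, and noise parameters are all illustrative assumptions, not the paper's design.

```python
# GRU over (I, Q) samples, trained on mixture-noise-corrupted signals.
import torch
import torch.nn as nn

class RNNModClassifier(nn.Module):
    def __init__(self, n_classes=4, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(2, hidden, num_layers=2, batch_first=True)  # (I, Q) per step
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, iq):                 # iq: (B, T, 2)
        _, h = self.rnn(iq)
        return self.fc(h[-1])              # last layer's final hidden state

def noisy_batch(b=16, t=128):
    symbols = torch.randn(b, t, 2).sign()  # toy QPSK-like symbols
    gaussian = 0.3 * torch.randn(b, t, 2)
    impulsive = 1.5 * torch.randn(b, t, 2) * (torch.rand(b, t, 1) < 0.05)
    return symbols + gaussian + impulsive  # mixture-noise corruption

logits = RNNModClassifier()(noisy_batch())
```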