Tong Niu

ORCID: 0009-0007-3453-0738
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Multimodal Machine Learning Applications
  • Speech and Dialogue Systems
  • Speech Recognition and Synthesis
  • Machine Learning in Healthcare
  • Software Engineering Research
  • Text Readability and Simplification
  • Advanced Text Analysis Techniques
  • Scientific Computing and Data Management
  • Semantic Web and Ontologies
  • Online Learning and Analytics
  • Biomedical Text Mining and Ontologies
  • Advanced Neural Network Applications
  • Adversarial Robustness in Machine Learning
  • Emotion and Mood Recognition
  • Advanced Sensor and Control Systems
  • Evolutionary Algorithms and Applications
  • Privacy-Preserving Technologies in Data
  • Human Pose and Action Recognition
  • Blockchain Technology Applications and Security
  • Privacy, Security, and Data Protection
  • Machine Learning in Materials Science
  • Speech and Audio Processing
  • Advanced Adaptive Filtering Techniques

Salesforce (United States)
2021-2023

PLA Information Engineering University
2022

Sichuan University
2021

University of North Carolina Health Care
2018-2020

University of North Carolina at Chapel Hill
2018-2020

System Equipment (China)
2018

North China University of Technology
2017

Harbin University of Science and Technology
2014

Stylistic dialogue response generation, with valuable applications in personality-based conversational agents, is a challenging task because the response needs to be fluent, contextually relevant, as well as paralinguistically accurate. Moreover, parallel datasets for regular-to-stylistic pairs are usually unavailable. We present three weakly-supervised models that can generate diverse, polite (or rude) responses without parallel data. Our late fusion model (Fusion) merges the decoder of an encoder-attention-decoder...

10.1162/tacl_a_00027 article EN cc-by Transactions of the Association for Computational Linguistics 2018-12-01

Sweta Karlekar, Tong Niu, Mohit Bansal. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). 2018.

10.18653/v1/n18-2110 article EN cc-by 2018-01-01

We present two categories of model-agnostic adversarial strategies that reveal the weaknesses of several generative, task-oriented dialogue models: Should-Not-Change strategies evaluate over-sensitivity to small and semantics-preserving edits, while Should-Change strategies test whether a model is over-stable against subtle yet semantics-changing modifications. We next perform adversarial training with each strategy, employing a max-margin approach for negative generative examples. This not only makes the target model more robust to the adversarial inputs, but...

10.18653/v1/k18-1047 article EN cc-by 2018-01-01
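The max-margin idea in the abstract above can be sketched as a hinge loss over paired scores. This is an illustrative reconstruction, not the paper's exact training objective; the function name and example scores are hypothetical.

```python
def max_margin_loss(pos_score, neg_score, margin=1.0):
    """Hinge loss that pushes the model's score for the gold response
    (given the original input) above its score under an adversarially
    perturbed input by at least `margin`."""
    return max(0.0, margin - pos_score + neg_score)

# Scores could be, e.g., response log-likelihoods under the dialogue model.
already_separated = max_margin_loss(pos_score=-1.2, neg_score=-4.0)  # 0.0
violated = max_margin_loss(pos_score=-1.2, neg_score=-1.0)           # ~1.2
```

When the gold score already beats the adversarial score by the margin, the loss is zero; otherwise the gradient pushes the two scores apart.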

Dialogue state trackers have made significant progress on benchmark datasets, but their generalization capability to novel and realistic scenarios beyond the held-out conversations is less understood. We propose controllable counterfactuals (CoCo) to bridge this gap and evaluate dialogue state tracking (DST) models on such scenarios, i.e., would the system successfully tackle the request if the user responded differently but still consistently with the dialogue flow? CoCo leverages turn-level belief states as counterfactual conditionals...

10.48550/arxiv.2010.12850 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Asking good questions is an essential ability for both human and machine intelligence. However, existing neural question generation approaches mainly focus on short factoid types of answers. In this paper, we introduce a question generator, MixQG, to bridge this gap. We combine nine question answering datasets with diverse answer types, including yes/no, multiple-choice, extractive, and abstractive answers, to train a single generative model. We show with empirical results that our model outperforms existing work in both seen and unseen domains, and can...

10.18653/v1/2022.findings-naacl.111 article EN cc-by Findings of the Association for Computational Linguistics: NAACL 2022 2022-01-01

Tong Niu, Mohit Bansal. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1132 article EN cc-by 2019-01-01

Paraphrase generation has benefited extensively from recent progress in the design of training objectives and model architectures. However, previous explorations have largely focused on supervised methods, which require a large amount of labeled data that is costly to collect. To address this drawback, we adopt a transfer learning approach and propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting. Our recipe consists of task-adaptation,...

10.18653/v1/2021.emnlp-main.417 article EN cc-by Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021-01-01

We explore the design of Marvista, a human-AI collaborative tool that employs a suite of natural language processing models to provide end-to-end support for reading online news articles. Before reading an article, Marvista helps the user plan what to read by filtering text based on how much time one can spend and which questions one is interested to find out from the article. During reading, Marvista helps the user reflect on their understanding of each paragraph with AI-generated questions. After reading, Marvista generates an explainable summary that combines the AI's processing of the text, the user's...

10.1145/3609331 article EN ACM Transactions on Computer-Human Interaction 2023-07-21

Despite significant advancements in the general capability of large language models (LLMs), they continue to struggle with consistent and accurate reasoning, especially on complex tasks such as mathematical and code reasoning. One key limitation is that LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors, which hampers their capability to reliably verify and rank outputs. To address this, we scale up inference-time computation by generating multiple reasoning paths and employing...

10.48550/arxiv.2410.05318 preprint EN arXiv (Cornell University) 2024-10-05
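One common way to aggregate multiple sampled reasoning paths at inference time (not necessarily the exact method of the paper above) is self-consistency-style majority voting over final answers; the sketch below uses hypothetical toy data.

```python
from collections import Counter

def majority_vote(answers):
    """Self-consistency-style aggregation: the most frequent final answer
    across independently sampled reasoning paths wins."""
    return Counter(answers).most_common(1)[0][0]

# Final answers extracted from five independently sampled reasoning paths.
paths = ["42", "42", "41", "42", "40"]
best = majority_vote(paths)  # "42"
```

Verifier- or reward-based ranking, as the abstract suggests, replaces the raw count with a learned score per path, but the aggregation skeleton is the same.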

Automatic data augmentation (AutoAugment) (Cubuk et al., 2019) searches for optimal perturbation policies via a controller trained using performance rewards of a sampled policy on the target task, hence reducing data-level model bias. While being a powerful algorithm, their work has focused on computer vision tasks, where it is comparatively easy to apply imperceptible perturbations without changing an image's semantic meaning. In our work, we adapt AutoAugment to automatically discover effective...

10.48550/arxiv.1909.12868 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Large Language Models (LLMs) have become ubiquitous across various domains, transforming the way we interact with information and conduct research. However, most high-performing LLMs remain confined behind proprietary walls, hindering scientific progress. Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over a long input context. To address this, we trained XGen, a series of 7B...

10.48550/arxiv.2309.03450 preprint EN other-oa arXiv (Cornell University) 2023-01-01

In Location Based Services (LBSs), service providers can obtain mobile users' locations or traces while receiving their requests. K-anonymity, which is the most commonly used location privacy protection method, needs cooperation among users to form a k-anonymous group. Though several incentive mechanisms have been proposed to motivate users to participate in such a group, most of them rely on a 'trustful' center. In this paper, we propose a distributed secure incentive mechanism that applies blockchain smart contracts for...

10.1109/pac.2017.33 article EN 2017-08-01
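The k-anonymous grouping described above is often realized by spatial cloaking: instead of an exact location, the group reports a region covering at least k users. A minimal sketch, with hypothetical coordinates and function name, independent of the paper's blockchain incentive layer:

```python
def cloaking_region(locations, k):
    """Spatial-cloaking sketch: return a bounding box covering at least k
    cooperating users, so an LBS query issued from the region is
    indistinguishable among those k users."""
    if len(locations) < k:
        raise ValueError("need at least k cooperating users")
    xs = [x for x, _ in locations]
    ys = [y for _, y in locations]
    return (min(xs), min(ys)), (max(xs), max(ys))

# Four users pooling their (x, y) positions to form a 4-anonymous group.
group = [(2.0, 3.0), (2.5, 3.2), (1.8, 2.9), (2.2, 3.5)]
region = cloaking_region(group, k=4)  # ((1.8, 2.9), (2.5, 3.5))
```

The incentive mechanism in the paper addresses a different question: why rational users would contribute their locations to such a group at all, without a trusted coordinator.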

Text compression has diverse applications such as summarization, reading comprehension, and editing. However, almost all existing approaches require either hand-crafted features, syntactic labels, or parallel data. Even for the one approach that achieves this task in an unsupervised setting, its architecture necessitates a task-specific autoencoder. Moreover, these models can only generate one compressed sentence for each source input, so adapting to different style requirements (e.g., length) for the final output usually...

10.48550/arxiv.1909.03223 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Human-computer interaction (HCI) is a multidisciplinary field of study focusing on the design of computer technology and, in particular, the interactions between humans and computers. Public space, such as urban buildings, is an open area accessible to people. Public life, happening in public spaces, is about human activity, interaction, and expression of feelings in the wild. Affective behavior analysis in public spaces is a basic topic of public life research, which is key to achieving HCI applications through comprehensively understanding people's feelings, emotions,...

10.1109/iccvw54120.2021.00404 article EN 2021-10-01

Stylistic dialogue response generation, with valuable applications in personality-based conversational agents, is a challenging task because the response needs to be fluent, contextually relevant, as well as paralinguistically accurate. Moreover, parallel datasets for regular-to-stylistic pairs are usually unavailable. We present three weakly-supervised models that can generate diverse polite (or rude) responses without parallel data. Our late fusion model (Fusion) merges the decoder of an encoder-attention-decoder...

10.48550/arxiv.1805.03162 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Many sequence-to-sequence dialogue models tend to generate safe, uninformative responses. There have been various useful efforts to try to eliminate them. However, these approaches either improve decoding algorithms during inference, rely on hand-crafted features, or employ complex models. In our work, we build models that are dynamically aware of which utterances or tokens are dull, without any feature engineering. Specifically, we start with a simple yet effective automatic metric, AvgOut, which calculates the...

10.1609/aaai.v34i05.6378 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03
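An AvgOut-style statistic, as the abstract sketches it, averages the decoder's per-step output distributions; tokens with a high average probability across steps tend to be the "dull" ones. A minimal sketch with a hypothetical toy vocabulary, not the paper's actual implementation:

```python
def avg_out(step_distributions):
    """Average the decoder's per-step output probability distributions.
    Vocabulary entries with a high average probability across decoding
    steps are candidate 'dull' tokens."""
    n = len(step_distributions)
    return {tok: sum(d[tok] for d in step_distributions) / n
            for tok in step_distributions[0]}

# Toy 3-token vocabulary over two decoding steps.
steps = [
    {"i": 0.6, "dont": 0.3, "know": 0.1},
    {"i": 0.2, "dont": 0.5, "know": 0.3},
]
avg = avg_out(steps)  # "i" and "dont" average higher than "know"
```

In the paper's setting, such a statistic can then be fed back into training to penalize distributions that concentrate mass on chronically high-average tokens.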

Aligning parallel sentences in multilingual corpora is essential to curating data for downstream applications such as Machine Translation. In this work, we present OneAligner, an alignment model specially designed for sentence retrieval tasks. This model is able to train on only one language pair and transfers, in a cross-lingual fashion, to low-resource language pairs with negligible degradation in performance. When trained on all language pairs of a large-scale parallel corpus (OPUS-100), this model achieves the state-of-the-art result on the Tatoeba dataset,...

10.18653/v1/2022.findings-acl.226 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2022 2022-01-01

The field of natural language generation has witnessed significant advancements in recent years, including the development of controllable text generation techniques. However, controlling the attributes of generated text remains a challenge, especially when aiming to avoid undesirable behavior such as toxicity. In this work, we introduce Detoxification Generator (DETOXIGEN), an inference-time algorithm that steers generation away from unwanted styles. DETOXIGEN is an ensemble of a pre-trained language model (generator) and a detoxifier...

10.48550/arxiv.2401.06947 preprint EN other-oa arXiv (Cornell University) 2024-01-01
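Generator-plus-detoxifier ensembles of this kind are often implemented as contrastive decoding: tokens the detoxifier (a model tuned toward the unwanted style) scores highly get down-weighted at each step. The sketch below is a generic contrastive-steering illustration with hypothetical names and toy log-probabilities, not DETOXIGEN's exact combination rule:

```python
import math

def steer_logprobs(gen_logprobs, detox_logprobs, alpha=0.5):
    """Contrastive steering sketch: combined = generator - alpha * detoxifier,
    renormalized into a proper log-probability distribution."""
    combined = {t: gen_logprobs[t] - alpha * detox_logprobs[t]
                for t in gen_logprobs}
    z = math.log(sum(math.exp(v) for v in combined.values()))
    return {t: v - z for t, v in combined.items()}

# Toy vocabulary: the detoxifier strongly prefers "insult", so the
# steered distribution demotes it relative to the generator alone.
gen = {"hello": -1.0, "insult": -1.0}
detox = {"hello": -5.0, "insult": -0.1}
steered = steer_logprobs(gen, detox)
```

Here the generator is indifferent between the two tokens, but subtracting the detoxifier's preference pushes nearly all of the steered probability mass onto "hello".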

Recently, many studies focus on utilizing large language models (LLMs) in educational dialogues. Especially within liberal arts dialogues, educators must balance Humanized communication, Teaching expertise, and Safety-ethics (HTS), besides the subject knowledge itself. However, because collecting massive amounts of HTS-compliant teaching dialogues from the real world as a training corpus is expensive, the outputs of existing LLMs in this setting fall short of human standards. To...

10.48550/arxiv.2409.15461 preprint EN arXiv (Cornell University) 2024-09-23

We explore the design of Marvista -- a human-AI collaborative tool that employs a suite of natural language processing models to provide end-to-end support for reading online news articles. Before reading an article, Marvista helps the user plan what to read by filtering text based on how much time one can spend and which questions one is interested to find out from the article. During reading, Marvista helps the user reflect on their understanding of each paragraph with AI-generated questions. After reading, Marvista generates an explainable summary that combines both the AI's processing of the text, the user's...

10.48550/arxiv.2207.08401 preprint EN cc-by arXiv (Cornell University) 2022-01-01