Claire Cardie

ORCID: 0000-0002-2061-6094
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Text Analysis Techniques
  • Sentiment Analysis and Opinion Mining
  • Multimodal Machine Learning Applications
  • Speech and dialogue systems
  • Text Readability and Simplification
  • Misinformation and Its Impacts
  • Software Engineering Research
  • Text and Document Classification Technologies
  • Spam and Phishing Detection
  • Hate Speech and Cyberbullying Detection
  • Semantic Web and Ontologies
  • Data Mining Algorithms and Applications
  • Machine Learning and Data Classification
  • Machine Learning and Algorithms
  • Social Media and Politics
  • Expert finding and Q&A systems
  • Domain Adaptation and Few-Shot Learning
  • Wikis in Education and Collaboration
  • Complex Network Analysis Techniques
  • Web Data Mining and Analysis
  • AI-based Problem Solving and Planning
  • Algorithms and Data Compression
  • Anomaly Detection Techniques and Applications

Cornell University
2014-2023

The University of Texas at Dallas
2023

Bellevue Hospital Center
2022

Meta (Israel)
2021

Microsoft (United States)
2018-2019

Microsoft Research (United Kingdom)
2018

IBM Research - Zurich
2018

University of Pittsburgh
2018

University of North Carolina at Greensboro
2018

National Centre of Scientific Research "Demokritos"
2018

10.1007/s10579-005-7880-9 article EN Language Resources and Evaluation 2005-05-01

We present a noun phrase coreference system that extends the work of Soon et al. (2001) and, to our knowledge, produces the best results to date on the MUC-6 and MUC-7 coreference resolution data sets: F-measures of 70.4 and 63.4, respectively. Improvements arise from two sources: extra-linguistic changes to the learning framework and a large-scale expansion of the feature set to include more sophisticated linguistic knowledge.

10.3115/1073083.1073102 article EN 2001-01-01

We study automatic question generation for sentences from text passages in reading comprehension. We introduce an attention-based sequence learning model for the task and investigate the effect of encoding sentence- vs. paragraph-level information. In contrast to all previous work, our model does not rely on hand-crafted rules or a sophisticated NLP pipeline; it is instead trainable end-to-end via sequence-to-sequence learning. Automatic evaluation results show that our system significantly outperforms...

10.18653/v1/p17-1123 article EN cc-by Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2017-01-01

Eneko Agirre, Carmen Banea, Claire Cardie, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Weiwei Guo, Rada Mihalcea, German Rigau, Janyce Wiebe. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). 2014.

10.3115/v1/s14-2010 article EN Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) 2014-01-01

Eneko Agirre, Carmen Banea, Claire Cardie, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Weiwei Guo, Iñigo Lopez-Gazpio, Montse Maritxalar, Rada Mihalcea, German Rigau, Larraitz Uria, Janyce Wiebe. Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). 2015.

10.18653/v1/s15-2045 article EN cc-by Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015) 2015-01-01

OpinionFinder is a system that performs subjectivity analysis, automatically identifying when opinions, sentiments, speculations, and other private states are present in text. Specifically, OpinionFinder aims to identify subjective sentences and to mark various aspects of the subjectivity in these sentences, including the source (holder) of the subjectivity and words that are included in phrases expressing positive or negative sentiments.

10.3115/1225733.1225751 article EN 2005-01-01

Consumers increasingly rate, review and research products online. Consequently, websites containing consumer reviews are becoming targets of opinion spam. While recent work has focused primarily on manually identifiable instances of opinion spam, in this work we study deceptive opinion spam---fictitious opinions that have been deliberately written to sound authentic. Integrating work from psychology and computational linguistics, we develop and compare three approaches to detecting deceptive opinion spam, and ultimately develop a classifier that is nearly 90% accurate on our...

10.48550/arxiv.1107.4557 preprint EN other-oa arXiv (Cornell University) 2011-01-01

Recurrent neural networks (RNNs) are connectionist models of sequential data that are naturally applicable to the analysis of natural language. Recently, “depth in space” (as an orthogonal notion to “depth in time”) in RNNs has been investigated by stacking multiple layers of RNNs and shown empirically to bring a temporal hierarchy to the architecture. In this work we apply these deep RNNs to the task of opinion expression extraction formulated as a token-level sequence-labeling task. Experimental results show that deep, narrow RNNs outperform traditional...

10.3115/v1/d14-1080 article EN cc-by 2014-01-01

Recent systems have been developed for sentiment classification, opinion recognition, and opinion analysis (e.g., detecting polarity and strength). We pursue another aspect of opinion analysis: identifying the sources of opinions, emotions, and sentiments. We view this problem as an information extraction task and adopt a hybrid approach that combines Conditional Random Fields (Lafferty et al., 2001) and a variation of AutoSlog (Riloff, 1996a). While CRFs model source identification as a sequence tagging task, AutoSlog learns extraction patterns. Our results...

10.3115/1220575.1220620 article EN 2005-01-01

Determining the polarity of a sentiment-bearing expression requires more than a simple bag-of-words approach. In particular, words or constituents within the expression can interact with each other to yield a particular overall polarity. In this paper, we view such subsentential interactions in light of compositional semantics, and present a novel learning-based approach that incorporates structural inference motivated by compositional semantics into the learning procedure. Our experiments show that (1) heuristics based on compositional semantics perform better...

10.3115/1613715.1613816 article EN 2008-01-01

The problem of event extraction requires detecting the event trigger and extracting its corresponding arguments. Existing work in event argument extraction typically relies heavily on entity recognition as a preprocessing/concurrent step, causing the well-known problem of error propagation. To avoid this issue, we introduce a new paradigm for event extraction by formulating it as a question answering (QA) task that extracts the event arguments in an end-to-end manner. Empirical results demonstrate that our framework outperforms prior methods substantially; in addition, it is...

10.18653/v1/2020.emnlp-main.49 article EN cc-by 2020-01-01

Consumers' purchase decisions are increasingly influenced by user-generated online reviews. Accordingly, there has been growing concern about the potential for posting deceptive opinion spam---fictitious reviews that have been deliberately written to sound authentic, to deceive the reader. In this paper, we explore generalized approaches for identifying online deceptive opinion spam based on a new gold standard dataset, which is comprised of data from three different domains (i.e. Hotel, Restaurant, Doctor), each of which contains three types of reviews,...

10.3115/v1/p14-1147 article EN cc-by 2014-01-01

In recent years great success has been achieved in sentiment classification for English, thanks in part to the availability of copious annotated resources. Unfortunately, most languages do not enjoy such an abundance of labeled data. To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exist. ADAN has two discriminative branches: a sentiment classifier and...

10.1162/tacl_a_00039 article EN cc-by Transactions of the Association for Computational Linguistics 2018-12-01

Consumers' purchase decisions are increasingly influenced by user-generated online reviews. Accordingly, there has been growing concern about the potential for posting deceptive opinion spam---fictitious reviews that have been deliberately written to sound authentic, to deceive the reader. But while this practice has received considerable public attention and concern, relatively little is known about the actual prevalence, or rate, of deception in online review communities, and less still about the factors that influence it.

10.1145/2187836.2187864 article EN 2012-04-16

We present a novel attention-based recurrent neural network for joint extraction of entity mentions and relations. We show that attention along with a long short term memory (LSTM) network can extract semantic relations between entity mentions without having access to dependency trees. Experiments on Automatic Content Extraction (ACE) corpora show that our model significantly outperforms the feature-based joint model by Li and Ji (2014). We also compare our model with an end-to-end tree-based LSTM model (SPTree) by Miwa and Bansal (2016) and show that our model performs within 1% on entity mentions and 2% on relations. Our fine-grained...

10.18653/v1/p17-1085 article EN cc-by Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2017-01-01

We investigate the efficacy of topic model based approaches to two multi-aspect sentiment analysis tasks: multi-aspect sentence labeling and multi-aspect rating prediction. For sentence labeling, we propose a weakly-supervised approach that utilizes only minimal prior knowledge, in the form of seed words, to enforce a direct correspondence between topics and aspects. This correspondence is used to label sentences with performance that approaches a fully supervised baseline. For multi-aspect rating prediction, we find that overall ratings can be used in conjunction with our sentence labelings to achieve reasonable performance compared to a fully supervised baseline. When...

10.1109/icdmw.2011.125 article EN 2011-12-01

Arzoo Katiyar, Claire Cardie. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.

10.18653/v1/n18-1079 article EN cc-by 2018-01-01

We present DREAM, the first dialogue-based multiple-choice reading comprehension data set. Collected from English as a Foreign Language examinations designed by human experts to evaluate the comprehension level of Chinese learners of English, our data set contains 10,197 multiple-choice questions for 6,444 dialogues. In contrast to existing reading comprehension data sets, DREAM is the first to focus on in-depth multi-turn multi-party dialogue understanding. DREAM is likely to present significant challenges for existing reading comprehension systems: 84% of answers are non-extractive, 85% of questions require reasoning beyond a single sentence,...

10.1162/tacl_a_00264 article EN cc-by Transactions of the Association for Computational Linguistics 2019-04-29