- Topic Modeling
- Natural Language Processing Techniques
- Sentiment Analysis and Opinion Mining
- Social Media and Politics
- Misinformation and Its Impacts
- Computational and Text Analysis Methods
- Advanced Text Analysis Techniques
- Hate Speech and Cyberbullying Detection
- Complex Network Analysis Techniques
- Legal principles and applications
- Software Engineering Research
- Advanced Graph Neural Networks
- Online Learning and Analytics
- Multimodal Machine Learning Applications
- Law, Economics, and Judicial Systems
- Mental Health via Writing
- Advanced Malware Detection Techniques
- Spam and Phishing Detection
- Intelligent Tutoring Systems and Adaptive Learning
- Semantic Web and Ontologies
- Opinion Dynamics and Social Influence
- Speech and dialogue systems
- Law, logistics, and international trade
- AI-based Problem Solving and Planning
- Insurance and Financial Risk Management
Purdue University West Lafayette
2016-2025
IT University of Copenhagen
2023
Tokyo Institute of Technology
2023
Administration for Community Living
2023
American Jewish Committee
2023
University of Pennsylvania
2023
Microsoft (United States)
2022
University of Colorado Boulder
2022
Imperial College London
2021
Saarland University
2021
Identifying the political perspective shaping the way news events are discussed in the media is an important and challenging task. In this paper, we highlight the importance of contextualizing social information, capturing how this information is disseminated in social networks. We use Graph Convolutional Networks, a recently proposed neural architecture for representing relational information, to capture the documents’ social context. We show that social information can be used effectively as a source of distant supervision, and when direct supervision is available, even...
Maintaining and cultivating student engagement is critical for learning. Understanding the factors affecting student engagement will help in designing better courses and improving student retention. The large number of participants in massive open online courses (MOOCs) and the data collected from their interaction with the MOOC open up avenues for studying student engagement at scale. In this work, we develop a framework for modeling and understanding student engagement in online courses based on student behavioral cues. Our first contribution is the abstraction of student engagement types using latent representations. We use that abstraction in a probabilistic model to...
Automated Program Repair (APR) improves software reliability by generating patches for a buggy program automatically. Recent APR techniques leverage deep learning (DL) to build models that learn to generate patches from existing patches and code corpora. While promising, DL-based APR techniques suffer from the abundant syntactically or semantically incorrect patches in the patch space. These patches often disobey the syntactic and semantic domain knowledge of source code and thus cannot be the correct patches to fix a bug. We propose a DL-based APR approach, KNOD, which incorporates domain knowledge to guide patch generation...
Discussion forums serve as a platform for student discussions in massive open online courses (MOOCs). Analyzing content in these forums can uncover useful information for improving student retention and help in initiating instructor intervention. In this work, we explore the use of topic models, particularly seeded topic models, toward this goal. We demonstrate that features derived from topic analysis help in predicting student survival.
Instructor intervention in student discussion forums is a vital component of Massive Open Online Courses (MOOCs), where personalized interaction is limited. This paper introduces the problem of predicting instructor interventions in MOOC forums. We propose several prediction models designed to capture unique aspects of MOOCs, combining course information, forum structure, and post content. Our models abstract the contents of individual threads using latent categories, learned jointly with the binary prediction problem. Experiments...
Previous works in computer science, as well as political and social science, have shown a correlation in text between political ideologies and the moral foundations expressed within that text. Additional work has shown that policy frames, which are used by politicians to bias the public towards their stance on an issue, are also correlated with political ideology. Based on these associations, this work takes a first step towards modeling both the language and how politicians frame issues on Twitter, in order to predict the moral foundations that politicians use to express their stances on issues. The contributions of this work include a dataset annotated for...
Phishing attacks continue to pose a major threat for computer system defenders, often forming the first step in a multi-stage attack. There have been great strides made in phishing detection; however, some phishing emails appear to pass through filters by making simple structural and semantic changes to the messages. We tackle this problem through the use of a machine learning classifier operating on a large corpus of phishing and legitimate emails. We design SAFe-PC (Semi-Automated Feature generation for Phish Classification) to extract features,...
There is a large variation in the background and purpose of massive open online course (MOOC) learners. To improve the overall MOOC learning experience, it is important to identify which MOOC characteristics are most important for learners. For this purpose, in this article, we analyzed about 150 000 open-ended learner responses from 810 MOOCs to three postcourse survey questions about their experience: (Q1) What was your favorite part of the course and why? (Q2) What was your least favorite part of the course and why? (Q3) How could the course be improved? We used a latent Dirichlet allocation topic model to identify prominent...
Easy access, variety of content, and fast widespread interactions are some of the reasons making social media increasingly popular. However, this rise has also enabled the propagation of fake news, text published by news sources with an intent to spread misinformation and sway beliefs. Detecting it is an important and challenging problem to prevent large scale misinformation and maintain a healthy society. We view fake news detection as reasoning over the relations between sources, the articles they publish, and engaging users on social media in a graph framework. After...
Fact-checking political discussions has become an essential cog in computational journalism. This task encompasses an important sub-task: identifying the set of statements with 'check-worthy' claims. Previous work has treated this as a simple text classification problem, discounting the nuances involved in determining what makes statements check-worthy. We introduce a dataset of political debates from the 2016 US Presidential election campaign annotated using all major fact-checking media outlets and show that there is a need to model...
Various domain users are increasingly leveraging real-time social media data to gain rapid situational awareness. However, due to the high noise in the deluge of data, effectively determining semantically relevant information can be difficult, further complicated by the changing definition of relevancy by each end user for different events. The majority of existing methods for short text relevance classification fail to incorporate users' knowledge into the classification process. Existing methods that incorporate interactive user feedback focus on historical...
In this paper, we suggest a minimally supervised approach for identifying nuanced frames in news article coverage of politically divisive topics. We suggest breaking the broad policy frames suggested by Boydstun et al., 2014 into fine-grained subframes which can capture differences in political ideology in a better way. We evaluate the suggested subframes and their embedding, learned using minimal supervision, over three topics, namely, immigration, gun-control, and abortion. We demonstrate the ability of the subframes to capture ideological differences and analyze political discourse in news media.
Automated attack discovery techniques, such as attacker synthesis or model-based fuzzing, provide powerful ways to ensure network protocols operate correctly and securely. Such techniques, in general, require a formal representation of the protocol, often in the form of a finite state machine (FSM). Unfortunately, many protocols are only described in English prose, and implementing even a simple network protocol as an FSM is time-consuming and prone to subtle logical errors. Automatically extracting protocol FSMs from documentation can significantly contribute...
Politicians often use Twitter to express their beliefs, stances on current political issues, and reactions concerning national and international events. Since politicians are scrutinized for what they choose or neglect to say, they craft their statements carefully. Thus, despite the limited length of tweets, their content is highly indicative of a politician's stances. We present a weakly supervised method for understanding the stances held by politicians on a wide array of issues, by analyzing how issues are framed in their tweets and their temporal activity patterns. We...
Framing is a political strategy in which politicians carefully word their statements in order to control public perception of issues. Previous works exploring political framing typically analyze frame usage in longer texts, such as congressional speeches. We present a collection of weakly supervised models which harness collective classification to predict the frames used in political discourse on the microblogging platform, Twitter. Our global probabilistic models show that by combining both lexical features of tweets and network-based behavioral...
Modeling script knowledge can be useful for a wide range of NLP tasks. Current statistical script learning approaches embed the events, such that their relationships are indicated by their similarity in the embedding. While intuitive, these approaches fall short of representing nuanced relations, needed for downstream tasks. In this paper, we suggest viewing learning event embedding as a multi-relational problem, which allows us to capture different aspects of event pairs. We model a rich set of event relations, such as Cause and Contrast, derived from the Penn Discourse Tree Bank....
Building models for realistic natural language tasks requires dealing with long texts and accounting for complicated structural dependencies. Neural-symbolic representations have emerged as a way to combine the reasoning capabilities of symbolic methods with the expressiveness of neural networks. However, most of the existing frameworks for combining neural and symbolic representations have been designed for classic relational learning tasks that work over a universe of symbolic entities and relations. In this paper, we present DRaiL, an open-source declarative framework for specifying...
The Covid-19 pandemic has led to an infodemic of low quality information leading to poor health decisions. Combating the outcomes of this infodemic is not only a question of identifying false claims, but also of reasoning about the decisions individuals make. In this work we propose a holistic analysis framework connecting stance and reason analysis, and fine-grained entity level moral sentiment analysis. We study how to model the dependencies between the different levels of analysis and incorporate human insights into the learning process. Experiments show that our...