- Complex Network Analysis Techniques
- Privacy-Preserving Technologies in Data
- Topic Modeling
- Social Media and Politics
- Opinion Dynamics and Social Influence
- Misinformation and Its Impacts
- Spam and Phishing Detection
- Privacy, Security, and Data Protection
- Data Visualization and Analytics
- Natural Language Processing Techniques
- Data-Driven Disease Surveillance
- Hate Speech and Cyberbullying Detection
- Internet Traffic Analysis and Secure E-voting
- Human Mobility and Location-Based Analysis
- Data Quality and Management
- Data Mining Algorithms and Applications
- Advanced Text Analysis Techniques
- Web Data Mining and Analysis
- Data Management and Algorithms
- Network Security and Intrusion Detection
- Semantic Web and Ontologies
- Terrorism, Counterterrorism, and Political Violence
- Cryptography and Data Security
- Advanced Graph Neural Networks
- Bioinformatics and Genomic Networks
Georgetown University
2016-2025
University of Saint Mary
2022
University of Calgary
2019
United States Census Bureau
2019
Peterson Institute for International Economics
2019
International Paper (United States)
2019
Los Alamitos Medical Center
2019
National Bureau of Economic Research
2019
Laboratoire d'Informatique de Paris-Nord
2016
University of Maryland, College Park
2009
Since December 2019, COVID-19 has been spreading rapidly across the world. Not surprisingly, conversation about COVID-19 is also increasing. This article is a first look at the amount of conversation taking place on social media, specifically Twitter, with respect to COVID-19, the themes of discussion, where the discussion is emerging from, myths shared about the virus, and how much of it is connected to other high and low quality information on the Internet through URL links. Our preliminary findings suggest that a meaningful spatio-temporal relationship exists...
Animal tool use is of inherent interest given its relationship to intelligence, innovation and cultural behaviour. Here we investigate whether Shark Bay bottlenose dolphins that use marine sponges as hunting tools (spongers) are culturally distinct from other dolphins in the population, based on the criteria that sponging is both socially learned and distinguishes between groups. We use social network analysis to determine social preferences among 36 spongers and 69 non-spongers sampled over a 22-year period while controlling for...
Detecting stance on Twitter is especially challenging because of the short length of each tweet, the continuous coinage of new terminology and hashtags, and the deviation of sentence structure from standard prose. Fine-tuned language models using large-scale in-domain data have been shown to be state-of-the-art for many NLP tasks, including stance detection. In this paper, we propose a novel BERT-based fine-tuning method that enhances the masked language model. Instead of random token masking, we use a weighted log-odds-ratio to identify words...
Community structure is ubiquitous in biological networks. There has been an increased interest in unraveling the community structure of biological systems, as it may provide important insights into a system's functional components and the impact of local structures on dynamics at a global scale. Choosing an appropriate community detection algorithm to identify the community structure in an empirical network can be difficult, however, as the many algorithms available are based on a variety of cost functions and are difficult to validate. Even when community structure is identified in an empirical system, disentangling the effect of community structure from...
This article investigates the prevalence of high and low quality URLs shared on Twitter when users discuss COVID-19. We distinguish between health sources, traditional news sources, and misinformation sources. We find that misinformation, in terms of tweets containing URLs from misinformation websites, is shared at a higher rate than tweets containing URLs from health information websites. However, both are a relatively small proportion of the overall conversation. In contrast, news sources are shared at a much higher rate. These findings lead us to analyze the network created by the URLs referenced on the webpages shared by users. When...
Background: Twitter is becoming an important tool in medicine, but there is little information on Twitter metrics. In order to recommend best practices for information dissemination and diffusion, it is important to first study and analyze the networks.
Concerns about the trustworthiness, fairness, and privacy of AI systems are growing, and strategies for mitigating these concerns are still in their infancy. One approach to improve trustworthiness and fairness is to use bias mitigation algorithms. However, most bias mitigation algorithms require data sets that contain sensitive attribute values to assess the fairness of the algorithm. A growing number of real world data sets do not make sensitive attribute information readily available to researchers. One solution is to infer the missing sensitive attribute information and apply an existing bias mitigation algorithm using this inferred...
In this paper, we present a case study describing the privacy and trust that exist within a small population of online social network users. We begin by formally characterizing different graphs in social network sites like Facebook. We then determine how often people are willing to divulge personal details to an unknown online user, an adversary. While most users in our sample did not share sensitive information when asked by an adversary, we found that more were willing to share with an adversary if there is a mutual friend connected to the adversary and the user. We summarize the results...
Algorithmic decision making is becoming more prevalent, increasingly impacting people’s daily lives. Recently, discussions have been emerging about the fairness of decisions made by machines. Researchers have proposed different approaches for improving the fairness of these algorithms. While these approaches can help machines make fairer decisions, they have been developed and validated on fairly clean data sets. Unfortunately, most real-world data have complexities that make them more dirty. This work considers two of these complexities, analyzing the impact of these issues...
Social networks continue to become more and more feature rich. Using local and global structural properties and descriptive attributes is necessary for more sophisticated social network analysis and to support visual mining tasks. While a number of visualization tools for social network applications have been developed, most of them are limited to uni-modal graph representations. Some of the tools support a wide range of visualization options, including interactive views. Others are better at calculating properties such as density or deploying traditional statistical analysis. We present Invenio,...
While re-identification of sensitive data has been studied extensively, with the emergence of online social networks and the popularity of digital communications, the ability to use public data for re-identification has increased. This work begins by presenting two different case studies of re-identification. We conclude that targeted re-identification using traditional variables is not only possible, but fairly straightforward given the large amount of data available. However, our first case study also indicates that large-scale re-identification is less likely. We then consider methods...
Query logs are valuable resources for Information Retrieval (IR) research. However, because they are also rich in private and personal information, the huge concern of leaking user privacy prevents query logs from being shared by search companies with the broad research community. Bothered by the lack of good research data for years, the authors of this paper are motivated to explore ways to generate anonymized query logs that can still be effectively used to support the IR task. We introduce a framework to anonymize query logs using differential privacy, the latest development in privacy. The framework is...
C-Group is a tool for analyzing dynamic group membership in temporal social networks over time. Unlike most network visualization tools, which show the group structure within an entire network, or the group membership of a single actor, C-Group allows users to focus their analysis on a pair of individuals. While C-Group allows viewing the addition and deletion of nodes (actors) and edges (relationships) over time, its major contribution is its focus on changing group memberships over time. By doing so, users can investigate the context of group membership for the pair. C-Group provides users with a flexible interface for defining (and redefining)...
Worldwide displacement due to war and conflict is at an all-time high. Unfortunately, determining if, when, and where people will move is a complex problem. This paper proposes integrating both publicly available organic data from social media and newspapers with more traditional indicators of forced migration to determine when and where people will move. We combine movement variables with spatial and temporal variation within different Bayesian models and show the viability of our method using a case study involving displacement in Iraq. Our analysis shows...
Identifying extremist-associated conversations on Twitter is an open problem. Extremist groups have been leveraging Twitter (1) to spread their message and (2) to gain recruits. In this paper, we investigate the problem of determining whether a particular Twitter user engages in extremist conversation. We explore different Twitter metrics as proxies for misbehavior, including the sentiment of the user's published tweets, the polarity of the user's ego-network, and user mentions. We compare different known classifiers using these different features on manually annotated tweets...
Generating association rules from semi-structured documents using an extended concept hierarchy. Authors: Lisa Singh (Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL), Peter Scheuermann, Bin Chen. CIKM '97: Proceedings of the sixth international conference on Information and knowledge management, January 1997, pages 193–200. https://doi.org/10.1145/266714.266895. Published: 1 January 1997...
While privacy preservation of data mining approaches has been an important topic for a number of years, privacy preservation of social network data is a relatively new area of interest. Previous research has shown that anonymization alone may not be sufficient for hiding identity information on certain real world data sets. In this paper, we focus on understanding the impact of network topology and node substructure on the level of anonymity present in the network. We present a measure, topological anonymity, that quantifies the amount of privacy preserved in different structures. The measure uses...
Generative AI models continue to become more powerful. The launch of ChatGPT in November 2022 has ushered in a new era of AI. ChatGPT and other similar chatbots have a range of capabilities, from answering student homework questions to creating music and art. There are already concerns that humans may be replaced by chatbots for a variety of jobs. Because of the wide spectrum of data chatbots are built on, we know that they will have human errors and human biases built into them. These biases may cause significant harm and/or inequity toward different subpopulations. To understand...