Soumya Sarkar

ORCID: 0000-0001-8302-4734
Research Areas
  • Complex Network Analysis Techniques
  • Opinion Dynamics and Social Influence
  • Wikis in Education and Collaboration
  • Topic Modeling
  • Advanced Graph Neural Networks
  • Natural Language Processing Techniques
  • Peer-to-Peer Network Technologies
  • Social Media and Politics
  • Hate Speech and Cyberbullying Detection
  • Bioinformatics and Genomic Networks
  • Cancer-related gene regulation
  • Graph theory and applications
  • Mental Health Research Topics
  • Monoclonal and Polyclonal Antibodies Research
  • Social Capital and Networks
  • Effects of Radiation Exposure
  • Media Influence and Politics
  • Rough Sets and Fuzzy Logic
  • Game Theory and Applications
  • Cancer survivorship and care
  • Cutaneous lymphoproliferative disorders research
  • Populism, Right-Wing Movements
  • Maternal and Perinatal Health Interventions
  • vaccines and immunoinformatics approaches
  • Misinformation and Its Impacts

All India Institute of Medical Sciences
2024

Medanta The Medicity
2023

Microsoft Research (India)
2023

Indian Institute of Technology Kharagpur
2016-2022

Technical University of Darmstadt
2021-2022

RWTH Aachen University
2019

Indian Institute of Technology Patna
2015-2018

Information Technology University
2017

University of Burdwan
2017

With the ongoing debate on 'freedom of speech' vs. 'hate speech,' there is an urgent need to carefully understand the consequences of the inevitable culmination of the two, i.e., hate over time. An ideal scenario for this would be to observe the effects of hate speech in an (almost) unrestricted environment. Hence, we perform the first temporal analysis of Gab.com, a social media site with a very loose moderation policy. We generate snapshots of Gab from millions of posts and users. Using these snapshots, we compute an activity vector based on the DeGroot...

10.1145/3415163 article EN Proceedings of the ACM on Human-Computer Interaction 2020-10-14

With the ongoing debate on 'freedom of speech' vs. 'hate speech,' there is an urgent need to carefully understand the consequences of the inevitable culmination of the two, i.e., hate over time. An ideal scenario for this would be to observe the effects of hate speech in an (almost) unrestricted environment. Hence, we perform the first temporal analysis of Gab.com, a social media site with a very loose moderation policy. We generate snapshots of Gab from millions of posts and users. Using these snapshots, we compute an activity vector based on the DeGroot model...

10.48550/arxiv.1909.10966 preprint EN other-oa arXiv (Cornell University) 2019-01-01

In this paper, we explore how the C4.5 algorithm can be applied to breast cancer datasets in order to extract and formulate rules for identifying risk factors. For this study, we have used the Wisconsin dataset containing 9 attributes related to various cell features and anomalies. We then use that to create a decision tree. From the inferred tree, rules for identifying patients at risk have been derived. With a training-set size of 200 patient records, our system was found to have an accuracy of 96.7%.

10.26483/ijarcs.v8i8.4602 article EN cc-by International Journal of Advanced Research in Computer Science 2017-08-30
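The abstract above names C4.5 but does not show how it chooses splits. As a rough illustration only, the sketch below computes the gain ratio, the split criterion usually associated with C4.5, on a handful of made-up records. The attribute values, labels, and function names are hypothetical and are not taken from the paper or from the Wisconsin dataset, whose attributes are numeric and would additionally need threshold selection.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of a categorical attribute divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records: one informative attribute, one nearly uninformative.
labels = ["benign", "benign", "malignant", "malignant", "malignant", "benign"]
informative = ["low", "low", "high", "high", "high", "low"]
uninformative = ["a", "b", "a", "b", "a", "b"]

best = max(
    {"informative": informative, "uninformative": uninformative}.items(),
    key=lambda item: gain_ratio(item[1], labels),
)[0]
print(best)
```

A tree builder in this style would pick the attribute with the highest gain ratio at each node and recurse on the resulting groups; pruning and continuous-attribute handling are left out of this sketch.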

Wikipedia can easily be justified as a behemoth, considering the sheer volume of content that is added or removed every minute to its several projects. This creates an immense scope, in the field of natural language processing, toward developing automated tools for content moderation and review. In this paper we propose Self Attentive Revision Encoder (StRE), which leverages orthographic similarity of lexical units toward predicting the quality of new edits. In contrast to existing propositions which primarily employ features like page...

10.18653/v1/p19-1387 article EN cc-by 2019-01-01

With the widespread use of knowledge graphs (KG) in various automated AI systems and applications, it is very important to ensure that information retrieval algorithms leveraging them are free from societal biases. Previous works have depicted biases that persist in KGs, as well as employed several metrics for measuring the biases. However, such studies lack a systematic exploration of the sensitivity of the bias measurements, through varying sources of data, or the embedding algorithms used. To address this research gap, in this work, we present a...

10.1145/3578503.3583620 article EN 2023-04-26

Recent advances in the field of network representation learning are mostly attributed to the application of the skip-gram model in the context of graphs. State-of-the-art analogues of the skip-gram model in graphs define a notion of neighbourhood and aim to find the vector representation for a node which maximizes the likelihood of preserving this neighborhood. In this paper, we take a drastic departure from the existing notion of the neighbourhood of a node by utilizing the idea of coreness. More specifically, we utilize the well-established idea that nodes with similar core numbers play equivalent roles in the network and hence induce a novel and an...

10.1109/asonam.2018.8508693 article EN 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) 2018-08-01
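The abstract above relies on the core number of a node without defining it. As a hedged illustration, the sketch below computes core numbers by the standard peeling procedure (repeatedly removing a node of minimum remaining degree). The function name and the toy graph are hypothetical, and this is not the paper's embedding method, only the graph quantity it builds on.

```python
def core_numbers(adj):
    """Return the core number of every node of an undirected graph.

    adj maps each node to the set of its neighbours. A simple quadratic
    peeling loop is used here for clarity rather than speed.
    """
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a node with the smallest remaining degree.
        u = min(remaining, key=degree.get)
        # Core numbers never decrease along the peeling order.
        k = max(k, degree[u])
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1
    return core


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
toy_cores = core_numbers(toy_graph)
print(toy_cores)
```

In the toy graph the triangle nodes share core number 2 while the pendant node has core number 1, which is the kind of grouping by similar core numbers that the abstract refers to.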

Millions of people, irrespective of socioeconomic and demographic backgrounds, depend on Wikipedia articles every day for keeping themselves informed regarding popular as well as obscure topics. Articles have been categorized by editors into several quality classes, which indicate their reliability as encyclopedic content. This manual designation is an onerous task because it necessitates profound knowledge about encyclopedic language, as well as navigating a circuitous set of wiki guidelines. In this paper we propose Neural...

10.18653/v1/2020.emnlp-main.674 article EN cc-by 2020-01-01

Networks created from real-world data contain some inaccuracies or noise, manifested as small changes in the network structure. An important question is whether these small changes can significantly affect the analysis results.

10.1145/2983323.2983692 article EN 2016-10-24

Many scale-free networks exhibit a "rich club" structure, where high degree vertices form tightly interconnected subgraphs. In this paper, we explore the emergence of "rich clubs" in the context of shortest path based centrality metrics. We term these subgraphs of connected high closeness or high betweenness vertices as rich centrality clubs (RCC). Our experiments on real world and synthetic networks highlight the inter-relations between RCCs, expander graphs, and the core-periphery structure of the network. We show empirically and theoretically that RCCs exist, if...

10.1145/3269206.3271763 article EN 2018-10-17

In this paper, we introduce social yield, a measure of collaboration success of the collaborating authors in a coauthorship network. We then attempt to empirically observe the link dynamics in coauthorship networks induced by the social yield of collaborations. Observations indicate that certain observed behavior, like the presence of a large number of small sized communities and highly dynamic links, can be explained based on the distribution of these yields. It is also observed that the distribution of yield among collaborations affects the resilience of the network to targeted removal.

10.1145/2808797.2808835 article EN 2015-08-25

In this paper we evaluate the effect of noise on community scoring and centrality-based parameters with respect to two different aspects of network analysis: (i) sensitivity, that is, how the parameter value changes as edges are removed, and (ii) reliability in the context of message spreading, that is, how the time taken to broadcast a message changes as edges are removed. Our experiments on synthetic and real-world networks and three different noise models demonstrate that for both the aspects over all networks and all noise models, permanence qualifies as the most effective metric. For the sensitivity experiments closeness centrality is a close...

10.1109/asonam.2016.7752215 article EN 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) 2016-08-01

Wikipedia has been turned into an immensely popular crowd-sourced encyclopedia for information dissemination on numerous versatile topics in the form of subscription-free content. It allows anyone to contribute so that the articles remain comprehensive and updated. For enrichment of content without compromising standards, the Wikipedia community enumerates a detailed set of guidelines, which should be followed. Based on these, articles are categorized into several quality classes by the Wikipedia editors with increasing adherence to guidelines....

10.1145/3512959 article EN Proceedings of the ACM on Human-Computer Interaction 2022-03-30

10.1007/s13278-021-00749-9 article EN Social Network Analysis and Mining 2022-02-07

In this paper we evaluate the effect of noise on community scoring and centrality-based parameters with respect to two different aspects of network analysis: (i) sensitivity, that is, how the parameter value changes as edges are removed, and (ii) reliability in the context of message spreading, that is, how the time taken to broadcast a message changes as edges are removed. Our experiments on synthetic and real-world networks and three different noise models demonstrate that for both the aspects over all networks and all noise models, permanence qualifies as the most effective metric. For the sensitivity experiments closeness centrality is a close...

10.5555/3192424.3192437 article EN 2016-08-18

The evolution of Artificial Intelligence (AI)-based systems and applications has pervaded everyday life to make decisions that have a momentous impact on individuals and society. With the staggering growth of online data, often termed the online infosphere, it has become paramount to monitor the infosphere to ensure social good, as the AI-based decisions are severely dependent on it. This survey aims to provide a comprehensive review of some of the most important research areas related to the infosphere, focusing on the technical challenges and potential solutions. The survey also...

10.1002/widm.1453 article EN Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery 2022-02-28