- Topic Modeling
- Natural Language Processing Techniques
- Advanced Text Analysis Techniques
- Sentiment Analysis and Opinion Mining
- Web Data Mining and Analysis
- Text and Document Classification Technologies
- Hate Speech and Cyberbullying Detection
- Algorithms and Data Compression
- Misinformation and Its Impacts
- Recommender Systems and Techniques
- Biomedical Text Mining and Ontologies
- Semantic Web and Ontologies
- Software Engineering Research
- Complex Network Analysis Techniques
- Wikis in Education and Collaboration
- Spam and Phishing Detection
- Cloud Computing and Resource Management
- Information Retrieval and Search Behavior
- Image Retrieval and Classification Techniques
- Multimodal Machine Learning Applications
- Text Readability and Simplification
- Software Engineering Techniques and Practices
- Authorship Attribution and Profiling
- Data Quality and Management
- Computational Drug Discovery Methods
Indian Institute of Technology Hyderabad
2015-2024
International Institute of Information Technology, Hyderabad
2014-2023
International Institute of Information Technology
2007-2023
The Sense Innovation and Research Center
2023
Vellore Institute of Technology University
2023
Adobe Systems (United States)
2022
Tata Consultancy Services (India)
2018
International Institute of Islamic Thought
2016
Universitat Politècnica de València
2014
University of Hyderabad
2009-2011
Hate speech detection on Twitter is critical for applications like controversial event extraction, building AI chatterbots, content recommendation, and sentiment analysis. We define this task as being able to classify a tweet as racist, sexist or neither. The complexity of the natural language constructs makes this task very challenging. We perform extensive experiments with multiple deep learning architectures to learn semantic word embeddings to handle this complexity. Our experiments on a benchmark dataset of 16K annotated tweets show...
In recent times, fake news and misinformation have had a disruptive and adverse impact on our lives. Given the prominence of microblogging networks as a source of news for most individuals, fake news now spreads at a faster pace and has a more profound impact than ever before. This makes fake news detection an extremely important challenge. Fake news articles, just like genuine news articles, leverage multimedia content to manipulate user opinions but spread misinformation. A shortcoming of the current approaches is their inability to learn a shared representation...
With the ever-increasing cases of hate spread on social media platforms, it is critical to design abuse detection mechanisms to proactively avoid and control such incidents. While there exist methods for hate speech detection, they stereotype words and hence suffer from inherently biased training. Bias removal has been traditionally studied for structured datasets, but we aim at bias mitigation from unstructured text data. In this paper, we make two important contributions. First, we systematically design methods to quantify the bias for any model...
This paper describes our system (Fermi) for Task 5 of SemEval-2019: HatEval: Multilingual Detection of Hate Speech Against Immigrants and Women on Twitter. We participated in subtask A for English and ranked first in the evaluation on the test set. We evaluate the quality of multiple sentence embeddings and explore multiple training models to evaluate the performance of simple yet effective embedding-ML combination algorithms. Our team Fermi's model achieved an accuracy of 65.00% for the English language in task A. Our models, which use pretrained Universal Encoder...
We present, discuss and evaluate a hybrid approach of live migrating a virtual machine across hosts in a Gigabit LAN. Our approach takes the best of both the traditional methods of live migration: pre-copy and post-copy. In pre-copy, the CPU state and memory are transferred before spawning the VM on the destination host, whereas the latter is exactly the opposite and spawns the VM on the destination right after transferring the processor state. In our approach, in addition to the processor state, we bundle a lot of useful state information. This includes devices and frequently accessed pages of the VM, aka the working set....
Sentiment analysis (SA) using code-mixed data from social media has several applications in opinion mining, ranging from customer satisfaction to social campaign analysis in multilingual societies. Advances in this area are impeded by the lack of a suitable annotated dataset. We introduce a Hindi-English (Hi-En) code-mixed dataset for sentiment analysis and perform an empirical analysis comparing the suitability and performance of various state-of-the-art SA methods in social media. In this paper, we introduce learning sub-word level representations in the LSTM (Subword-LSTM) architecture...
The scientific article recommendation problem deals with recommending similar scientific articles given a query article. It can be categorized as a content-based similarity system. Recent advancements in representation learning methods have proven to be effective in modeling distributed representations in different modalities like images, languages, speech, networks etc. The distributed representations obtained using such techniques can in turn be used to calculate similarities. In this paper, we address the problem of scientific paper recommendation through a novel method which...
Pulkit Parikh, Harika Abburi, Pinkesh Badjatiya, Radhika Krishnan, Niyati Chhaya, Manish Gupta, Vasudeva Varma. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
Social media is a useful platform to share health-related information due to its vast reach. This makes it a good candidate for public-health monitoring tasks, specifically for pharmacovigilance. We study the problem of extraction of Adverse-Drug-Reaction (ADR) mentions from social media, particularly from Twitter. Medical information extraction from social media is challenging, mainly due to the short and highly informal nature of the text, as compared to more technical and formal medical reports. Current methods in ADR mention extraction rely on supervised learning methods, which...
Ads on the search engine (SE) are generally ranked based on their click-through rates (CTR). Hence, accurately predicting the CTR of an ad is of paramount importance for maximizing the SE's revenue. We present a model that inherits the click information of rare/new ads from other semantically related ads. The semantic features are derived from the query click-through graphs and advertisers' account information. We show that the model learned using these features gives a very good prediction of the CTR values.
The MapReduce framework has received wide acclaim over the past few years for large scale computing. It has become a standard paradigm for batch oriented workloads. As the adoption of this paradigm has increased rapidly, scheduling of these jobs has become a problem of great interest in the research community. We propose an approach which tries to maintain harmony among the jobs running on the cluster, and in turn decrease their runtime. In our model, the scheduler is made aware of the different types of jobs running on the cluster. The scheduler tries to allocate a task on a node if the incoming task does not affect the tasks...
Priya Radhakrishnan, Partha Talukdar, Vasudeva Varma. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.
An effective news recommendation system should harness the historical information of the user based on her interactions as well as the content of the articles. In this paper we propose a novel deep learning model for news recommendation which utilizes the content of the news articles as well as the sequence in which the articles were read by the user. To model both of these kinds of information, which are essentially of different types, we propose a simple yet effective architecture which utilizes a 3-dimensional Convolutional Neural Network that takes the word embeddings of the articles present in the user history as its input. Using such a method endows the model with the capability to automatically learn...
Online media outlets, in a bid to expand their reach and subsequently increase revenue through ad monetisation, have begun adopting clickbait techniques to lure readers to click on articles. The article fails to fulfill the promise made by the headline. Traditional methods for clickbait detection have relied heavily on feature engineering which, in turn, is dependent on the dataset it is built for. The application of neural networks for this task has only been explored partially. We propose a novel approach considering all information found...
Given the recent progress in language modeling using Transformer-based neural models and an active interest in generating stylized text, we present an approach to leverage the generalization capabilities of a language model to rewrite an input text in a target author's style. Our proposed approach adapts a pre-trained language model to generate author-stylized text by fine-tuning on the author-specific corpus using a denoising autoencoder (DAE) loss in a cascaded encoder-decoder framework. Optimizing over the DAE loss allows our model to learn the nuances of an author's style without relying on parallel data,...
Sexism, an injustice that subjects women and girls to enormous suffering, manifests in blatant as well as subtle ways. In the wake of growing documentation of experiences of sexism on the web, the automatic categorization of accounts of sexism has the potential to assist social scientists and policymakers in studying and thereby countering sexism. The existing work on sexism classification has certain limitations in terms of the categories of sexism used and/or whether they can co-occur. To the best of our knowledge, this is the first work on the multi-label classification of sexism of any kind(s). We also consider...