- Topic Modeling
- Natural Language Processing Techniques
- Imbalanced Data Classification Techniques
- Domain Adaptation and Few-Shot Learning
- Anomaly Detection Techniques and Applications
- Advanced Text Analysis Techniques
- Data Stream Mining Techniques
- Network Security and Intrusion Detection
- Web Data Mining and Analysis
- Recommender Systems and Techniques
- Machine Learning and Data Classification
- Sentiment Analysis and Opinion Mining
- Multimodal Machine Learning Applications
- Machine Learning and ELM
- Text and Document Classification Technologies
- COVID-19 Digital Contact Tracing
- Data-Driven Disease Surveillance
- Face and Expression Recognition
- Speech Recognition and Synthesis
- Human Mobility and Location-Based Analysis
- Electricity Theft Detection Techniques
- Financial Distress and Bankruptcy Prediction
- Face Recognition and Analysis
- Speech and Dialogue Systems
- Network Traffic and Congestion Control
Indian Institute of Technology Indore
2021-2025
Indian Institute of Technology Roorkee
2014-2018
IBM Research - India
2018
Jawaharlal Nehru University
2012
Anomaly detection is an important task in many real-world applications such as fraud detection, suspicious activity detection, health care monitoring, etc. In this paper, we tackle this problem from a supervised learning perspective in an online learning setting. We maximize the well-known Gmean metric for class-imbalance learning in an online learning framework. Specifically, we show that maximizing Gmean is equivalent to minimizing a convex surrogate loss function, and based on that we propose a novel online learning algorithm for anomaly detection. We then show, by extensive experiments, the...
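As a rough illustration of the metric named in this abstract, the sketch below computes Gmean under its usual definition, the geometric mean of sensitivity and specificity. The function name `gmean` and the label convention (1 for anomalies, 0 for normal points) are assumptions made for the example; this is not the paper's algorithm, only the evaluation metric.

```python
import math


def gmean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (hypothetical helper)."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against classes that are absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# A classifier that never flags an anomaly is 80% accurate on this
# imbalanced toy data, yet its Gmean is zero.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_normal = [0] * len(labels)
print(gmean(labels, always_normal))
```

Because Gmean collapses to zero whenever one class is entirely misclassified, it penalizes the trivial majority-class predictor that plain accuracy rewards on imbalanced data.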
Anomaly detection has drawn a slew of attention in recent years, although the term has been known as outlier detection in statistics for several decades. Every day a large volume of data is being generated, for example, flight navigation data, health care monitoring data, social media data, video surveillance data, etc. This data contains rare events or anomalous points that need to be found; for example, less than 2% of all visitors who visit the Amazon website make a purchase. Thus the anomaly detection problem can be interesting from a business perspective,...
Catastrophic forgetting is a prominent challenge in machine learning. It denotes the phenomenon wherein models undergo significant loss of previously acquired knowledge upon learning new information. Supervised Continual Learning (SCL) has emerged as a promising approach to mitigate this issue by enabling models to adapt to non-stationary data distributions while leveraging labeled data. However, practical limitations arise for SCL in real-world settings, where labeled data is scarce. Conversely, Unsupervised Continual Learning (UCL) aims...
In the present work, a study on class imbalance problems in a distributed setting exploiting the sparsity structure in the data has been carried out. We formulate the class-imbalance learning problem as a cost-sensitive learning problem with $L_1$ regularization. The loss function is a cost-weighted smooth hinge loss. The resultant optimization problem is minimized within the Distributed Alternating Direction Method of Multiplier (DADMM) framework. We partition...
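The objective described in this abstract can be sketched on a single machine as follows. The abstract does not give the exact formula, so the piecewise smooth hinge used here (one common variant), the per-class costs `cost_pos` and `cost_neg`, the averaging over examples, and the regularization weight `lam` are all assumptions for illustration. The sketch only evaluates the objective; it does not implement the distributed DADMM solver.

```python
def smooth_hinge(z):
    """One common smooth hinge variant (assumed, not taken from the paper)."""
    if z >= 1.0:
        return 0.0
    if z <= 0.0:
        return 0.5 - z
    return 0.5 * (1.0 - z) ** 2


def objective(w, X, y, cost_pos, cost_neg, lam):
    """Cost-weighted smooth hinge loss plus an L1 penalty on the weights.

    Labels in y are assumed to be +1 (minority class) or -1 (majority class).
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * sum(w_j * x_j for w_j, x_j in zip(w, x_i))
        cost = cost_pos if y_i == 1 else cost_neg
        total += cost * smooth_hinge(margin)
    l1_penalty = lam * sum(abs(w_j) for w_j in w)
    return total / len(X) + l1_penalty


# Toy usage: errors on the positive class cost twice as much.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1, -1]
print(objective([0.0, 0.0], X, y, cost_pos=2.0, cost_neg=1.0, lam=0.1))
```

Raising `cost_pos` relative to `cost_neg` makes mistakes on the rare class more expensive, while the L1 term pushes many weights to exactly zero, which suits sparse data.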
An anomaly is defined as a state of the system that does not conform to normal behavior. For example, the emission of neutrons in a nuclear reactor channel above a specified threshold is an anomaly. Big data refers to a data set that is high volume, streaming, heterogeneous, distributed, and often sparse. Big data is not uncommon these days. As per Internet live stats, the number of tweets posted per day has gone above 500 million. Due to the data explosion in data-laden domains, traditional anomaly detection techniques developed for small data sets scale poorly on...
The ubiquitous nature of the Internet and its applicability as an inexpensive advertising medium have made Web and mobile platforms a target for product advertising. Existing approaches to advertising on different social platforms include sponsored search, e-mail advertising, banner ads, etc., to attract customers. The current systems on these platforms require advertisers to manually come up with a catchy tagline, which is a time-consuming and challenging task. To overcome this problem, we propose a novel framework to automatically generate...
The procure to pay process (P2P) in large enterprises is a back-end business process which deals with the procurement of products and services for enterprise operations. Procurement is done by issuing purchase orders to impaneled vendors, and invoices submitted by vendors are paid after they go through a rigorous validation process. Agents orchestrating P2P often encounter the problem of matching product or service descriptions in the invoice to those in the purchase order to verify if the ordered items are what have been supplied or serviced. For example, the description...
Speech-to-Text (ST) is the translation of speech in one language to text in another language. Earlier models for ST used a pipeline approach combining automatic speech recognition (ASR) and machine translation (MT). Such models suffer from cascade error propagation, high latency, and memory consumption. Therefore, End-to-End (E2E) ST models were proposed. Adapting E2E ST models to new language pairs results in deterioration of performance on previously trained language pairs. This phenomenon is called Catastrophic Forgetting (CF). Therefore, we need ST models that can learn continually. The...
Speech-to-Speech Translation (S2ST) models transform speech from one language to speech in another target language with the same linguistic information. S2ST is important for bridging the communication gap among communities and has diverse applications. In recent years, researchers have introduced direct S2ST models, which have the potential to translate speech without relying on intermediate text generation, have better decoding latency, and have the ability to preserve paralinguistic and non-linguistic features. However, direct S2ST has yet to achieve quality performance...
Automatic evaluation of essays (AES), also called automatic essay scoring, has become a severe problem due to the rise of online learning platforms such as Coursera, Udemy, Khan Academy, and so on. Researchers have recently proposed many techniques for automatic evaluation. However, these techniques use hand-crafted features and thus are limited from the feature representation point of view. Deep learning has emerged as a new paradigm in machine learning which can exploit vast data and identify useful features. To this end, we propose a novel architecture based on recurrent...
Conventional statistical analysis of Internet traffic data is often employed to determine the traffic distribution, summarize users' behavior patterns, or predict future network traffic for network management and planning. However, conventional techniques like autoregressive integrated moving average (ARIMA) models fail to capture some peculiar characteristics of Internet traffic such as self-similarity, long-range dependency (LRD), etc. With the rapid growth of Internet traffic, accurate and reliable traffic forecasting is essential for resource management. Therefore, the present work explores...
The presence of sarcasm in conversational systems and social media like chatbots, Facebook, Twitter, etc. poses several challenges for downstream NLP tasks. This is attributed to the fact that the intended meaning of a sarcastic text is contrary to what is expressed. Further, the use of code-mixed language to express sarcasm is increasing day by day. Current NLP techniques for code-mixed data have limited success due to the different lexicon, syntax, and scarcity of labeled corpora. To solve the joint problem of code-mixing and sarcasm detection, we propose the idea of capturing...
Speech-to-text translation pertains to the task of converting speech signals in a language to text in another language. It finds its application in various domains, such as hands-free communication, dictation, video lecture transcription, and translation, to name a few. Automatic Speech Recognition (ASR), as well as Machine Translation (MT) models, play crucial roles in traditional ST translation, enabling the conversion of spoken language in its original form to written text and facilitating seamless cross-lingual communication. ASR recognizes spoken words, while MT...