Gopinath Chennupati

ORCID: 0000-0002-6223-8570
Research Areas
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Evolutionary Algorithms and Applications
  • Adversarial Robustness in Machine Learning
  • Machine Learning and Data Classification
  • Metaheuristic Optimization Algorithms Research
  • Cloud Computing and Resource Management
  • Protein Structure and Dynamics
  • Speech Recognition and Synthesis
  • Distributed and Parallel Computing Systems
  • Topic Modeling
  • Anomaly Detection Techniques and Applications
  • Machine Learning in Bioinformatics
  • Computational Drug Discovery Methods
  • Tensor decomposition and applications
  • Algorithms and Data Compression
  • Interconnection Networks and Systems
  • Embedded Systems Design Techniques
  • Domain Adaptation and Few-Shot Learning
  • Music and Audio Processing
  • Low-power high-performance VLSI design
  • Computability, Logic, AI Algorithms
  • Artificial Intelligence in Healthcare and Education
  • Explainable Artificial Intelligence (XAI)
  • Natural Language Processing Techniques

Amazon (United States)
2021-2024

Los Alamos National Laboratory
2017-2022

Amazon (Germany)
2021

University of Limerick
2014-2016

As quantum computers become available to the general public, the need has arisen to train a cohort of quantum programmers, many of whom have been developing classical computer programs for most of their careers. While currently available quantum computers have fewer than 100 qubits, quantum computing hardware is widely expected to grow in terms of qubit count, quality, and connectivity. This review aims to explain the principles of quantum programming, which are quite different from classical programming, with straightforward algebra that makes understanding of the underlying fascinating quantum mechanical...

10.1145/3517340 article EN ACM Transactions on Quantum Computing 2022-03-28

We introduce a novel method to combat label noise when training deep neural networks for classification. We propose a loss function that permits abstention during training, thereby allowing the DNN to abstain on confusing samples while continuing to learn and improve classification performance on the non-abstained samples. We show how such a deep abstaining classifier (DAC) can be used for robust learning in the presence of different types of label noise. In the case of structured or systematic label noise -- where noisy training labels or confusing examples are correlated with...

10.48550/arxiv.1905.10964 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Topic modeling, or identifying the set of topics that occur in a collection of articles, is one of the primary objectives of text mining. Typically, a text corpus is represented as a words-by-documents matrix, X, where x_ij encodes the importance score of the i-th word in the j-th document using the Term Frequency-Inverse Document Frequency (TF-IDF) representation. Non-negative Matrix Factorization (NMF) can then be used to extract the topics and model the corpus. NMF approximates X as the product of two low-rank non-negative factors: W, which represents...

10.1109/access.2021.3106879 article EN cc-by IEEE Access 2021-01-01
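
To illustrate the factorization X ≈ WH described in the abstract above, here is a minimal sketch. It is not the authors' implementation: it assumes the standard Lee-Seung multiplicative updates for the Frobenius-norm objective, and the toy matrix X, the function names, and the parameter choices are all hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(x, w, h):
    """Frobenius norm of X - WH (the reconstruction error)."""
    wh = matmul(w, h)
    return sum((x[i][j] - wh[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0]))) ** 0.5


def nmf(x, k, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (words x documents) into
    W (words x k topics) and H (k topics x documents)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive random start keeps the updates well defined.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


# Hypothetical TF-IDF-like matrix: 4 words x 4 documents, two rough topics.
X = [[1.0, 0.9, 0.0, 0.0],
     [0.8, 1.0, 0.0, 0.1],
     [0.0, 0.0, 1.0, 0.7],
     [0.1, 0.0, 0.9, 1.0]]

if __name__ == "__main__":
    W, H = nmf(X, k=2)
    print("reconstruction error:", frobenius_error(X, W, H))
```

Each column of W can be read as a topic (a weighting over words) and each column of H as a document's mixture over topics. The number of topics k is fixed by hand in this sketch.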

Performance modeling is a challenging problem due to the complexities of hardware architectures. In this paper, we present PPT-GPU, a scalable and accurate simulation framework that enables GPU code developers and architects to predict the performance of applications in a fast and accurate manner on different GPU architectures. PPT-GPU is part of the open source project, Performance Prediction Toolkit (PPT), developed at Los Alamos National Laboratory. We extend the old GPU model in PPT that predicts the runtimes of computational physics codes to offer better prediction accuracy, for which we add...

10.1109/lca.2019.2904497 article EN publisher-specific-oa IEEE Computer Architecture Letters 2019-01-01

GPUs are prevalent in modern computing systems at all scales. They consume a significant fraction of the energy in these systems. However, vendors do not publish the actual cost of the power/energy overhead of their internal microarchitecture. In this paper, we accurately measure the energy consumption of various PTX instructions found in modern NVIDIA GPUs. We provide an exhaustive comparison of more than 40 instructions for four high-end NVIDIA GPUs from four different generations (Maxwell, Pascal, Volta, and Turing). Furthermore, we show the effect of the CUDA compiler...

10.1145/3387902.3392613 preprint EN 2020-05-11

We propose SiCaGCN, a learning system to predict the similarity of a given software code to a set of codes that are permitted to run on a computational resource, such as a supercomputer or a cloud server. This code characterization allows us to detect abusive codes. Our system relies on a structural analysis of the control-flow graph of the software codes and two different graph similarity measures: Graph Edit Distance (GED) and a singular values based metric. SiCaGCN combines elements of Graph Convolutional Neural Networks (GCN), Capsule networks, attention mechanism, and neural tensor...

10.1109/access.2020.3011909 article EN cc-by IEEE Access 2020-01-01

The last decade has seen a shift in the computer systems industry where heterogeneous computing has become prevalent. Graphics Processing Units (GPUs) are now present in everything from supercomputers to mobile phones and tablets. GPUs are used for graphics operations as well as general-purpose computing (GPGPUs) to boost the performance of compute-intensive applications. However, the percentage of undisclosed characteristics beyond what vendors provide is not small. In this paper, we introduce a very low overhead and portable analysis for exposing...

10.1109/hpec.2019.8916466 article EN 2019-09-01

In this paper, we introduce an accurate and scalable memory modeling framework for General Purpose Graphics Processor units (GPGPUs): PPT-GPU-Mem, the Performance Prediction Tool-Kit for GPU Cache Memories. PPT-GPU-Mem predicts the performance of different GPUs' cache memory hierarchy (L1 & L2) based on reuse profiles. We extract a memory trace for each GPU kernel once in its lifetime using the recently released binary instrumentation tool, NVBIT. The memory trace extraction is architecture-independent and can be done on any available...

10.1145/3392717.3392761 article EN 2020-06-29

In this paper, we present PPT-GPU, a scalable performance prediction toolkit for GPUs. PPT-GPU achieves scalability through a hybrid high-level modeling approach where some computations are extrapolated and multiple parts of the model are parallelized. The tool's primary prediction models use pre-collected memory and instruction traces of the workloads to accurately capture the dynamic behavior of the kernels.

10.1145/3458817.3476221 article EN 2021-10-21

Parallel application performance models provide valuable insight about the performance in real systems. Capable tools providing fast, accurate, and comprehensive prediction and evaluation of high-performance computing (HPC) applications and system architectures have important value. This paper presents PyPassT, an analysis based modeling framework built on static program analysis and integrated simulation of target HPC architectures. More specifically, the framework analyzes application source code written in C with OpenACC directives and transforms it into...

10.1145/3200921.3200937 article EN 2018-05-14

Although Evolutionary Computation (EC) has been used with considerable success to evolve computer programs, the majority of this work has targeted the production of serial code. Recent work with Grammatical Evolution (GE) produced Multi-core Grammatical Evolution (MCGE-II), a system that natively produces parallel code, including the ability to execute recursive calls in parallel.

10.1145/2739480.2754746 article EN 2015-07-07

Refraining from confidently predicting when faced with categories of inputs different from those seen during training is an important requirement for the safe deployment of deep learning systems. While simple to state, this has been a particularly challenging problem in deep learning, where models often end up making overconfident predictions in such situations. In this work, we present a simple, but highly effective approach to deal with out-of-distribution detection that uses the principle of abstention: when encountering a sample...

10.1109/icmla52953.2021.00050 article EN 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA) 2021-12-01

Distinguishing malicious anomalous activities from unusual but benign activities is a fundamental challenge for cyber defenders. Prior studies have shown that statistical user behavior analysis yields accurate detections by learning behavior profiles from observed user activity. These unsupervised models are able to generalize to unseen types of attacks by detecting deviations from normal behavior without knowledge of specific attack signatures. However, approaches proposed to date based on probabilistic matrix factorization are limited by the...

10.1145/3519602 article EN Digital Threats Research and Practice 2022-04-12

We present the Analytical Memory Model with Pipelines (AMMP) of the Performance Prediction Toolkit (PPT). PPT-AMMP takes high-level source code and hardware architecture parameters as input, and predicts the runtime of that code on the target hardware platform, which is defined in the input parameters. PPT-AMMP transforms the code to an (architecture-independent) intermediate representation, then (i) analyzes the basic block structure of the code, (ii) processes architecture-independent virtual memory access patterns that it uses to build memory reuse distance...

10.1145/3316480.3325518 article EN public-domain 2019-05-29
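
The abstract above refers to reuse distance built from memory access patterns. As a rough illustration only (not the PPT-AMMP code), the sketch below computes the reuse distance of each access in a trace, defined here as the number of distinct other addresses touched since the previous access to the same address, and uses it to estimate the hit ratio of a hypothetical fully associative LRU cache. The function names and the toy trace are assumptions.

```python
from collections import Counter


def reuse_distances(trace):
    """Return one entry per access: the number of distinct addresses seen
    since the previous access to the same address, or None on first use."""
    last_seen = {}
    distances = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            between = trace[last_seen[addr] + 1:i]
            distances.append(len(set(between)))
        else:
            distances.append(None)  # cold (compulsory) access
        last_seen[addr] = i
    return distances


def reuse_profile(trace):
    """Histogram of reuse distances (None counts cold accesses)."""
    return Counter(reuse_distances(trace))


def lru_hit_ratio(trace, capacity):
    """Hit ratio of a fully associative LRU cache holding `capacity` blocks:
    an access hits exactly when its reuse distance is below the capacity."""
    dists = reuse_distances(trace)
    hits = sum(1 for d in dists if d is not None and d < capacity)
    return hits / len(dists)


if __name__ == "__main__":
    trace = ["a", "b", "c", "a", "b", "b"]  # hypothetical block addresses
    print(reuse_distances(trace))
    print(lru_hit_ratio(trace, capacity=3))
```

This quadratic-time version favors readability. Because the profile is computed from the trace alone, the same profile can be reused to evaluate several cache capacities.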

Non-negative Matrix Factorization (NMF) models the topics of a text corpus by decomposing the matrix of term frequency-inverse document frequency (TF-IDF) representation, X, into two low-rank non-negative matrices: W, representing the topics, and H, mapping the documents onto the space of topics. One challenge, common to all topic models, is the determination of the number of latent topics (aka model determination). Determining the correct number of topics is important: underestimating the number of topics results in poor topic separation, under-fitting, while overestimating leads to noisy...

10.1109/icmla51294.2020.00060 article EN 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA) 2020-12-01

As the US Department of Energy (DOE) invests in exascale computing, performance modeling of physics codes on CPUs remains a challenge in computational co-design due to the complex design of processors that include memory hierarchies, instruction pipelining, and speculative execution. We present the Analytical Memory Model (AMM), a model of the cache memory hierarchy, embedded in the Performance Prediction Toolkit (PPT), a suite of discrete-event-simulation-based co-design hardware and software models. AMM enables PPT to significantly improve...

10.5555/3242181.3242251 article EN Winter Simulation Conference 2017-12-03

Automatic speech recognition (ASR) models with a low footprint are increasingly being deployed on edge devices for conversational agents, which enhances privacy. We study the problem of federated continual incremental learning for recurrent neural network-transducer (RNN-T) ASR models in the privacy-enhancing scheme of learning on-device, without access to ground truth human transcripts or machine transcriptions from a stronger ASR model. In particular, we study the performance of a self-learning based scheme, with a paired teacher model updated...

10.1109/icassp49357.2023.10096983 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Modern Automatic Speech Recognition (ASR) systems are evaluated with respect to Word Error Rate (WER). While WER is a useful metric for training and evaluation of speech models, it does not fully reflect the difference in semantics between the predicted and ground truth transcriptions. In conversational voice assistants, the ability to sufficiently understand the semantic meaning of the user request is often more important than recognizing all words correctly. In this work, we propose a system that can determine, to a high degree...

10.1109/icassp48485.2024.10448230 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18
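
For readers unfamiliar with the metric named in the abstract above, here is a minimal sketch of Word Error Rate under the usual definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name and the example utterances are hypothetical and do not come from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words processed so far
    # (none yet) and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur[j] = min(substitution, deletion, insertion)
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Dropping "the" keeps the meaning but costs one deletion out of five words.
    print(word_error_rate("turn on the kitchen light", "turn on kitchen light"))
    # One substitution out of four words reverses the meaning.
    print(word_error_rate("turn on the light", "turn off the light"))
```

The two examples have similar WER (0.2 and 0.25) although only the second changes what the user asked for, which is the kind of gap between WER and semantics that the abstract points to.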

We describe the utilization of on-chip multiple CPU architectures to automatically evolve parallel computer programs. These programs have the capability of exploiting the computational efficiency of modern multi-core machines.

10.1145/2598394.2605670 article EN 2014-07-11

The era of exascale computing opens new avenues for innovations and discoveries in many scientific, engineering, and commercial fields. However, with the exaflops also come the extra-large high-dimensional data generated by high-performance computing. High-dimensional data is presented as multidimensional arrays, aka tensors. The presence of latent (not directly observable) structures in the tensor allows a unique representation and compression of the data by classical tensor factorization techniques. However, classical tensor methods are not always stable or they can be...

10.1109/hpec43674.2020.9286234 article EN 2020-09-22