Sai Mitheran

ORCID: 0000-0003-1011-9761
Research Areas
  • Advanced Graph Neural Networks
  • Music and Audio Processing
  • Machine Learning in Healthcare
  • Recommender Systems and Techniques
  • Speech Recognition and Synthesis
  • 3D Shape Modeling and Analysis
  • Radio Wave Propagation Studies
  • Neural Networks and Applications
  • Multimodal Machine Learning Applications
  • Surgical Simulation and Training
  • Computer Graphics and Visualization Techniques
  • Advanced Image Processing Techniques
  • Machine Learning and Algorithms
  • Image Enhancement Techniques
  • Domain Adaptation and Few-Shot Learning
  • Power Systems and Technologies
  • Advanced Image Fusion Techniques
  • 3D Surveying and Cultural Heritage
  • Topic Modeling
  • Machine Learning and Data Classification
  • Advanced Neural Network Applications
  • Experimental Learning in Engineering
  • Robotics and Sensor-Based Localization
  • Anatomy and Medical Technology
  • Speech and Audio Processing

National Institute of Technology Tiruchirappalli
2021-2022

Global and local relational reasoning enable scene understanding models to perform human-like scene analysis and understanding. Scene understanding enables better semantic segmentation and object-to-object interaction detection. In the medical domain, a robust surgical scene understanding model allows automation of skill evaluation, real-time monitoring of the surgeon's performance, and post-surgical analysis. This paper introduces a globally-reasoned multi-task model capable of performing instrument segmentation and tool-tissue interaction detection. Here, we incorporate global reasoning in the latent space...

10.1109/lra.2022.3146544 article EN IEEE Robotics and Automation Letters 2022-01-27

Several approaches have been introduced to understand surgical scenes through downstream tasks like captioning and scene graph generation. However, most of them heavily rely on an independent object detector and a region-based feature extractor. Encompassing computationally expensive detection and feature-extraction models, these multi-stage methods suffer from slow inference speed, making them less suitable for real-time applications. The performance of the downstream task also degrades by inheriting errors from earlier modules in the pipeline....

10.1109/lra.2022.3221310 article EN IEEE Robotics and Automation Letters 2022-10-01

Transformers have seen an unprecedented rise in Natural Language Processing and Computer Vision tasks. However, for audio tasks, they are either infeasible to train due to the extremely large sequence length of audio waveforms or incur a performance penalty when trained on Fourier-based features. In this work, we introduce an architecture, Audiomer, where we combine 1D Residual Networks with Performer Attention to achieve state-of-the-art performance in keyword spotting with raw audio waveforms, outperforming all previous methods while being...

10.48550/arxiv.2109.10252 preprint EN cc-by arXiv (Cornell University) 2021-01-01
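The Performer Attention mentioned in the abstract replaces exact softmax attention with a linear-time approximation built from positive random features (FAVOR+). A minimal NumPy sketch of that approximation — an illustration of the general technique, not the Audiomer architecture itself, with helper names of our own choosing:

```python
import numpy as np

def performer_attention(Q, K, V, n_features=256, seed=0):
    """Linear-time softmax-attention approximation in the spirit of
    Performer's FAVOR+ positive random features. Q, K, V: (seq, dim)."""
    d = Q.shape[-1]
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d, n_features))        # random projection

    def phi(X):
        X = X / d ** 0.25                           # absorbs the 1/sqrt(d) of softmax attention
        return np.exp(X @ W - (X ** 2).sum(-1, keepdims=True) / 2) / n_features ** 0.5

    Qp, Kp = phi(Q), phi(K)
    num = Qp @ (Kp.T @ V)                           # O(seq * features * dim), never forms seq x seq
    den = Qp @ Kp.sum(0, keepdims=True).T           # per-query normaliser
    return num / den

# Sanity check against exact softmax attention on a tiny input
rng = np.random.default_rng(1)
Q = rng.standard_normal((8, 16))
K = rng.standard_normal((8, 16))
V = rng.standard_normal((8, 16))
scores = np.exp(Q @ K.T / 16 ** 0.5)
exact = (scores / scores.sum(-1, keepdims=True)) @ V
approx = performer_attention(Q, K, V, n_features=4096)
print(np.abs(exact - approx).max())                 # shrinks as n_features grows
```

Because the kernel estimate is unbiased, accuracy improves with the number of random features, while cost stays linear in sequence length — which is what makes raw-waveform sequence lengths tractable at all.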

Waveguides have been an amusing yet daunting subject for any student studying microwave theory. The "structure that guides waves" comes in both rectangular and cylindrical shapes. Finding the equations of the electric and magnetic fields inside a waveguide is a well-studied problem, often requiring expertise in electromagnetic (EM) theory [see "Field Equations (for Plot Generation)"]. These studies result in many valid solutions, each of which is called a mode. Visualizing...

10.1109/mmm.2022.3148150 article EN IEEE Microwave Magazine 2022-04-04
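Each mode of a rectangular waveguide propagates only above its cutoff frequency, f_c = (c/2)·sqrt((m/a)² + (n/b)²) for the TE_mn/TM_mn family. A short illustrative script (the WR-90 dimensions are the standard X-band guide; the helper name is ours):

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def cutoff_frequency(m, n, a, b):
    """Cutoff frequency (Hz) of the TE_mn / TM_mn mode of an
    a x b rectangular waveguide, with a and b in metres."""
    return (C / 2.0) * math.sqrt((m / a) ** 2 + (n / b) ** 2)

# WR-90 (X-band) guide: a = 22.86 mm, b = 10.16 mm
a, b = 22.86e-3, 10.16e-3
print(f"TE10 cutoff: {cutoff_frequency(1, 0, a, b) / 1e9:.2f} GHz")  # ~6.56 GHz
```

The dominant TE10 mode has the lowest cutoff (it depends only on the broad dimension a), which is why single-mode operation bands sit between the TE10 and the next mode's cutoff.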

Session-based recommendation systems suggest relevant items to users by modeling user behavior and preferences using short-term anonymous sessions. Existing methods leverage Graph Neural Networks (GNNs) that propagate and aggregate information from neighboring nodes, i.e., local message passing. Such graph-based architectures have representational limits, as a single sub-graph is susceptible to overfitting the sequential dependencies instead of accounting for complex transitions between items in different sessions. We...

10.48550/arxiv.2107.01516 preprint EN cc-by-sa arXiv (Cornell University) 2021-01-01
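The "local message passing" the abstract critiques can be made concrete: build a small item-transition graph from one session's click sequence and let each item average its neighbours' embeddings. A toy NumPy sketch under our own naming, not the paper's model:

```python
import numpy as np

def session_message_pass(session, emb, rounds=1):
    """A few rounds of local message passing (mean aggregation) over the
    item-transition graph of one session. `session` is a list of item ids
    indexing rows of the embedding matrix `emb`."""
    items = sorted(set(session))
    idx = {it: i for i, it in enumerate(items)}
    n = len(items)
    A = np.zeros((n, n))
    for a, b in zip(session, session[1:]):   # consecutive clicks form an edge
        A[idx[a], idx[b]] = 1.0
    A = A + A.T + np.eye(n)                  # symmetrise, add self-loops
    A = A / A.sum(1, keepdims=True)          # row-normalise -> mean aggregation
    H = emb[items]
    for _ in range(rounds):
        H = A @ H                            # each item averages its neighbourhood
    return dict(zip(items, H))

rng = np.random.default_rng(0)
emb = rng.standard_normal((10, 4))           # 10 items, 4-dim embeddings
reps = session_message_pass([3, 5, 3, 7], emb)
```

Because every update only mixes information within this one session sub-graph, repeated rounds can overfit its sequential dependencies — the representational limit that motivates looking beyond purely local propagation.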

The Lottery Ticket Hypothesis (LTH) states that for a reasonably sized neural network, a sub-network within the same network yields no less performance than its dense counterpart when trained from the same initialization. This work investigates the relation between model size and the ease of finding these sparse sub-networks. We show through experiments that, surprisingly, under a finite budget, smaller models benefit more from Ticket Search (TS).

10.48550/arxiv.2206.08175 preprint EN cc-by-sa arXiv (Cornell University) 2022-01-01
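Ticket search is typically driven by magnitude pruning: rank all weights by |w|, zero out the smallest fraction, and retrain the surviving sub-network from its original initialization. A minimal global-magnitude masking sketch (the helper name is ours, not the paper's):

```python
import numpy as np

def magnitude_prune_mask(weights, sparsity):
    """Global magnitude pruning: drop the `sparsity` fraction of weights with
    the smallest |w| across all layers. Returns one binary mask per layer,
    as used when searching for lottery-ticket sub-networks."""
    flat = np.concatenate([np.abs(w).ravel() for w in weights])
    k = int(sparsity * flat.size)            # number of weights to drop
    threshold = np.partition(flat, k)[k] if k > 0 else -np.inf
    return [(np.abs(w) >= threshold).astype(w.dtype) for w in weights]

rng = np.random.default_rng(0)
layers = [rng.standard_normal((4, 4)), rng.standard_normal((4, 2))]
masks = magnitude_prune_mask(layers, sparsity=0.5)
kept = sum(m.sum() for m in masks) / sum(m.size for m in masks)
print(f"kept fraction: {kept:.2f}")
```

In iterative variants this prune-rewind-retrain loop repeats, which is exactly the search cost the abstract's finite-budget comparison is about.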

Single-image haze removal is a long-standing hurdle for computer vision applications. Several works have focused on transferring advances from image classification, detection, and segmentation to the niche of dehazing, primarily focusing on contrastive learning and knowledge distillation. However, these approaches prove computationally expensive, raising concerns regarding their applicability in on-the-edge use-cases. This work introduces a simple, lightweight, and efficient framework for single-image...

10.48550/arxiv.2207.11250 preprint EN cc-by arXiv (Cornell University) 2022-01-01
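Single-image dehazing methods, lightweight ones included, are usually framed around the atmospheric scattering model I(x) = J(x)·t(x) + A·(1 − t(x)). Given estimates of the transmission map t and the global airlight A, recovering the clean image J is a pixel-wise inversion; a self-contained NumPy round-trip (our own helper, not the paper's network):

```python
import numpy as np

def dehaze(I, t, A, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t) to recover
    the scene radiance J from a hazy image I, transmission t, and airlight A."""
    t = np.clip(t, t_min, 1.0)                     # avoid division blow-up as t -> 0
    return (I - A) / t[..., None] + A

# Synthetic round-trip: hazing a clean image, then inverting, recovers it.
rng = np.random.default_rng(0)
J = rng.uniform(0, 1, (4, 4, 3))                   # clean image
t = rng.uniform(0.3, 0.9, (4, 4))                  # per-pixel transmission
A = 0.8                                            # global airlight
I = J * t[..., None] + A * (1 - t[..., None])      # apply the haze model
print(np.allclose(dehaze(I, t, A), J))             # True
```

The hard part in practice is estimating t and A from the hazy image alone; that estimator is what the learned frameworks replace.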