Sachin Mehta

ORCID: 0000-0002-5420-4725
Research Areas
  • Advanced Neural Network Applications
  • AI in cancer detection
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Topic Modeling
  • Radiomics and Machine Learning in Medical Imaging
  • Natural Language Processing Techniques
  • Digital Imaging for Blood Diseases
  • Medical Image Segmentation Techniques
  • Advanced Steganography and Watermarking Techniques
  • Video Surveillance and Tracking Methods
  • Brain Tumor Detection and Classification
  • Advanced Image and Video Retrieval Techniques
  • COVID-19 diagnosis using AI
  • Digital Media Forensic Detection
  • Chaos-based Image/Signal Encryption
  • Adversarial Robustness in Machine Learning
  • Human Pose and Action Recognition
  • Speech Recognition and Synthesis
  • Image and Signal Denoising Methods
  • Cardiac Arrest and Resuscitation
  • Photovoltaic System Optimization Techniques
  • Solar Radiation and Photovoltaics
  • Advanced Vision and Imaging
  • Cutaneous Melanoma Detection and Management

Duke University Hospital
2023-2025

Apple (United Kingdom)
2023-2024

Duke Medical Center
2023-2024

University of Washington
2016-2023

Harefield Hospital
2022-2023

Allen Institute for Artificial Intelligence
2023

Apple (United States)
2022

Duke University
2022

Seattle University
2020-2022

Allen Institute
2021

Light-weight convolutional neural networks (CNNs) are the de-facto for mobile vision tasks. Their spatial inductive biases allow them to learn representations with fewer parameters across different vision tasks. However, these networks are spatially local. To learn global representations, self-attention-based vision transformers (ViTs) have been adopted. Unlike CNNs, ViTs are heavy-weight. In this paper, we ask the following question: is it possible to combine the strengths of CNNs and ViTs to build a light-weight and low latency network for mobile vision tasks? Towards...

10.48550/arxiv.2110.02178 preprint EN other-oa arXiv (Cornell University) 2021-01-01

We introduce a light-weight, power efficient, and general purpose convolutional neural network, ESPNetv2, for modeling visual and sequential data. Our network uses group point-wise and depth-wise dilated separable convolutions to learn representations from a large effective receptive field with fewer FLOPs and parameters. The performance of our network is evaluated on four different tasks: (1) object classification, (2) semantic segmentation, (3) object detection, and (4) language modeling. Experiments on these tasks,...

10.1109/cvpr.2019.00941 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Mobile vision transformers (MobileViT) can achieve state-of-the-art performance across several mobile vision tasks, including classification and detection. Though these models have fewer parameters, they have high latency as compared to convolutional neural network-based models. The main efficiency bottleneck in MobileViT is the multi-headed self-attention (MHA) in transformers, which requires $O(k^2)$ time complexity with respect to the number of tokens (or patches) $k$. Moreover, MHA requires costly operations (e.g.,...

10.48550/arxiv.2206.02680 preprint EN other-oa arXiv (Cornell University) 2022-01-01

The impact of soiling on solar panels is an important and well-studied problem in the renewable energy sector. In this paper, we present the first convolutional neural network (CNN) based approach for solar panel soiling and defect analysis. Our approach takes an RGB image of a solar panel and environmental factors as inputs to predict power loss, soiling localization, and soiling type. In computer vision, localization is a complex task which typically requires manually labeled training data such as bounding boxes or segmentation masks. Our proposed approach consists of specialized four...

10.1109/wacv.2018.00043 article EN 2018-03-01

Importance: Following recent US Food and Drug Administration approval, adoption of whole slide imaging in clinical settings may be imminent, and diagnostic accuracy, particularly among challenging breast biopsy specimens, may benefit from computerized diagnostic support tools. Objective: To develop and evaluate computer vision methods to assist pathologists in diagnosing the full spectrum of breast biopsy samples, from benign to invasive cancer. Design, Setting, and Participants: In this diagnostic study, 240 breast biopsies from Breast Cancer...

10.1001/jamanetworkopen.2019.8777 article EN cc-by-nc-nd JAMA Network Open 2019-08-09

In this paper, we introduce an end-to-end machine learning-based system for classifying autism spectrum disorder (ASD) using facial attributes such as expressions, action units, arousal, and valence. Our system classifies ASD using representations of different facial attributes from convolutional neural networks, which are trained on images in the wild. Our experimental results show that different facial attributes used in our system are statistically significant and improve sensitivity, specificity, and F1 score of ASD classification by a large margin. In particular, the addition...

10.1109/icip.2019.8803604 article EN 2019 IEEE International Conference on Image Processing (ICIP) 2019-08-26

We trained and applied an encoder-decoder model to semantically segment breast biopsy images into biologically meaningful tissue labels. Since conventional encoder-decoder networks cannot be applied directly on large biopsy images and the different sized structures in biopsies present novel challenges, we propose four modifications: (1) an input-aware encoding block to compensate for information loss, (2) a new dense connection pattern between encoder and decoder, (3) dense and sparse decoders to combine multi-level features, (4)...

10.1109/wacv.2018.00078 article EN 2018-03-01

Sanjay Subramanian, Lucy Lu Wang, Ben Bogin, Sachin Mehta, Madeleine van Zuylen, Sravanthi Parasa, Sameer Singh, Matt Gardner, Hannaneh Hajishirzi. Findings of the Association for Computational Linguistics: EMNLP 2020.

10.18653/v1/2020.findings-emnlp.191 article EN cc-by 2020-01-01

We introduce a novel and generic convolutional unit, DiCE unit, that is built using dimension-wise convolutions and dimension-wise fusion. The dimension-wise convolutions apply light-weight convolutional filtering across each dimension of the input tensor while dimension-wise fusion efficiently combines these dimension-wise representations; allowing the DiCE unit to efficiently encode spatial and channel-wise information contained in the input tensor. The DiCE unit is simple and can be seamlessly integrated with any architecture...

10.1109/tpami.2020.3041871 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-01-01

Diagnosing melanocytic lesions is one of the most challenging areas of pathology with extensive intra- and inter-observer variability. The gold standard for a diagnosis of invasive melanoma is the examination of histopathological whole slide skin biopsy images by an experienced dermatopathologist. Digitized whole slide images offer novel opportunities for computer programs to improve the diagnostic performance of pathologists. In order to automatically classify such images, representations that reflect the content and context of the input images are needed....

10.1109/access.2021.3132958 article EN cc-by IEEE Access 2021-01-01

Training large language models (LLMs) for different inference constraints is computationally expensive, limiting control over efficiency-accuracy trade-offs. Moreover, once trained, these models typically process tokens uniformly, regardless of their complexity, leading to static and inflexible behavior. In this paper, we introduce a post-training optimization framework, DynaMoE, that adapts a pre-trained dense LLM to a token-difficulty-driven Mixture-of-Experts model with minimal fine-tuning cost. This...

10.48550/arxiv.2502.12325 preprint EN arXiv (Cornell University) 2025-02-17

Large Language Models (LLMs) trained on historical web data inevitably become outdated. We investigate evaluation strategies and update methods for LLMs as new data becomes available. We introduce a web-scale dataset for time-continual pretraining of LLMs derived from 114 dumps of Common Crawl (CC) - orders of magnitude larger than previous continual language modeling benchmarks. We also design time-stratified evaluations across both general CC data and specific domains (Wikipedia, StackExchange, and code documentation) to assess...

10.48550/arxiv.2504.02107 preprint EN arXiv (Cornell University) 2025-04-02