NFDI4DS | UHH-SEMS - Publication Details

Sanjeev Arora

ORCID: 0000-0003-4852-691X

Publications

Citations

Views

---

Saved

---

About

Contact & Profiles

A5103209777

Research Areas

Complexity and Algorithms in Graphs
Hepatitis C virus research
Advanced Graph Theory Research
Machine Learning and Algorithms
Stochastic Gradient Optimization Techniques
Topic Modeling
Liver Disease Diagnosis and Treatment
Primary Care and Health Outcomes
Adversarial Robustness in Machine Learning
Natural Language Processing Techniques
Hepatitis B Virus Studies
Neural Networks and Applications
Computational Geometry and Mesh Generation
Optimization and Search Problems
Domain Adaptation and Few-Shot Learning
Advanced Neural Network Applications
Sparse and Compressive Sensing Techniques
Chronic Disease Management Strategies
Cryptography and Data Security
Diabetes Management and Education
Telemedicine and Telehealth Implementation
Generative Adversarial Networks and Image Synthesis
Privacy-Preserving Technologies in Data
Interprofessional Education and Collaboration
Blind Source Separation Techniques

Government of Karnataka
2025

University of New Mexico
2015-2024

International Institute of Information Technology Bangalore
2024

Princeton University
2013-2024

Agrivet Research and Advisory Pvt Ltd (India)
2024

New York University
2008-2023

Virginia Tech
2023

University School
2023

Radiology Associates of Albuquerque
2021-2023

National Institutes of Health
2023

Proof verification and the hardness of approximation problems

OPENALEX - Publications

Sanjeev Arora Carsten Lund Rajeev Motwani Madhu Sudan Márió Szegedy

We show that every language in NP has a probablistic verifier checks membership proofs for it using logarithmic number of random bits and by examining constant the proof. If string is language, then there exists proof such accepts with probability 1 (i.e., choice its string). For strings not rejects provided “proof” at least 1/2. Our result builds upon improves recent Arora Safra [1998] whose verifiers examine nonconstant (though this very slowly growing function input length). As...

10.1145/278298.278306 article EN Journal of the ACM 1998-05-01

Probabilistic checking of proofs

OPENALEX - Publications

Sanjeev Arora Muli Safra

We give a new characterization of NP: the class NP contains exactly those languages L for which membership proofs (a proof that an input x is in ) can be verified probabilistically polynomial time using logarithmic number random bits and by reading sublogarithmic from proof. discuss implications this characterization; specifically, we show approximating Clique Independent Set, even very weak sense, NP-hard.

10.1145/273865.273901 article EN Journal of the ACM 1998-01-01

Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems

OPENALEX - Publications

Sanjeev Arora

We present a polynomial time approximation scheme for Euclidean TSP in fixed dimensions. For every c > 1 and given any n nodes ℛ 2 , randomized version of the finds (1 + 1/ )-approximation to optimum traveling salesman tour O(n (log ) O(c) time. When are d running increases (O(√ c)) d-1 ). c, is · poly(log ), that nearly linear . The algorithmm can be derandomized, but this by factor previous best algorithm problem (due Christofides) achieves 3/2-aproximation also give similar schemes...

10.1145/290179.290180 article EN Journal of the ACM 1998-09-01

Outcomes of Treatment for Hepatitis C Virus Infection by Primary Care Providers

OPENALEX - Publications

Sanjeev Arora Karla Thornton Glen H. Murata Paulina Deming Summers Kalishman and 10 more

The Extension for Community Healthcare Outcomes (ECHO) model was developed to improve access care underserved populations with complex health problems such as hepatitis C virus (HCV) infection. With the use of video-conferencing technology, ECHO program trains primary providers treat diseases.We conducted a prospective cohort study comparing treatment HCV infection at University New Mexico (UNM) clinic by clinicians 21 sites in rural areas and prisons Mexico. A total 407 patients chronic who...

10.1056/nejmoa1009370 article EN New England Journal of Medicine 2011-06-01

OPENALEX - Publications

Sanjeev Arora Elad Hazan Satyen Kale

Algorithms in varied fields use the idea of maintaining a distribution over certain set and multiplicative update rule to iteratively change these weights.Their analyses are usually very similar rely on an exponential potential function.In this survey we present simple meta-algorithm that unifies many disparate algorithms derives them as instantiations meta-algorithm.We feel since its analysis so simple, applications broad, it should be standard part courses, like "divide conquer."

10.4086/toc.2012.v008a006 article EN cc-by Theory of Computing 2012-01-01

Proof verification and hardness of approximation problems

OPENALEX - Publications

Sanjeev Arora Carsten Lund R. Motwani Madhu Sudan Márió Szegedy

The class PCP(f(n),g(n)) consists of all languages L for which there exists a polynomial-time probabilistic oracle machine that used O(f(n)) random bits, queries O(g(n)) bits its and behaves as follows: If x in then an y such the accepts choices but if not every rejects with high probability. Arora Safra (1992) characterized NP PCP(log n, (loglogn)/sup O(1)/). authors improve on their result by showing NP=PCP(logn, 1). has following consequences: (1) MAXSNP-hard problems (e.g. metric TSP,...

10.1109/sfcs.1992.267823 article EN 1992-01-01

Polynomial time approximation schemes for Euclidean TSP and other geometric problems

OPENALEX - Publications

Sanjeev Arora

We present a polynomial time approximation scheme for Euclidean TSP in /spl Rfr//sup 2/. Given any n nodes the plane and epsiv/>0, finds (1+/spl epsiv/)-approximation to optimum traveling salesman tour n/sup 0(1//spl epsiv/)/. When are d/, running increases n(O/spl tilde/(log/sup d-2/n)//spl epsiv//sup d-1/) The previous best algorithm problem (due Christofides (1976)) achieves 3/2-approximation time. also give similar schemes host of other problems, including Steiner Tree, k-TSP, Minimum...

10.1109/sfcs.1996.548458 article EN 2002-12-23

Expander flows, geometric embeddings and graph partitioning

OPENALEX - Publications

Sanjeev Arora Satish Rao Umesh Vazirani

We give a O (√log n )-approximation algorithm for the sparsest cut, edge expansion, balanced separator, and graph conductance problems. This improves (log of Leighton Rao (1988). use well-known semidefinite relaxation with triangle inequality constraints. Central to our analysis is geometric theorem about projections point sets in R d , whose proof makes essential phenomenon called measure concentration. also describe an interesting natural “approximate certificate” graph's which involves...

10.1145/1502793.1502794 article EN Journal of the ACM 2009-04-01

The shocking cost of turnover in health care

OPENALEX - Publications

J. Deane Waldman Frank Kelly Sanjeev Arora Howard L. Smith

Review of turnover costs at a major medical center helps health care managers gain insights about the magnitude and determinants this managerial challenge assess implications for organizational effectiveness. Here, includes hiring, training, productivity loss costs. Minimum cost represented >5 percent total annual operating budget. Editor's Note: This article is being reprinted with permission from Health Care Management 29(1), 2-7. Copyright © 2009 Lippincott Williams & Wilkins.

10.1097/hmr.0b013e3181e3940e article EN Health Care Management Review 2010-07-01

Expanding Access to Hepatitis C Virus Treatment—Extension for Community Healthcare Outcomes (ECHO) Project: Disruptive Innovation in Specialty Care†

OPENALEX - Publications

Sanjeev Arora Summers Kalishman Karla Thornton Denise Dion Glen H. Murata and 11 more

Abstract The Extension for Community Healthcare Outcomes (ECHO) Model was developed by the University of New Mexico Health Sciences Center as a platform to deliver complex specialty medical care underserved populations through an innovative educational model team-based interdisciplinary development. Using state-of-the-art telehealth technology, best practice protocols, and case-based learning, ECHO trains supports primary providers develop knowledge self-efficacy on variety diseases. As...

10.1002/hep.23802 article EN Hepatology 2010-06-11

Stronger generalization bounds for deep nets via a compression approach

OPENALEX - Publications

Sanjeev Arora Rong Ge Behnam Neyshabur Yi Zhang

Deep nets generalize well despite having more parameters than the number of training samples. Recent works try to give an explanation using PAC-Bayes and Margin-based analyses, but do not as yet result in sample complexity bounds better naive parameter counting. The current paper shows generalization that're orders magnitude practice. These rely upon new succinct reparametrizations trained net --- a compression that is explicit efficient. yield via simple compression-based framework...

10.48550/arxiv.1802.05296 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Generalization and Equilibrium in Generative Adversarial Nets (GANs)

OPENALEX - Publications

Sanjeev Arora Rong Ge Yingyu Liang Tengyu Ma Yi Zhang

We show that training of generative adversarial network (GAN) may not have good generalization properties; e.g., appear successful but the trained distribution be far from target in standard metrics. However, does occur for a weaker metric called neural net distance. It is also shown an approximate pure equilibrium exists discriminator/generator game special class generators with natural objectives when generator capacity and set sizes are moderate. This existence inspires MIX+GAN protocol,...

10.48550/arxiv.1703.00573 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Learning Topic Models -- Going beyond SVD

OPENALEX - Publications

Sanjeev Arora Rong Ge Ankur Moitra

Topic Modeling is an approach used for automatic comprehension and classification of data in a variety settings, perhaps the canonical application uncovering thematic structure corpus documents. A number foundational works both machine learning theory have suggested probabilistic model documents, whereby documents arise as convex combination (i.e. distribution on) small topic vectors, each vector being on words word-frequencies). Similar models since been areas, Latent Dirichlet Allocation...

10.1109/focs.2012.49 article EN 2012-10-01

Computing a nonnegative matrix factorization -- provably

OPENALEX - Publications

Sanjeev Arora Rong Ge Ravindran Kannan Ankur Moitra

The Nonnegative Matrix Factorization (NMF) problem has a rich history spanning quantum mechanics, probability theory, data analysis, polyhedral combinatorics, communication complexity, demography, chemometrics, etc. In the past decade NMF become enormously popular in machine learning, where factorization is computed using variety of local search heuristics. Vavasis recently proved that this NP-complete. We initiate study when solvable polynomial time. Consider nonnegative m x n matrix $M$...

10.1145/2213977.2213994 article EN 2012-05-19

A Latent Variable Model Approach to PMI-based Word Embeddings

OPENALEX - Publications

Sanjeev Arora Yuanzhi Li Yingyu Liang Tengyu Ma Andrej Risteski

Semantic word embeddings represent the meaning of a via vector, and are created by diverse methods. Many use nonlinear operations on co-occurrence statistics, have hand-tuned hyperparameters reweighting This paper proposes new generative model, dynamic version log-linear topic model Mnih Hinton (2007). The methodological novelty is to prior compute closed form expressions for statistics. provides theoretical justification models like PMI, word2vec, GloVe, as well some hyperparameter choices....

10.1162/tacl_a_00106 article EN cc-by Transactions of the Association for Computational Linguistics 2016-12-01

A Practical Algorithm for Topic Modeling with Provable Guarantees

OPENALEX - Publications

Sanjeev Arora Rong Ge Yonatan Halpern David Mimno Ankur Moitra and 3 more

Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model inference have been based on maximum likelihood objective. Efficient algorithms exist that approximate this objective, but they no provable guarantees. Recently, introduced bounds, these are not practical because inefficient robust violations of assumptions. In paper we present an algorithm is both practical. The produces results comparable the...

10.48550/arxiv.1212.4777 preprint EN other-oa arXiv (Cornell University) 2012-01-01

A Theoretical Analysis of Contrastive Unsupervised Representation Learning

OPENALEX - Publications

Sanjeev Arora Hrishikesh Khandeparkar Mikhail Khodak Orestis Plevrakis Nikunj Saunshi

Recent empirical works have successfully used unlabeled data to learn feature representations that are broadly useful in downstream classification tasks. Several of these methods reminiscent the well-known word2vec embedding algorithm: leveraging availability pairs semantically "similar" points and "negative samples," learner forces inner product similar with each other be higher on average than negative samples. The current paper uses term contrastive learning for such algorithms presents a...

10.48550/arxiv.1902.09229 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Partnering Urban Academic Medical Centers And Rural Primary Care Clinicians To Provide Complex Chronic Disease Care

OPENALEX - Publications

Sanjeev Arora Summers Kalishman Denise Dion Dara Som Karla Thornton and 10 more

Many of the estimated thirty-two million Americans expected to gain coverage under Affordable Care Act are likely have high levels unmet need because various chronic illnesses and live in areas that already underserved. In New Mexico an innovative new model health care education delivery known as Project ECHO (Extension for Community Healthcare Outcomes) provides high-quality primary specialty a comparable population. Using state-of-the-art telehealth technology case-based learning, enables...

10.1377/hlthaff.2011.0278 article EN Health Affairs 2011-05-20

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks

OPENALEX - Publications

Sanjeev Arora Simon S. Du Wei Hu Zhiyuan Li Ruosong Wang

Recent works have cast some light on the mystery of why deep nets fit any data and generalize despite being very overparametrized. This paper analyzes training generalization for a simple 2-layer ReLU net with random initialization, provides following improvements over recent works: (i) Using tighter characterization speed than papers, an explanation neural labels leads to slower training, as originally observed in [Zhang et al. ICLR'17]. (ii) Generalization bound independent network size,...

10.48550/arxiv.1901.08584 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Provable Bounds for Learning Some Deep Representations

OPENALEX - Publications

Sanjeev Arora Aditya Bhaskara Rong Ge Tengyu Ma

We give algorithms with provable guarantees that learn a class of deep nets in the generative model view popularized by Hinton and others. Our is an $n$ node multilayer neural net has degree at most $n^{\gamma}$ for some $\gamma <1$ each edge random weight $[-1,1]$. algorithm learns {\em almost all} networks this polynomial running time. The sample complexity quadratic or cubic depending upon details model. uses layerwise learning. It based novel idea observing correlations among features...

10.48550/arxiv.1310.6343 preprint EN other-oa arXiv (Cornell University) 2013-01-01

Project ECHO (Extension for Community Healthcare Outcomes): A New Model for Educating Primary Care Providers about Treatment of Substance Use Disorders

OPENALEX - Publications

Miriam Komaromy Dan Duhigg Adam Metcalf Cristina Carlson Summers Kalishman and 4 more

Background Project ECHO (Extension for Community Healthcare Outcomes) trains and mentors primary care providers (PCPs) in the of patients with complex conditions. is a distance education model that connects specialists numerous PCPs via simultaneous video link purpose facilitating case-based learning. This article describes teleECHO clinic based at University New Mexico Health Sciences Center focused on treatment substance use disorders (SUDs) behavioral health disorders. Methods Since 2005,...

10.1080/08897077.2015.1129388 article EN Substance Abuse 2016-01-01

Coming Soon ...