NFDI4DS | UHH-SEMS - Publication Details

Inderjit S. Dhillon

ORCID: 0000-0002-2759-1416

About

Contact & Profiles

OpenAlex author ID: A5063459703

Research Areas

Sparse and Compressive Sensing Techniques
Face and Expression Recognition
Text and Document Classification Technologies
Stochastic Gradient Optimization Techniques
Complex Network Analysis Techniques
Machine Learning and Algorithms
Advanced Clustering Algorithms Research
Advanced Graph Neural Networks
Topic Modeling
Matrix Theory and Algorithms
Machine Learning and Data Classification
Domain Adaptation and Few-Shot Learning
Advanced Optimization Algorithms Research
Machine Learning and ELM
Bayesian Methods and Mixture Models
Gene expression and cancer classification
Data Management and Algorithms
Statistical Methods and Inference
Tensor decomposition and applications
Neural Networks and Applications
Natural Language Processing Techniques
Blind Source Separation Techniques
Anomaly Detection Techniques and Applications
Bioinformatics and Genomic Networks
Recommender Systems and Techniques

Amazon (United States)
2019-2024

Google (United States)
2023-2024

The University of Texas at Austin
2014-2023

Amazon (Germany)
2017-2022

Search
2022

National Taiwan University
2014

Max Planck Society
2010

IBM Research - Almaden
1998-2000

University of California, Berkeley
1996-1997

University of Tennessee at Knoxville
1996

Information-theoretic metric learning

OPENALEX - Publications

Jason V. Davis, Brian Kulis, Prateek Jain, Suvrit Sra, Inderjit S. Dhillon

In this paper, we present an information-theoretic approach to learning a Mahalanobis distance function. We formulate the problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the distance function. We express this problem as a particular Bregman optimization problem---that of minimizing the LogDet divergence subject to linear constraints. Our resulting algorithm has several advantages over existing methods. First, our method can handle a wide variety of constraints and can optionally incorporate a prior on the distance function. Second, it...

10.1145/1273496.1273523 article EN 2007-06-20
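
To make the object being learned in the abstract above concrete, here is a minimal sketch of a squared Mahalanobis distance, d_A(x, y) = (x - y)^T A (x - y), evaluated for a fixed matrix A. The function name `mahalanobis_sq` and the example matrices are assumptions chosen for illustration and do not come from the paper. The learning step itself (minimizing the LogDet divergence under linear constraints) is not shown.

```python
def mahalanobis_sq(x, y, A):
    """Squared Mahalanobis distance (x - y)^T A (x - y).

    A is a square matrix given as a list of rows. It is assumed (not
    checked) to be symmetric positive semidefinite.
    """
    d = [xi - yi for xi, yi in zip(x, y)]
    # Matrix-vector product A @ d, one row at a time.
    Ad = [sum(a_ij * d_j for a_ij, d_j in zip(row, d)) for row in A]
    return sum(d_i * ad_i for d_i, ad_i in zip(d, Ad))


# Hypothetical example matrices: the identity recovers the squared
# Euclidean distance, while `stretched` weights the first coordinate more.
identity = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[4.0, 0.0], [0.0, 1.0]]
x, y = [0.0, 0.0], [1.0, 1.0]

d_identity = mahalanobis_sq(x, y, identity)
d_stretched = mahalanobis_sq(x, y, stretched)
print(d_identity, d_stretched)
```

With the identity matrix the function reduces to the squared Euclidean distance, so a learned A can be read as a reweighting of directions in the input space.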

Co-clustering documents and words using bipartite spectral graph partitioning

Inderjit S. Dhillon

Both document clustering and word clustering are well studied problems. Most existing algorithms cluster documents and words separately but not simultaneously. In this paper we present the novel idea of modeling the document collection as a bipartite graph between documents and words, using which the simultaneous clustering problem can be posed as a bipartite graph partitioning problem. To solve the partitioning problem, we use a new spectral co-clustering algorithm that uses the second left and right singular vectors of an appropriately scaled word-document matrix to yield good bipartitionings. The...

10.1145/502512.502550 article EN 2001-08-26

Clustering with Bregman Divergences

Arindam Banerjee, Srujana Merugu, Inderjit S. Dhillon, Joydeep Ghosh

A wide variety of distortion functions are used for clustering, e.g., squared Euclidean distance, Mahalanobis distance and relative entropy. In this paper, we propose and analyze parametric hard and soft clustering algorithms based on a large class of distortion functions known as Bregman divergences. The proposed algorithms unify centroid-based parametric clustering approaches, such as classical kmeans and information-theoretic clustering, which arise by special choices of the Bregman divergence. The algorithms maintain the simplicity and scalability of the classical kmeans algorithm, while generalizing the basic idea to a very large class of clustering loss...

10.1137/1.9781611972740.22 article EN 2004-04-22
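
The hard-clustering idea in the abstract above can be sketched as a k-means-style loop in which only the divergence changes. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function names, the naive "first k points" initialization, and the toy data are all hypothetical. It relies on the known property that the arithmetic mean is the best cluster representative for a Bregman divergence taken in its second argument, so the centroid update is the same for both divergences shown.

```python
import math


def sq_euclidean(x, mu):
    """Squared Euclidean distance (the Bregman divergence behind k-means)."""
    return sum((a - b) ** 2 for a, b in zip(x, mu))


def gen_kl(x, mu):
    """Generalized KL (I-divergence); assumes strictly positive entries."""
    return sum(a * math.log(a / b) - a + b for a, b in zip(x, mu))


def bregman_hard_cluster(points, k, divergence, iters=20):
    """k-means-style hard clustering with a pluggable divergence."""
    # Naive initialization (an assumption): the first k points are centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid under the chosen divergence.
        labels = [
            min(range(k), key=lambda j: divergence(p, centroids[j]))
            for p in points
        ]
        # Update step: the arithmetic mean of each non-empty cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical toy data: two well-separated groups of positive vectors.
points = [[1.0, 1.0], [5.0, 5.0], [1.2, 0.8], [5.2, 4.8]]
labels_euc, cents_euc = bregman_hard_cluster(points, 2, sq_euclidean)
labels_kl, cents_kl = bregman_hard_cluster(points, 2, gen_kl)
print(labels_euc, labels_kl)
```

Swapping `sq_euclidean` for `gen_kl` changes how points are assigned but leaves the update step untouched, which is the unification the abstract describes.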

Inderjit S. Dhillon, Dharmendra S. Modha

10.1023/a:1007612920971 article EN Machine Learning 2001-01-01

Kernel k-means

Inderjit S. Dhillon, Yuqiang Guan, Brian Kulis

Kernel k-means and spectral clustering have both been used to identify clusters that are non-linearly separable in input space. Despite significant research, these methods have remained only loosely related. In this paper, we give an explicit theoretical connection between them. We show the generality of the weighted kernel k-means objective function, and derive the spectral clustering objective of normalized cut as a special case. Given a positive definite similarity matrix, our results lead to a novel weighted kernel k-means algorithm that monotonically decreases the normalized cut. This has...

10.1145/1014052.1014118 article EN 2004-08-22
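
As a rough sketch of the weighted kernel k-means iteration mentioned in the abstract above, the code below assigns each point to the cluster whose weighted centroid is nearest in feature space, using only kernel matrix entries. The function name, the initial labels, and the toy linear kernel are assumptions for illustration. The graph-cut-specific choices of weights and kernel that link this to normalized cut are not shown.

```python
def weighted_kernel_kmeans(K, weights, labels, k, iters=20):
    """Batch weighted kernel k-means on a precomputed kernel matrix K.

    The squared distance from point i to the weighted centroid of cluster c is
        K[i][i] - 2 * sum_j w_j K[i][j] / s_c + sum_{j,l} w_j w_l K[j][l] / s_c**2
    with j, l ranging over members of c and s_c the total weight of c.
    """
    n = len(K)
    labels = list(labels)
    for _ in range(iters):
        # Per-cluster quantities that do not depend on the query point.
        stats = []
        for c in range(k):
            members = [j for j in range(n) if labels[j] == c]
            s = sum(weights[j] for j in members)
            if s == 0:
                stats.append(None)  # empty cluster: skipped during assignment
                continue
            third = sum(
                weights[j] * weights[l] * K[j][l]
                for j in members for l in members
            ) / (s * s)
            stats.append((members, s, third))
        new_labels = []
        for i in range(n):
            best, best_d = labels[i], float("inf")
            for c in range(k):
                if stats[c] is None:
                    continue
                members, s, third = stats[c]
                second = 2.0 * sum(weights[j] * K[i][j] for j in members) / s
                d = K[i][i] - second + third
                if d < best_d:
                    best, best_d = c, d
            new_labels.append(best)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Hypothetical toy data: 1-D points with a linear kernel and unit weights,
# which makes the procedure coincide with ordinary k-means.
xs = [0.0, 1.0, 10.0, 11.0]
K = [[a * b for b in xs] for a in xs]
result = weighted_kernel_kmeans(K, [1.0] * 4, [0, 1, 0, 1], k=2)
print(result)
```

Because the update touches only entries of `K`, the same loop runs unchanged for any positive definite similarity matrix, without an explicit feature map.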

Information-theoretic co-clustering

Inderjit S. Dhillon, Subramanyam Mallela, Dharmendra S. Modha

Two-dimensional contingency or co-occurrence tables arise frequently in important applications such as text, web-log and market-basket data analysis. A basic problem in contingency table analysis is co-clustering: simultaneous clustering of the rows and columns. A novel theoretical formulation views the contingency table as an empirical joint probability distribution of two discrete random variables and poses the co-clustering problem as an optimization problem in information theory---the optimal co-clustering maximizes the mutual information between the clustered random variables subject to constraints on the number of row...

10.1145/956750.956764 article EN 2003-08-24

Weighted Graph Cuts without Eigenvectors: A Multilevel Approach

Inderjit S. Dhillon, Yuqiang Guan, Brian Kulis

A variety of clustering algorithms have recently been proposed to handle data that is not linearly separable; spectral clustering and kernel k-means are two of the main methods. In this paper, we discuss an equivalence between the objective functions used in these seemingly different methods--in particular, a general weighted kernel k-means objective is mathematically equivalent to a weighted graph clustering objective. We exploit this equivalence to develop a fast, high-quality multilevel algorithm that directly optimizes various weighted graph clustering objectives, such as the popular ratio cut, normalized...

10.1109/tpami.2007.1115 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2007-09-17

Clustering on the Unit Hypersphere using von Mises-Fisher Distributions

Arindam Banerjee, Inderjit S. Dhillon, Joydeep Ghosh, Suvrit Sra

Several large scale data mining applications, such as text categorization and gene expression analysis, involve high-dimensional data that is also inherently directional in nature. Often such data is L2 normalized so that it lies on the surface of a unit hypersphere. Popular models such as (mixtures of) multi-variate Gaussians are inadequate for characterizing such data. This paper proposes a generative mixture-model approach to clustering directional data based on the von Mises-Fisher (vMF) distribution, which arises naturally for data distributed on the unit hypersphere. In...

10.5555/1046920.1088718 article EN Journal of Machine Learning Research 2005-12-01

A divisive information theoretic feature clustering algorithm for text classification

Inderjit S. Dhillon, Subramanyam Mallela, Rahul Kumar

High dimensionality of text can be a deterrent in applying complex learners such as Support Vector Machines to the task of text classification. Feature clustering is a powerful alternative to feature selection for reducing the dimensionality of text data. In this paper we propose a new information-theoretic divisive algorithm for feature/word clustering and apply it to text classification. Existing techniques for such distributional clustering of words are agglomerative in nature and result in (i) sub-optimal word clusters and (ii) high computational cost. In order to explicitly capture the optimality of word clusters in an information...

10.5555/944919.944973 article EN Journal of Machine Learning Research 2003-03-01

Designing structured tight frames via an alternating projection method

Joel A. Tropp, Inderjit S. Dhillon, Robert W. Heath, Thomas Strohmer

Tight frames, also known as general Welch-bound-equality sequences, generalize orthonormal systems. Numerous applications - including communications, coding, and sparse approximation - require finite-dimensional tight frames that possess additional structural properties. This paper proposes an alternating projection method that is versatile enough to solve a huge class of inverse eigenvalue problems (IEPs), which includes the frame design problem. To apply this method, one needs only to solve a matrix...

10.1109/tit.2004.839492 article EN IEEE Transactions on Information Theory 2005-01-01

Semi-supervised graph clustering: a kernel approach

Brian Kulis, Sugato Basu, Inderjit S. Dhillon, Raymond J. Mooney

10.1007/s10994-008-5084-4 article EN Machine Learning 2008-09-23

Towards Fast Computation of Certified Robustness for ReLU Networks

Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho‐Jui Hsieh, and 3 more

Verifying the robustness property of a general Rectified Linear Unit (ReLU) network is an NP-complete problem [Katz, Barrett, Dill, Julian and Kochenderfer CAV17]. Although finding the exact minimum adversarial distortion is hard, giving a certified lower bound of the minimum distortion is possible. Current available methods of computing such a bound are either time-consuming or deliver low quality bounds that are too loose to be useful. In this paper, we exploit the special structure of ReLU networks and provide two computationally efficient...

10.48550/arxiv.1804.09699 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Clustering with Multiple Graphs

Wei Tang, Zhengdong Lu, Inderjit S. Dhillon

In graph-based learning models, entities are often represented as vertices in an undirected graph with weighted edges describing the relationships between entities. In many real-world applications, however, entities are often associated with relations of different types and/or from different sources, which can be well captured by multiple undirected graphs over the same set of vertices. How to exploit such multiple sources of information to make better inferences on entities remains an interesting open problem. In this paper, we focus on the problem of clustering the vertices based on multiple graphs in both...

10.1109/icdm.2009.125 article EN 2009-12-01

Inductive matrix completion for predicting gene–disease associations

Nagarajan Natarajan, Inderjit S. Dhillon

Most existing methods for predicting causal disease genes rely on a specific type of evidence, and are therefore limited in terms of applicability. More often than not, the type of evidence available for diseases varies; for example, we may know linked genes, keywords associated with the disease obtained by mining text, or co-occurrence of disease symptoms in patients. Similarly, the type of evidence available for genes varies; for example, specific microarray probes convey information only for certain sets of genes. In this article, we apply a novel matrix-completion method called Inductive Matrix Completion to...

10.1093/bioinformatics/btu269 article EN cc-by-nc Bioinformatics 2014-06-11

A Generalized Maximum Entropy Approach to Bregman Co-clustering and Matrix Approximation

Arindam Banerjee, Inderjit S. Dhillon, Joydeep Ghosh, Srujana Merugu, Dharmendra S. Modha

Co-clustering, or simultaneous clustering of rows and columns of a two-dimensional data matrix, is rapidly becoming a powerful data analysis technique. Co-clustering has enjoyed wide success in varied application domains such as text clustering, gene-microarray analysis, natural language processing and image, speech and video analysis. In this paper, we introduce a partitional co-clustering formulation that is driven by the search for a good matrix approximation---every co-clustering is associated with an approximation of the original...

10.5555/1314498.1314563 article EN Journal of Machine Learning Research 2007-12-01

Scalable Coordinate Descent Approaches to Parallel Matrix Factorization for Recommender Systems

Hsiang‐Fu Yu, Cho‐Jui Hsieh, Si Si, Inderjit S. Dhillon

Matrix factorization, when the matrix has missing values, has become one of the leading techniques for recommender systems. To handle web-scale datasets with millions of users and billions of ratings, scalability becomes an important issue. Alternating Least Squares (ALS) and Stochastic Gradient Descent (SGD) are two popular approaches to compute matrix factorization. There has been a recent flurry of activity to parallelize these algorithms. However, due to the cubic time complexity in the target rank, ALS is not scalable to large-scale...

10.1109/icdm.2012.168 article EN 2012-12-01
