Raghu Ramakrishnan

ORCID: 0000-0003-1704-8644
Research Areas
  • Data Management and Algorithms
  • Advanced Database Systems and Queries
  • Data Mining Algorithms and Applications
  • Logic, Reasoning, and Knowledge
  • Logic, programming, and type systems
  • Data Stream Mining Techniques
  • Semantic Web and Ontologies
  • Formal Methods in Verification
  • Distributed systems and fault tolerance
  • Web Data Mining and Analysis
  • Software System Performance and Reliability
  • Advanced Clustering Algorithms Research
  • Bayesian Modeling and Causal Inference
  • Peer-to-Peer Network Technologies
  • Cloud Computing and Resource Management
  • Service-Oriented Architecture and Web Services
  • Advanced Data Storage Technologies
  • Multimedia Learning Systems
  • Decision Support System Applications
  • Algorithms and Data Compression
  • Data Visualization and Analytics
  • Image Retrieval and Classification Techniques
  • Educational Technology Systems
  • Data Quality and Management
  • Internet Traffic Analysis and Secure E-voting

Microsoft (United States)
2024

Sri Manakula Vinayagar Medical College and Hospital
2023

Tata Consultancy Services (India)
2017-2021

Microsoft Research (United Kingdom)
2021

Guru Gobind Singh Indraprastha University
2017-2020

Yahoo (United States)
2006-2012

Yahoo (United Kingdom)
2008-2012

University of Wisconsin–Madison
1997-2010

Yahoo (Spain)
2007-2008

University of Virginia
1999

Finding useful patterns in large datasets has attracted considerable interest recently, and one of the most widely studied problems in this area is the identification of clusters, or densely populated regions, in a multi-dimensional dataset. Prior work does not adequately address the problem of large datasets and minimization of I/O costs. This paper presents a data clustering method named BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), and demonstrates that it is especially suitable for very large databases. BIRCH incrementally...

10.1145/233269.233324 article EN 1996-01-01

Finding useful patterns in large datasets has attracted considerable interest recently, and one of the most widely studied problems in this area is the identification of clusters, or densely populated regions, in a multi-dimensional dataset. Prior work does not adequately address the problem of large datasets and minimization of I/O costs. This paper presents a data clustering method named BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), and demonstrates that it is especially suitable for very large databases. BIRCH incrementally...

10.1145/235968.233324 article EN ACM SIGMOD Record 1996-06-01

K-Anonymity has been proposed as a mechanism for protecting privacy in microdata publishing, and numerous recoding "models" have been considered for achieving k-anonymity. This paper proposes a new multidimensional model, which provides an additional degree of flexibility not seen in previous (single-dimensional) approaches. Often this leads to higher-quality anonymizations, as measured both by general-purpose metrics and by more specific notions of query answerability. Optimal anonymization is NP-hard (like optimal...

10.1109/icde.2006.101 article EN 2006-01-01
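
The k-anonymity property that the abstract above refers to can be illustrated with a small check. This is a minimal sketch, not the paper's multidimensional recoding algorithm: the function name `is_k_anonymous`, the column names, and the sample records are all hypothetical, and it only verifies that every combination of quasi-identifier values is shared by at least k records.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in >= k records."""
    # Count how many records share each combination of quasi-identifier values.
    groups = Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)
    return all(count >= k for count in groups.values())

# Hypothetical, already-generalized microdata (age ranges, truncated zip codes).
records = [
    {"age": "20-29", "zip": "537**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "537**", "diagnosis": "cold"},
    {"age": "30-39", "zip": "537**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "537**", "diagnosis": "asthma"},
]

two_anonymous = is_k_anonymous(records, ["age", "zip"], k=2)
three_anonymous = is_k_anonymous(records, ["age", "zip"], k=3)
```

Here each (age, zip) group holds exactly two records, so the table passes the check for k=2 and fails it for k=3.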

10.1023/a:1009783824328 article EN Data Mining and Knowledge Discovery 1997-01-01

Clustering is an important data mining problem. Most of the earlier work on clustering focussed on numeric attributes which have a natural ordering on their attribute values. Recently, clustering data with categorical attributes, whose attribute values do not have a natural ordering, has received some attention. However, previous algorithms do not give a formal description of the clusters they discover and some of them assume that the user post-processes the output of the algorithm to identify the final clusters. In this paper, we introduce a novel formalization of a cluster for categorical attributes by...

10.1145/312129.312201 article EN 1999-08-01

We introduce the Iceberg-CUBE problem as a reformulation of the datacube (CUBE) problem. The Iceberg-CUBE problem is to compute only those group-by partitions with an aggregate value (e.g., count) above some minimum support threshold. The result can be used (1) to answer group-by queries with a clause such as HAVING COUNT(*) >= X, where X is greater than the threshold, (2) for mining multidimensional association rules, and (3) to complement existing strategies for identifying interesting subsets of the CUBE for precomputation. We present a new algorithm (BUC)...

10.1145/304181.304214 article EN ACM SIGMOD Record 1999-06-01
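
The Iceberg-CUBE problem statement above can be made concrete with a short sketch. This is a naive enumeration of every group-by, not the BUC algorithm the paper presents (BUC prunes partitions as it goes); the function name `iceberg_cube`, the dimension names, and the sample rows are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def iceberg_cube(rows, dims, min_support):
    """Map each group-by (a tuple of dims) to its partitions with count >= min_support."""
    result = {}
    for size in range(len(dims) + 1):
        for group in combinations(dims, size):
            # Count rows per combination of values for this group-by.
            counts = Counter(tuple(row[d] for d in group) for row in rows)
            # Keep only partitions that meet the minimum support threshold.
            kept = {vals: c for vals, c in counts.items() if c >= min_support}
            if kept:
                result[group] = kept
    return result

# Hypothetical fact table with two dimensions.
rows = [
    {"city": "Madison", "item": "pen"},
    {"city": "Madison", "item": "pen"},
    {"city": "Madison", "item": "ink"},
    {"city": "Austin", "item": "pen"},
]

cube = iceberg_cube(rows, ["city", "item"], min_support=2)
```

Each entry of `cube` plays the role of a `GROUP BY ... HAVING COUNT(*) >= 2` result; partitions such as (Austin) or (Madison, ink), which occur only once, are omitted.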

Data management workloads are increasingly write-intensive and subject to strict latency SLAs. This presents a dilemma: Update in place systems have unmatched latency but poor write throughput. In contrast, existing log structured techniques improve write throughput but sacrifice read performance and exhibit unacceptable latency spikes.

10.1145/2213836.2213862 article EN 2012-05-20

The networking and distributed systems communities have recently explored a variety of new network architectures, both for application-level overlay networks, and as prototypes for a next-generation Internet architecture. In this context, we have investigated declarative networking: the use of a recursive query engine as a powerful vehicle for accelerating innovation in network architectures [23, 24, 33]. Declarative networking represents a significant application area for database research on query processing. In this paper, we address fundamental issues...

10.1145/1142473.1142485 article EN 2006-06-27

Protecting data privacy is an important problem in microdata distribution. Anonymization algorithms typically aim to protect individual privacy, with minimal impact on the quality of the resulting data. While the bulk of previous work has measured quality through one-size-fits-all measures, we argue that quality is best judged with respect to the workload for which the data will ultimately be used. This paper provides a suite of anonymization algorithms that produce an anonymous view based on a target class of workloads, consisting of one or more data mining tasks, as well...

10.1145/1150402.1150435 article EN 2006-08-20

In this paper we extend LDL, a Logic Based Database Language, to include finite sets and negation. The new language is called LDL1. We define the notion of a model and show that a negation-free program need not have a model, and that it may have more than one minimal model. We impose syntactic restrictions in order to define a deterministic language. These restrictions allow only layered (stratified) programs. We prove that for any program satisfying the layering restrictions, there is a minimal model that can be constructed in a bottom-up fashion. Extensions to the basic grouping mechanism are...

10.1145/28659.28662 article EN 1987-06-01

Clustering partitions a collection of objects into groups called clusters, such that similar objects fall into the same group. Similarity between objects is defined by a distance function satisfying the triangle inequality; this distance function along with the collection of objects describes a distance space. In a distance space, the only operation possible on data objects is the computation of the distance between them. All scalable algorithms in the literature assume a special type of distance space, namely a k-dimensional vector space, which allows vector operations on objects. We present two scalable algorithms designed for clustering very large datasets in distance spaces. Our first algorithm...

10.1109/icde.1999.754966 article EN 1999-01-01

Several methods have been proposed to evaluate queries over a native XML DBMS, where the queries specify both path and keyword constraints. These broadly consist of graph traversal approaches, optimized with auxiliary structures known as structure indexes; and approaches based on information-retrieval style inverted lists. We propose a strategy that combines the two forms of auxiliary indexes, and a query evaluation algorithm for branching path expressions based on this strategy. Our technique is general and applicable for a wide range of choices of structure indexes...

10.1145/1007568.1007656 article EN 2004-06-13

In this paper we survey recent work on incremental data mining model maintenance and change detection under block evolution. In block evolution, a dataset is updated periodically through insertions and deletions of blocks of records at a time. We describe two techniques: (1) a generic algorithm for model maintenance that takes any traditional incremental model maintenance algorithm and transforms it into an algorithm that allows restrictions on a temporal subset of the database; (2) a framework for change detection that quantifies the difference between two datasets in terms of the models they induce.

10.1145/507515.507517 article EN ACM SIGKDD Explorations Newsletter 2002-01-01

10.1016/0743-1066(91)90026-l article EN publisher-specific-oa The Journal of Logic Programming 1991-10-01

We make the case for developing a web of concepts by starting with the current view of the web (comprised of hyperlinked pages, or documents, each seen as a bag of words), extracting concept-centric metadata, and stitching it together to create a semantically rich aggregate view of all the information available on the web for each concept instance. The goal of building and maintaining such a web of concepts presents many challenges, but also offers the promise of enabling many powerful applications, including novel search and information discovery paradigms. We present the goal, motivate it with example usage...

10.1145/1559795.1559797 article EN 2009-06-29

Several graph-based algorithms have been proposed in the literature to compute the transitive closure of a directed graph. We develop two new algorithms (Basic_TC and Global_DFTC) and compare the performance of their implementations in a disk-based environment with a well-known graph-based algorithm proposed by Schmitz. Our algorithms use depth-first search to traverse a graph and a technique called marking to avoid processing some of the arcs in the graph. They process nodes in reverse topological order, building descendent sets by adding the descendent sets of children. While the details of these algorithms differ considerably, one important...

10.1145/155271.155273 article EN ACM Transactions on Database Systems 1993-09-01
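
The general idea described in the abstract above, depth-first traversal that builds each node's descendant set from its children's sets, can be sketched as follows. This is an in-memory illustration under simplifying assumptions, not Basic_TC or Global_DFTC: it assumes the input graph is acyclic, omits the marking optimization and all disk-based concerns, and the name `transitive_closure` and the sample graph are hypothetical.

```python
def transitive_closure(graph):
    """Map each node of a DAG (adjacency lists) to the set of all its descendants."""
    descendants = {}

    def visit(node):
        # A node already finished has its full descendant set; reuse it.
        if node in descendants:
            return descendants[node]
        reach = set()
        for child in graph.get(node, ()):
            reach.add(child)
            # Children finish first, so nodes complete in reverse topological order.
            reach |= visit(child)
        descendants[node] = reach
        return reach

    for node in graph:
        visit(node)
    return descendants

# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
closure = transitive_closure(graph)
```

Because the depth-first search finishes a node only after all of its children, every child's descendant set is complete by the time it is merged into the parent's set. A graph with cycles would need its strongly connected components collapsed first, which this sketch does not attempt.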

Important properties of users and objects will move from being tied to individual Web sites to being globally available. The conjunction of a global object model with portable user context will lead to richer content structure and introduce significant shifts in online communities and information discovery.

10.1109/mc.2007.294 article EN Computer 2007-08-01

In this article, we consider whether traditional index structures are effective in processing unstable nearest neighbors workloads. It is known that under broad conditions, nearest neighbors workloads become unstable: distances between data points become indistinguishable from each other. We complement this earlier result by showing that if the workload for an application is unstable, you are not likely to be able to index it efficiently using (almost all known) multidimensional index structures. For a class of distributions, we prove that these index structures will do no...

10.1145/1166074.1166077 article EN ACM Transactions on Database Systems 2006-09-01