NFDI4DS | UHH-SEMS - Publication Details

Parag Agrawal

ORCID: 0009-0005-0759-8484

About

Contact & Profiles

A5075956031

Research Areas

Advanced Database Systems and Queries
Data Management and Algorithms
Topic Modeling
Semantic Web and Ontologies
Data Quality and Management
Natural Language Processing Techniques
Misinformation and Its Impacts
Cloud Computing and Resource Management
Parallel Computing and Optimization Techniques
Service-Oriented Architecture and Web Services
AI in Service Interactions
Web Data Mining and Analysis
Advanced Data Storage Technologies
Scientific Computing and Data Management
Sentiment Analysis and Opinion Mining
Software System Performance and Reliability
Recommender Systems and Techniques
Digital Marketing and Social Media
Data Mining Algorithms and Applications
Caching and Content Delivery
Web Application Security Vulnerabilities
Software Engineering Research
Real-Time Systems Scheduling
Genetic and Kidney Cyst Diseases
Infectious Diseases and Mycology

Index Medical College, Hospital & Research Centre
2024

Tata Consultancy Services (India)
2021-2023

LinkedIn (United States)
2019-2023

Poornima University
2021

Microsoft (Finland)
2020

Microsoft Research (United Kingdom)
2019-2020

Microsoft (United States)
2018

Microsoft (India)
2017

Twitter (United States)
2014

Stanford University
2006-2011

The case for RAMClouds

OPENALEX - Publications

John K. Ousterhout Parag Agrawal David Erickson Christos Kozyrakis Jacob Leverich and 8 more

Disk-oriented approaches to online storage are becoming increasingly problematic: they do not scale gracefully to meet the needs of large-scale Web applications, and improvements in disk capacity have far outstripped improvements in access latency and bandwidth. This paper argues for a new approach to datacenter storage called RAMCloud, where information is kept entirely in DRAM and large-scale systems are created by aggregating the main memories of thousands of commodity servers. We believe that RAMClouds can provide durable and available storage with 100-1000x...

10.1145/1713254.1713276 article EN ACM SIGOPS Operating Systems Review 2010-01-27

Trio: a system for data, uncertainty, and lineage

OPENALEX - Publications

Parag Agrawal Omar Benjelloun Anish Das Sarma Chris Hayworth Shubha U. Nabar and 2 more

In the Trio project at Stanford, we are building a new kind of database management system: one in which data, uncertainty, and data lineage are all first-class citizens. Trio is based on an extended relational model called ULDBs, and it supports a SQL-based query language called TriQL. Trio was motivated by a number of applications including scientific data management, data cleaning and integration, information extraction systems, and others. We have completed an initial working prototype of the Trio system. We will demonstrate our system, illustrating it through two...

10.5555/1182635.1164231 article EN 2006-09-01

The case for RAMCloud

OPENALEX - Publications

John K. Ousterhout Parag Agrawal David Erickson Christos Kozyrakis Jacob Leverich and 9 more

With scalable high-performance storage entirely in DRAM, RAMCloud will enable a new breed of data-intensive applications.

10.1145/1965724.1965751 article EN Communications of the ACM 2011-06-28

Interpretable and informative explanations of outcomes

OPENALEX - Publications

Kareem El Gebaly Parag Agrawal Lukasz Golab Flip Korn Divesh Srivastava

In this paper, we solve the following data summarization problem: given a multi-dimensional data set augmented with a binary attribute, how can we construct an interpretable and informative summary of the factors affecting the binary attribute in terms of the combinations of values of the dimension attributes? We refer to such summaries as explanation tables. We show the hardness of constructing optimally-informative explanation tables from data, and we propose effective and efficient heuristics. The proposed heuristics are based on sampling and include optimizations...

10.14778/2735461.2735467 article EN Proceedings of the VLDB Endowment 2014-09-01

Scheduling shared scans of large data files

OPENALEX - Publications

Parag Agrawal Daniel Kifer Christopher Olston

We study how best to schedule scans of large data files, in the presence of many simultaneous requests to a common set of files. The objective is to maximize the overall rate of processing these files by sharing scans of the same file as aggressively as possible, without imposing undue wait time on individual jobs. This scheduling problem arises in batch data processing environments such as Map-Reduce systems, some of which handle tens of thousands of processing requests daily, over a shared set of files. As we demonstrate, conventional scheduling techniques such as shortest-job-first do not perform well in the presence of cross-job...

10.14778/1453856.1453960 article EN Proceedings of the VLDB Endowment 2008-08-01

Asynchronous view maintenance for VLSD databases

OPENALEX - Publications

Parag Agrawal Adam Silberstein Brian F. Cooper Utkarsh Srivastava Raghu Ramakrishnan

The query models of the recent generation of very large scale distributed (VLSD) shared-nothing data storage systems, including our own PNUTS and others (e.g. BigTable, Dynamo, Cassandra, etc.) are intentionally simple, focusing on simple lookups and scans and trading query expressiveness for massive scale. Indexes and views can expand the query expressiveness of such systems by materializing more complex access paths and query results. In this paper, we examine mechanisms to implement indexes and views in a massive scale distributed database. For web applications, minimizing update...

10.1145/1559845.1559866 article EN 2009-06-29

On indexing error-tolerant set containment

OPENALEX - Publications

Parag Agrawal Arvind Arasu Raghav Kaushik

Prior work has identified set based comparisons as a useful primitive for supporting a wide variety of similarity functions in record matching. Accordingly, various techniques have been proposed to improve the performance of set similarity lookups. However, this body of work focuses almost exclusively on symmetric notions of set similarity. In this paper, we study the indexing problem for the asymmetric Jaccard containment similarity function that is an error-tolerant variation of set containment. We enhance this similarity function to also account for string transformations that reflect synonyms...

10.1145/1807167.1807267 article EN 2010-06-06

Foundations of uncertain-data integration

OPENALEX - Publications

Parag Agrawal Anish Das Sarma Jeffrey D. Ullman Jennifer Widom

There has been considerable past work studying data integration and uncertain data in isolation. We develop the foundations for local-as-view (LAV) data integration when the sources being integrated are uncertain. We motivate two distinct settings for uncertain-data integration. We then define containment of uncertain databases in these settings, which allows us to express uncertain sources as views over a virtual mediated uncertain database. Next, we define consistency of a set of uncertain sources and show intractability of consistency-checking. We identify an interesting special case for which consistency-checking...

10.14778/1920841.1920976 article EN Proceedings of the VLDB Endowment 2010-09-01

NELEC at SemEval-2019 Task 3: Think Twice Before Going Deep

OPENALEX - Publications

Parag Agrawal Anshuman Suri

Existing Machine Learning techniques yield close to human performance on text-based classification tasks. However, the presence of multi-modal noise in chat data such as emoticons, slang, spelling mistakes, code-mixed data, etc. makes existing deep-learning solutions perform poorly. The inability of deep-learning systems to robustly capture these covariates puts a cap on their performance. We propose NELEC: Neural and Lexical Combiner, a system which elegantly combines textual and deep-learning based methods for sentiment...

10.18653/v1/s19-2045 article EN cc-by 2019-01-01

Confidence-Aware Join Algorithms

OPENALEX - Publications

Parag Agrawal Jennifer Widom

In uncertain and probabilistic databases, confidence values (or probabilities) are associated with each data item. Confidence values are assigned to query results based on combining confidences from the input data. Users may wish to apply a threshold on result confidence values, ask for the "top-k" results by confidence, or obtain...

10.1109/icde.2009.141 article EN Proceedings - International Conference on Data Engineering 2009-03-01

Mycobacterium chelonae causing chronic wound infection and abdominal incisional hernia

OPENALEX - Publications

Susan T. Verghese Parag Agrawal Santosh Benjamin

Mycobacterium chelonae is a rapidly growing mycobacterium that is found all over the environment, including sewage and tap water. They are important species associated with chronic non-healing wounds. We report a case in a 41-year-old female patient who underwent multiple surgeries for an ovarian cyst, tubo-ovarian abscesses, and peritonitis, with repair of abdominal incisional hernia.

10.4103/0377-4929.134736 article EN cc-by-nc-sa Indian Journal of Pathology and Microbiology 2014-01-01

Mining Strengths and Weaknesses of Cricket Players Using Short Text Commentary

OPENALEX - Publications

Swarup Ranjan Behera Parag Agrawal Amit Awekar Vijaya Saradhi Vedula

Knowledge of strengths and weaknesses of players is the key for team selection and strategy planning in any sport such as Cricket. Computationally, this problem is mostly unexplored. Existing methods focus only on aggregate macroscopic statistics that ignore many details. The central idea of our paper is to mine strength and weakness rules using short text commentary data. This dataset is compact, semi-structured, accurate, yet ignored by the machine learning community until now. We collect fine-grained information...

10.1109/icmla.2019.00122 article EN 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA) 2019-12-01

A case study for distributed query processing

OPENALEX - Publications

Parag Agrawal Dina Bitton K.-C. Guh C. Liu C. Yu

In the integrated strategy, one component decides which relation should remain fragmented at different sites. The other component decides which local operations, selections and projections, should be performed before join operations. Our experimental results reveal that the choices made by the algorithm in deciding which local operations to be performed are valid. More precisely, the response times of queries processed by the integrated strategy are lower than those of the same queries using other strategies. These results agree with the analytic cost model we have previously proposed [YGC87]. In addition,...

10.5555/62597.62612 article EN International Symposium on Databases for Parallel and Distributed Systems 1988-12-05

QnAMaker: Data to Bot in 2 Minutes

OPENALEX - Publications

Parag Agrawal Tulasi Menon Aya Kam Michel Naim Chaikesh Chouragade and 13 more

Having a bot for seamless conversations is a much-desired feature that products and services today seek for their websites and mobile apps. These bots help reduce traffic received by human support significantly by handling frequent and directly answerable known questions. Many such services have huge reference documents such as FAQ pages, which makes it hard for users to browse through this data. A conversation layer over such raw data can lower traffic to human support by a great margin. We demonstrate QnAMaker, a service that creates a conversational layer over semi-structured...

10.1145/3366424.3383525 article EN Companion Proceedings of the The Web Conference 2018 2020-04-20

A 12-YEAR MALE WITH MULTIPLE NECK SWELLINGS

OPENALEX - Publications

Parag Agrawal Abhyudaya Verma Rohit Bhangadiya

Introduction: Papillary thyroid carcinoma is the most common pediatric thyroid malignancy, representing 85–95% of cases. Pediatric thyroid cancers typically present as neck masses with no associated symptoms and thus come to medical attention at widely varying stages of disease progression. In contrast to adult PTC, pediatric PTC tends to be more aggressive at presentation, with higher incidence of multifocality, extracapsular extension, lymph node and distant metastasis. Although a lifetime recurrence rate is high, mortality rates are still...

10.4103/trp.trp_16_24 article EN Thyroid Research and Practice 2024-05-01

Promoting Inactive Members in Edge-Building Marketplace

OPENALEX - Publications

Ayan Acharya Siyuan Gao Borja Ocejo Kinjal Basu Ankan Saha and 4 more

Social networks are platforms where content creators and consumers share and consume content. The edge recommendation system, which determines who a member should connect with, significantly impacts the reach and engagement of the audience on such networks. This paper emphasizes improving the experience of inactive members (IMs) who do not have a large connection network by recommending better connections. To that end, we propose a multi-objective linear optimization framework and solve it using accelerated gradient...

10.1145/3543873.3587647 article EN 2023-04-28
