NFDI4DS | UHH-SEMS - Publication Details

Zuhair Khayyat

ORCID: 0000-0003-3650-6997

About

Contact & Profiles

A5078436323

Research Areas

Graph Theory and Algorithms
Cloud Computing and Resource Management
Advanced Database Systems and Queries
Data Management and Algorithms
Data Mining Algorithms and Applications
Data Quality and Management
Advanced Image and Video Retrieval Techniques
Distributed and Parallel Computing Systems
Advanced Graph Theory Research
Complex Network Analysis Techniques
Big Data and Business Intelligence
Big Data Technologies and Applications
Sentiment Analysis and Opinion Mining
Topic Modeling
Data Visualization and Analytics
Algorithms and Data Compression
Graph Labeling and Dimension Problems
Spam and Phishing Detection
Privacy-Preserving Technologies in Data
Scientific Computing and Data Management
Advanced Graph Neural Networks
Advanced Text Analysis Techniques
Caching and Content Delivery
Cloud Data Security Solutions
Semantic Web and Ontologies

King Abdullah University of Science and Technology
2013-2019

Mizan

OPENALEX - Publications

Zuhair Khayyat Karim Awara Amani AlOnazi Hani Jamjoom Dan Williams and 1 more

Pregel [23] was recently introduced as a scalable graph mining system that can provide significant performance improvements over traditional MapReduce implementations. Existing implementations focus primarily on graph partitioning as a preprocessing step to balance computation across compute nodes. In this paper, we examine the runtime characteristics of a Pregel system. We show that graph partitioning alone is insufficient for minimizing end-to-end computation. Especially where the data is very large or the runtime behavior of the algorithm is unknown, an...

10.1145/2465351.2465369 article EN 2013-04-15

BigDansing

Zuhair Khayyat Ihab F. Ilyas Alekh Jindal Samuel Madden Mourad Ouzzani and 4 more

Data cleansing approaches have usually focused on detecting and fixing errors with little attention to scaling to big datasets. This presents a serious impediment since data cleansing often involves costly computations such as enumerating pairs of tuples, handling inequality joins, and dealing with user-defined functions. In this paper, we present BigDansing, a Big Data Cleansing system to tackle efficiency, scalability, and ease-of-use issues in data cleansing. The system can run on top of most common general purpose data processing platforms,...

10.1145/2723372.2747646 article EN 2015-05-27

A survey and experimental comparison of distributed SPARQL engines for very large RDF data

Ibrahim Abdelaziz Razen Harbi Zuhair Khayyat Panos Kalnis

Distributed SPARQL engines promise to support very large RDF datasets by utilizing shared-nothing computer clusters. Some are based on distributed frameworks such as MapReduce; others implement proprietary distributed processing; and some rely on expensive preprocessing for data partitioning. These systems exhibit a variety of trade-offs that are not well-understood, due to the lack of any comprehensive quantitative and qualitative evaluation. In this paper, we present a survey of 22 state-of-the-art systems that cover the entire spectrum...

10.14778/3151106.3151109 article EN Proceedings of the VLDB Endowment 2017-09-01

ScaleMine: Scalable Parallel Frequent Subgraph Mining in a Single Large Graph

Ehab Abdelhamid Ibrahim Abdelaziz Panos Kalnis Zuhair Khayyat Fuad Jamour

Frequent Subgraph Mining is an essential operation for graph analytics and knowledge extraction. Due to its high computational cost, parallel solutions are necessary. Existing approaches either suffer from load imbalance, or high communication and synchronization overheads. In this paper we propose ScaleMine, a novel parallel frequent subgraph mining system for a single large graph. ScaleMine introduces a novel two-phase approach. The first phase is approximate; it quickly identifies subgraphs that are frequent with high probability, while...

10.1109/sc.2016.60 article EN 2016-11-01

Scalemine: scalable parallel frequent subgraph mining in a single large graph

Ehab Abdelhamid Ibrahim Abdelaziz Panos Kalnis Zuhair Khayyat Fuad Jamour

10.5555/3014904.3014986 article EN IEEE International Conference on High Performance Computing, Data, and Analytics 2016-11-13

Rheem

Divy Agrawal Lamine Ba Laure Berti‐Équille Sanjay Chawla Ahmed K. Elmagarmid and 10 more

Many emerging applications, from domains such as healthcare and oil & gas, require several data processing systems for complex analytics. This demo paper showcases Rheem, a framework that provides multi-platform task execution for such applications. It features a three-layer data processing abstraction and a new query optimization approach for multi-platform settings. We will demonstrate the strengths of Rheem by using real-world scenarios from three different applications, namely, machine learning, data cleaning, and data fusion.

10.1145/2882903.2899414 article EN Proceedings of the 2016 International Conference on Management of Data 2016-06-16

Lightning fast and space efficient inequality joins

Zuhair Khayyat William Lucia Meghna Singh Mourad Ouzzani Paolo Papotti and 3 more

Inequality joins, which join relational tables on inequality conditions, are used in various applications. While there have been a wide range of optimization methods for joins in database systems, from algorithms such as sort-merge join and band join, to various indices such as B+-tree, R*-tree and Bitmap, inequality joins have received little attention and queries containing such joins are usually very slow. In this paper, we introduce fast inequality join algorithms. We put columns to be joined in sorted arrays and we use permutation arrays to encode positions of tuples in one sorted array w.r.t. the...

10.14778/2831360.2831362 article EN Proceedings of the VLDB Endowment 2015-09-01

Fast and scalable inequality joins

Zuhair Khayyat William Lucia Meghna Singh Mourad Ouzzani Paolo Papotti and 3 more

10.1007/s00778-016-0441-6 article EN The VLDB Journal 2016-09-07

ASAD: A Twitter-based Benchmark Arabic Sentiment Analysis Dataset

Basma Alharbi Hind Alamro Manal Alshehri Zuhair Khayyat Manal Kalkatawi and 2 more

This paper provides a detailed description of a new Twitter-based benchmark dataset for Arabic Sentiment Analysis (ASAD), which is launched in a competition sponsored by KAUST, awarding 10000 USD, 5000 USD and 2000 USD to the first, second and third place winners, respectively. Compared to other publicly released Arabic datasets, ASAD is a large, high-quality annotated dataset (including 95K tweets), with three-class sentiment labels (positive, negative and neutral). We present the details of the data collection process and annotation...

10.48550/arxiv.2011.00578 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Overview of the Arabic Sentiment Analysis 2021 Competition at KAUST

Hind Alamro Manal Alshehri Basma Alharbi Zuhair Khayyat Manal Kalkatawi and 2 more

This paper provides an overview of the Arabic Sentiment Analysis Challenge organized by King Abdullah University of Science and Technology (KAUST). The task in this challenge is to develop machine learning models to classify a given tweet into one of the three categories Positive, Negative, or Neutral. From our recently released ASAD dataset, we provide the competitors with 55K tweets for training, and 20K tweets for validation, based on which the performance of participating teams is ranked on a leaderboard,...

10.48550/arxiv.2109.14456 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Errata for "Lightning Fast and Space Efficient Inequality Joins" (PVLDB 8(13): 2074--2085)

Zuhair Khayyat William Lucia Meghna Singh Mourad Ouzzani Paolo Papotti and 3 more

This is in response to recent feedback from some readers, which requires some clarifications regarding our IEJoin algorithm published in [1]. The feedback revolves around four points: (1) a typo in our illustrating example of the join process; (2) a naming error for the index used by our algorithm to improve the bit array scan; (3) the sort order used in our algorithms; and (4) a missing explanation on how duplicates are handled by our self join algorithm.

10.14778/3099622.3099629 article EN Proceedings of the VLDB Endowment 2017-05-01