Richard Sidle

ORCID: 0009-0001-7577-8816
Research Areas
  • Advanced Database Systems and Queries
  • Advanced Data Storage Technologies
  • Data Management and Algorithms
  • Algorithms and Data Compression
  • Distributed systems and fault tolerance
  • Caching and Content Delivery
  • Data Stream Mining Techniques
  • Cloud Computing and Resource Management
  • Parallel Computing and Optimization Techniques
  • Data Mining Algorithms and Applications
  • Peer-to-Peer Network Technologies
  • Semantic Web and Ontologies
  • Data Quality and Management
  • Groundwater flow and contamination studies
  • Distributed and Parallel Computing Systems
  • Advanced Computational Techniques and Applications

IBM (Canada)
2016-2024

IBM Research - Almaden
2000-2019

IBM (United States)
2000-2019

University of Toronto
2002

DB2 with BLU Acceleration deeply integrates innovative new techniques for defining and processing column-organized tables that speed read-mostly Business Intelligence queries by 10 to 50 times and improve compression by 3 to 10 times, compared to traditional row-organized tables, without the complexity of defining indexes or materialized views on those tables. But DB2 BLU is much more than just a column store. Exploiting frequency-based dictionary compression and main-memory query processing technology from the Blink project at IBM Research - Almaden,...

10.14778/2536222.2536233 article EN Proceedings of the VLDB Endowment 2013-08-01

Query performance in current systems depends significantly on tuning: how well the query matches the available indexes, materialized views, etc. Even in a tuned system, there are always some queries that take much longer than others. This frustrates users who increasingly want consistent response times to ad hoc queries. We argue that query processors should instead aim for constant response times for all queries, with no assumption about tuning. We present Blink, our first attempt at this goal, which runs every query as a table scan over a fully...

10.1109/icde.2008.4497414 article EN 2008-04-01

We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has a 100% fill factor and uses a sparse bitmap with embedded population counts to almost entirely avoid collisions. This bitmap also serves as a Bloom filter for use in multi-table joins. We study the random access characteristics of hash joins and renew the case for non-partitioned hash joins. We introduce a variant...

10.14778/2735496.2735499 article EN Proceedings of the VLDB Endowment 2014-12-01
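
The abstract above names the main ingredients of a concise hash table: a sparse bitmap, embedded population counts, and a fully dense entry array. The following is a minimal Python sketch of that idea, not the authors' implementation. The class name `ConciseHashTable`, the `bits_per_key` parameter, the 32-bit word size, and the use of unbounded linear probing (with no overflow structure) are all assumptions made for illustration.

```python
class ConciseHashTable:
    """Illustrative sketch: sparse bitmap + per-word prefix popcounts + dense array."""

    WORD = 32  # bits per bitmap word (assumed)

    def __init__(self, items, bits_per_key=8):
        pairs = list(dict(items).items())  # assumes unique keys; last duplicate wins
        self.nbits = max(1, len(pairs)) * bits_per_key
        self.words = [0] * ((self.nbits + self.WORD - 1) // self.WORD)

        # Pass 1: claim one bit per key, probing linearly in the sparse bitmap.
        slots = []
        for key, _ in pairs:
            pos = hash(key) % self.nbits
            while self._test(pos):
                pos = (pos + 1) % self.nbits
            self.words[pos // self.WORD] |= 1 << (pos % self.WORD)
            slots.append(pos)

        # Pass 2: store, for each word, the number of set bits before it.
        self.prefix = []
        total = 0
        for word in self.words:
            self.prefix.append(total)
            total += bin(word).count("1")

        # Pass 3: place each entry at the rank of its bit, so the array has no gaps.
        self.dense = [None] * len(pairs)
        for pos, pair in zip(slots, pairs):
            self.dense[self._rank(pos)] = pair

    def _test(self, pos):
        return (self.words[pos // self.WORD] >> (pos % self.WORD)) & 1 == 1

    def _rank(self, pos):
        # Number of set bits strictly before bit position pos.
        word, offset = divmod(pos, self.WORD)
        below = self.words[word] & ((1 << offset) - 1)
        return self.prefix[word] + bin(below).count("1")

    def get(self, key, default=None):
        pos = hash(key) % self.nbits
        while self._test(pos):  # an unset bit ends the probe sequence
            stored_key, value = self.dense[self._rank(pos)]
            if stored_key == key:
                return value
            pos = (pos + 1) % self.nbits
        return default


cht = ConciseHashTable((i, i * i) for i in range(100))
print(cht.get(7), cht.get(1000))
```

In this sketch an unset bit lets a lookup reject a missing key without touching the entry array, which is the sense in which the bitmap can double as a filter; the entry array itself holds exactly one slot per key.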

Table scans have become more interesting recently due to greater use of ad-hoc queries and greater availability of multi-core, vector-enabled hardware. Table scan performance is limited by value representation, table layout, and processing techniques. In this paper we propose a new layout and processing technique for efficient one-pass predicate evaluation. Starting with a set of rows with a fixed number of bits per column, we append columns to form a set of banks and then pad each bank to a supported machine word length, typically 16, 32, or 64 bits. We evaluate...

10.14778/1453856.1453925 article EN Proceedings of the VLDB Endowment 2008-08-01
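
The layout described in the abstract above (fixed-width column codes appended into a bank and padded to a machine word) can be sketched as follows. This is only an illustration in Python: the helper names `make_layout`, `pack_row`, `equality_predicate`, and `scan` are hypothetical, and the mask-and-compare evaluation of a conjunction of equality predicates is an assumed technique chosen for the example, since the abstract is truncated before it describes the evaluation method.

```python
def make_layout(widths, word_bits=32):
    """Return the bit offset of each column inside one padded bank word."""
    if sum(widths) > word_bits:
        raise ValueError("columns do not fit in one bank word")
    shifts, shift = [], 0
    for width in widths:
        shifts.append(shift)
        shift += width
    return shifts  # bits from sum(widths) up to word_bits are padding


def pack_row(row, widths, shifts):
    """Append the fixed-width column codes of one row into a single word."""
    word = 0
    for value, width, shift in zip(row, widths, shifts):
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit its column width")
        word |= value << shift
    return word


def equality_predicate(conditions, widths, shifts):
    """Build (mask, pattern) for a conjunction of column == value tests."""
    mask = pattern = 0
    for col, value in conditions.items():
        if not 0 <= value < (1 << widths[col]):
            raise ValueError("value does not fit its column width")
        mask |= ((1 << widths[col]) - 1) << shifts[col]
        pattern |= value << shifts[col]
    return mask, pattern


def scan(bank, mask, pattern):
    """One pass over the bank: one AND and one compare per row."""
    return [i for i, word in enumerate(bank) if (word & mask) == pattern]


widths = [3, 5, 8]  # bits per column (assumed example schema)
shifts = make_layout(widths)
rows = [(1, 10, 200), (2, 10, 200), (1, 11, 7), (1, 10, 3)]
bank = [pack_row(row, widths, shifts) for row in rows]
mask, pattern = equality_predicate({0: 1, 1: 10}, widths, shifts)
print(scan(bank, mask, pattern))
```

The point of the sketch is that several column predicates collapse into a single masked comparison per word, so the scan never unpacks individual column values.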

Database administrators construct secondary indexes on data tables to accelerate query processing in relational database management systems (RDBMSs). These indexes are built on top of the most frequently queried columns according to the data statistics. Unfortunately, maintaining multiple secondary indexes in the same database can be extremely space consuming, causing significant performance degradation due to the potential exhaustion of memory space. In this paper, we demonstrate that there exist many opportunities to exploit column correlations for...

10.1145/3299869.3319861 article EN Proceedings of the 2019 International Conference on Management of Data 2019-06-18

Although the DRAM for main memories of systems continues to grow exponentially according to Moore's Law and to become less expensive, we argue that memory hierarchies will always exist for many reasons, both economic and practical, and in particular due to concurrent users competing for working memory to perform joins and grouping. We present the in-memory BLU Acceleration used in IBM's DB2 for Linux, UNIX, and Windows, and now also the dashDB cloud offering, which was designed and implemented from the ground up to exploit main memory but is not limited to what fits in memory and does...

10.1109/icde.2015.7113372 article EN 2015-04-01

We demonstrate Hybrid Transactional and Analytics Processing (HTAP) on the Spark platform by the Wildfire prototype, which can ingest up to ~6 million inserts per second per node and simultaneously perform complex SQL analytics queries. Here, a simplified mobile application uses Wildfire to recommend advertising to mobile customers based upon their distance from stores and their interest in products sold by these stores, while continuously graphing analytics results as those customers move and respond to the ads with purchases.

10.1145/2882903.2899406 article EN Proceedings of the 2016 International Conference on Management of Data 2016-06-16

Compression has historically been used to reduce the cost of storage, I/Os from that storage, and buffer pool utilization, at the expense of the CPU required to decompress data every time it is queried. However, significant additional CPU efficiencies can be achieved by deferring decompression as late in query processing as possible and performing query processing operations directly on the still-compressed data. In this paper, we investigate the benefits and challenges of performing joins on compressed (or encoded) data. We demonstrate the benefit of independently optimizing...

10.14778/2733004.2733008 article EN Proceedings of the VLDB Endowment 2014-08-01

The requirements of Internet of Things (IoT) workloads are unique in the database space. While significant effort has been spent over the last decade rearchitecting OLTP and Analytics workloads for the public cloud, little has been done to rearchitect IoT workloads for the cloud. In this paper we present IBM Db2 Event Store™, a cloud-native database system designed specifically for IoT workloads, which require extremely high-speed ingest, efficient and open data storage, and near real-time analytics. Additionally, by leveraging the Db2 SQL compiler, optimizer and runtime,...

10.14778/3415478.3415552 article EN Proceedings of the VLDB Endowment 2020-08-01

Materialized views (or Automatic Summary Tables—ASTs) are commonly used to improve the performance of aggregation queries by orders of magnitude. In contrast to regular tables, ASTs are synchronized by the database system. In this paper, we present techniques for maintaining cube ASTs. Our implementation is based on IBM DB2 UDB.

10.1145/335191.335454 article EN ACM SIGMOD Record 2000-05-16

In a classic transactional distributed database management system (DBMS), write transactions invariably synchronize with a coordinator before final commitment. While enforcing serializability, this model has long been criticized for not satisfying the applications' availability requirements. When entering the era of Internet of Things (IoT), this problem has become more severe, as an increasing number of applications call for the capability of hybrid transactional and analytical processing (HTAP), where aggregation constraints need to...

10.1109/bigdata47090.2019.9006519 article EN 2019 IEEE International Conference on Big Data (Big Data) 2019-12-01

Materialized views (or Automatic Summary Tables—ASTs) are commonly used to improve the performance of aggregation queries by orders of magnitude. In contrast to regular tables, ASTs are synchronized by the database system. In this paper, we present techniques for maintaining cube ASTs. Our implementation is based on IBM DB2 UDB.

10.1145/342009.335454 article EN 2000-05-16

Database systems built on traditional storage subsystems typically store their data in small blocks referred to as data pages (commonly sized as a multiple of 4KB for historical reasons). These storage subsystems, for example network attached block storage, were designed for efficient random-access I/O patterns at the block level, and the block size is usually configurable by the application based on its needs. For large scale analytic databases in cloud environments, these storage subsystems are not cost effective when compared to object storage. Database systems that exploit...

10.1145/3626246.3653393 article EN 2024-05-23

Database administrators construct secondary indexes on data tables to accelerate query processing in relational database management systems (RDBMSs). These indexes are built on top of the most frequently queried columns according to the data statistics. Unfortunately, maintaining multiple secondary indexes in the same database can be extremely space consuming, causing significant performance degradation due to the potential exhaustion of memory space. However, we find that there indeed exist many opportunities to save storage space by exploiting column correlations....

10.14778/3352063.3352090 article EN Proceedings of the VLDB Endowment 2019-08-01

Database administrators construct secondary indexes on data tables to accelerate query processing in relational database management systems (RDBMSs). These indexes are built on top of the most frequently queried columns according to the data statistics. Unfortunately, maintaining multiple secondary indexes in the same database can be extremely space consuming, causing significant performance degradation due to the potential exhaustion of memory space. In this paper, we demonstrate that there exist many opportunities to exploit column correlations for...

10.48550/arxiv.1903.11203 preprint EN other-oa arXiv (Cornell University) 2019-01-01

In a classic transactional distributed database management system (DBMS), write transactions invariably synchronize with a coordinator before final commitment. While enforcing serializability, this model has long been criticized for not satisfying the applications' availability requirements. When entering the era of Internet of Things (IoT), this problem has become more severe, as an increasing number of applications call for the capability of hybrid transactional and analytical processing (HTAP), where aggregation constraints need to...

10.48550/arxiv.1908.01908 preprint EN other-oa arXiv (Cornell University) 2019-01-01