- Advanced Database Systems and Queries
- Data Management and Algorithms
- Advanced Data Storage Technologies
- Algorithms and Data Compression
- Caching and Content Delivery
- Image Retrieval and Classification Techniques
- Advanced Image and Video Retrieval Techniques
- Distributed systems and fault tolerance
- Cloud Computing and Resource Management
- Data Stream Mining Techniques
- Innovation in Digital Healthcare Systems
- Software-Defined Networks and 5G
- Parallel Computing and Optimization Techniques
- Distributed and Parallel Computing Systems
- Technology and Data Analysis
- Peer-to-Peer Network Technologies
- Advanced Computational Techniques and Applications
- Industrial Vision Systems and Defect Detection
IBM (United States)
1994-2024
IBM Research - Almaden
2002-2019
DB2 with BLU Acceleration deeply integrates innovative new techniques for defining and processing column-organized tables that speed read-mostly Business Intelligence queries by 10 to 50 times and improve compression by 3 times, compared to traditional row-organized tables, without the complexity of indexes or materialized views on those tables. But DB2 BLU is much more than just a column store. Exploiting frequency-based dictionary compression and main-memory query processing technology from the Blink project at IBM Research - Almaden,...
We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has 100% fill factor, and uses a sparse bitmap with embedded population counts to almost entirely avoid collisions. This bitmap also serves as a Bloom filter for use in multi-table joins. We study the random access characteristics of hash joins, and renew the case for non-partitioned hash joins. We introduce a variant...
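The bitmap-plus-population-count idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name, the 32-bit word size, the probe limit of 2, the 8 buckets per key, the overflow dictionary, and the assumption of unique keys are all choices made here for the example.

```python
class ConciseHashTable:
    """Illustrative sketch: sparse bitmap + prefix popcounts over a dense array."""

    WORD_BITS = 32    # assumed word size for the bitmap
    PROBE_LIMIT = 2   # assumed linear-probing limit before spilling to overflow

    def __init__(self, items, buckets_per_key=8):
        items = list(items)  # (key, value) pairs; keys assumed unique
        wanted = max(self.WORD_BITS, buckets_per_key * len(items))
        n_words = -(-wanted // self.WORD_BITS)  # ceiling division
        self.n_buckets = n_words * self.WORD_BITS
        self.bitmap = [0] * n_words
        self.overflow = {}  # keys that found no free bucket within the probe limit
        placed = []
        for key, value in items:
            bucket = self._claim_bucket(key)
            if bucket is None:
                self.overflow[key] = value
            else:
                placed.append((bucket, key, value))
        # Prefix population counts: number of set bits before each word.
        self.prefix = []
        total = 0
        for word in self.bitmap:
            self.prefix.append(total)
            total += bin(word).count("1")
        # Dense array with exactly one slot per set bit (no empty slots).
        self.array = [None] * total
        for bucket, key, value in placed:
            self.array[self._rank(bucket)] = (key, value)

    def _claim_bucket(self, key):
        start = hash(key) % self.n_buckets
        for step in range(self.PROBE_LIMIT):
            bucket = (start + step) % self.n_buckets
            word, bit = divmod(bucket, self.WORD_BITS)
            if not (self.bitmap[word] >> bit) & 1:
                self.bitmap[word] |= 1 << bit
                return bucket
        return None

    def _rank(self, bucket):
        # Position in the dense array = number of set bits before this bucket.
        word, bit = divmod(bucket, self.WORD_BITS)
        below = self.bitmap[word] & ((1 << bit) - 1)
        return self.prefix[word] + bin(below).count("1")

    def get(self, key, default=None):
        start = hash(key) % self.n_buckets
        for step in range(self.PROBE_LIMIT):
            bucket = (start + step) % self.n_buckets
            word, bit = divmod(bucket, self.WORD_BITS)
            if not (self.bitmap[word] >> bit) & 1:
                return default  # unset bit: key cannot be present (filter-like rejection)
            stored_key, value = self.array[self._rank(bucket)]
            if stored_key == key:
                return value
        return self.overflow.get(key, default)
```

In this sketch the bitmap is deliberately sparse so that most keys land in their first bucket, while the array that holds the actual entries stays fully packed. A lookup that hits an unset bit can stop immediately, which is the sense in which the bitmap doubles as a filter.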
Database administrators construct secondary indexes on data tables to accelerate query processing in relational database management systems (RDBMSs). These indexes are built on top of the most frequently queried columns according to the data statistics. Unfortunately, maintaining multiple secondary indexes in the same database can be extremely space consuming, causing significant performance degradation due to the potential exhaustion of memory space. In this paper, we demonstrate that there exist many opportunities to exploit column correlations for...
Although the DRAM for main memories of systems continues to grow exponentially according to Moore's Law and to become less expensive, we argue that memory hierarchies will always exist for many reasons, both economic and practical, and in particular due to concurrent users competing for working memory to perform joins and grouping. We present the in-memory BLU Acceleration used in IBM's DB2 for Linux, UNIX, and Windows, and now also the dashDB cloud offering, which was designed and implemented from the ground up to exploit main memory but is not limited to what fits in memory and does...
We demonstrate Hybrid Transactional and Analytics Processing (HTAP) on the Spark platform by the Wildfire prototype, which can ingest up to ~6 million inserts per second per node and simultaneously perform complex SQL analytics queries. Here, a simplified mobile application uses Wildfire to recommend advertising to mobile customers based upon their distance from stores and their interest in products sold by these stores, while continuously graphing analytics results as those customers move and respond to the ads with purchases.
Compression has historically been used to reduce the cost of storage, I/Os from that storage, and buffer pool utilization, at the expense of the CPU required to decompress data every time it is queried. However, significant additional efficiencies can be achieved by deferring decompression as late in query processing as possible and performing query processing operations directly on the still-compressed data. In this paper, we investigate the benefits and challenges of performing joins on compressed (or encoded) data. We demonstrate the benefit of independently optimizing...
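To make the idea of joining on still-encoded data concrete, here is a small sketch under stated assumptions. It is not the method of the paper: it assumes simple per-column dictionary encoding, and the helper names `encode`, `translate`, and `join_on_codes` are hypothetical. The point it illustrates is that when two columns use different dictionaries, the translation between them is computed once per distinct value, and the join itself compares integer codes without decoding any row.

```python
def encode(values):
    """Dictionary-encode a column: return (sorted dictionary, integer codes)."""
    dictionary = sorted(set(values))
    code_of = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [code_of[v] for v in values]


def translate(src_dict, dst_dict):
    """For each code of src_dict, give the matching code in dst_dict (None if absent)."""
    dst_code = {v: i for i, v in enumerate(dst_dict)}
    return [dst_code.get(v) for v in src_dict]


def join_on_codes(left_codes, right_codes, right_to_left):
    """Equi-join two encoded columns, returning (left_row, right_row) pairs."""
    # Build side: index left rows by their integer code.
    index = {}
    for row, code in enumerate(left_codes):
        index.setdefault(code, []).append(row)
    # Probe side: translate each right code, then probe with the integer code.
    matches = []
    for r_row, r_code in enumerate(right_codes):
        l_code = right_to_left[r_code]
        if l_code is None:
            continue  # value does not occur on the left side at all
        for l_row in index.get(l_code, ()):
            matches.append((l_row, r_row))
    return matches


left = ["b", "a", "c", "a"]
right = ["a", "d", "c"]
left_dict, left_codes = encode(left)
right_dict, right_codes = encode(right)
pairs = join_on_codes(left_codes, right_codes, translate(right_dict, left_dict))
```

The translation table has one entry per distinct value rather than one per row, so its cost is small when columns have few distinct values relative to their length.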
On-line collections of images are growing larger and more common, and tools are needed to efficiently manage, organize, and navigate through them. The authors have developed a prototype system called QBIC which allows complex multi-object and multi-feature queries of large image databases. The queries are based on image content: the colors, textures, shapes, and positions of the images and the objects/regions they contain. QBIC computes numeric features to represent the properties of the images and uses similarity measures based on these features for retrieval. The focus of the paper is its user interface...
We describe how the QBIC (Query By Image Content) system handles "multi-*" queries, that is, queries on large image collections involving multifeatures of each image as a whole and of multiple objects within each image. The queries are based on properties of image content, such as colors, textures, shapes, and edges. QBIC computes a set of features to represent the above properties, uses distance-like measures to provide similarity retrieval, and has a graphical interface that enables users to pose queries visually. In this paper, we present indexing algorithms that allow these queries to run...
The requirements of Internet of Things (IoT) workloads are unique in the database space. While significant effort has been spent over the last decade rearchitecting OLTP and Analytics workloads for the public cloud, little has been done to rearchitect IoT workloads for the cloud. In this paper we present IBM Db2 Event Store™, a cloud-native database system designed specifically for IoT workloads, which require extremely high-speed ingest, efficient and open data storage, and near real-time analytics. Additionally, by leveraging the Db2 SQL compiler, optimizer and runtime,...
The QBIC (query by image content) project in the IBM Almaden Research Center in San Jose, CA, is conducting a theoretical, experimental, and prototyping study of the problem of querying large still image databases efficiently based on image content. Since the problem is difficult, the aim is to discover general principles, but at the same time to identify target application(s) for which concrete pilot systems will be prototyped. A number of algorithms have been developed that allow a user to search by color, texture, and shape. The search can be focused either on objects...
IBM Almaden Research Center's project on Query By Image Content (QBIC) is studying means to retrieve images from large image databases using image contents such as color, texture, shape and layout. In this paper, we describe the beta version of the PC-based Ultimedia Manager product, which is based on QBIC technology. We outline the product philosophy and give a demonstration of the current version. The product is expected to be announced soon, together with an OEM offering of the search and query engine.
In a classic transactional distributed database management system (DBMS), write transactions invariably synchronize with a coordinator before final commitment. While enforcing serializability, this model has long been criticized for not satisfying the applications' availability requirements. When entering the era of the Internet of Things (IoT), this problem has become more severe, as an increasing number of applications call for the capability of hybrid transactional and analytical processing (HTAP), where aggregation constraints need to...
Database systems built on traditional storage subsystems typically store their data in small blocks referred to as pages (commonly sized as a multiple of 4KB for historical reasons). These storage subsystems, for example network attached block storage, were designed for efficient random-access I/O patterns at the block level, and the block size is usually configurable by the application based on its needs. For large scale analytic databases in cloud environments, these storage subsystems are not cost effective when compared to object storage, and database systems that exploit...
Database administrators construct secondary indexes on data tables to accelerate query processing in relational database management systems (RDBMSs). These indexes are built on top of the most frequently queried columns according to the data statistics. Unfortunately, maintaining multiple secondary indexes in the same database can be extremely space consuming, causing significant performance degradation due to the potential exhaustion of memory space. However, we find that there indeed exist many opportunities to save storage space by exploiting column correlations....
Quality aspects are more important every day, since they have a major impact on the final product. The automobile industry is not unaware of this fact, and the car’s sheet quality analysis is one such issue. Nowadays, most of the systems for sheet quality analysis are implemented outside the production line and performed manually. In this work, an autonomous system is proposed in order to enable the automatic quantification and classification of imperfections produced on the sheets composing the auto bodywork due to the squeezing process. The system consists of a motorized capture...
In data centers today, servers are stationary and data flows on a hierarchical network of switches and routers. But such static server arrangements require very scalable networks, and many applications are bottlenecked by network bandwidth. In addition, server density is kept low to enable maintenance and upgrades, as well as to increase air flow. In this paper, we propose a data center design in which servers move physically, and communicate via point-to-point connections (instead of switches). We argue that this allows data transfer bandwidth to scale linearly with the number...