- Cloud Computing and Resource Management
- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Distributed and Parallel Computing Systems
- Interconnection Networks and Systems
- Caching and Content Delivery
- Genomics and Phylogenetic Studies
- Advanced Database Systems and Queries
- Advanced Memory and Neural Computing
- Algorithms and Data Compression
- Semiconductor Lasers and Optical Devices
- IoT and Edge/Fog Computing
- Evolutionary Algorithms and Applications
- Embedded Systems Design Techniques
IIT@MIT
2023
Moscow Institute of Thermal Technology
2020
University of California, Los Angeles
2018-2019
UCLA Health
2019
Microsoft Research (India)
2017
Microsoft Research (United Kingdom)
2017
Performance of in-memory key-value store (KVS) continues to be of great importance as modern KVS goes beyond the traditional object-caching workload and becomes a key infrastructure to support distributed main-memory computation in data centers. Recent years have witnessed a rapid increase of network bandwidth in data centers, shifting the bottleneck of most KVS from the network to the CPU. RDMA-capable NIC partly alleviates the problem, but the primitives provided by the RDMA abstraction are rather limited. Meanwhile, programmable NICs become...
In genome sequencing, it is a crucial but time-consuming task to detect potential overlaps between any pair of the input reads, especially those that are ultra-long. The state-of-the-art overlapping tool Minimap2 outperforms other popular tools in speed and accuracy. It has a single computing hot-spot, chaining, which takes 70% of the time and needs to be accelerated. There are several issues for hardware acceleration because of the nature of chaining. First, the original computation pattern is poorly parallelizable and a direct...
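To make the chaining step more concrete, here is a minimal sketch of anchor chaining as a dynamic program. It is an illustration only: the scoring (a fixed `seed_len` gain minus a linear gap penalty) and the `max_lookback` limit are simplifying assumptions, not Minimap2's actual scoring function or parameters, and `chain_score` is a hypothetical name. The inner loop reads scores of earlier anchors, which is the kind of dependency that makes the pattern hard to parallelize directly.

```python
def chain_score(anchors, seed_len=15, max_lookback=50):
    """Return the best chain score over (x, y) anchors.

    Simplified scoring (an assumption, not Minimap2's real function):
    each anchor contributes up to `seed_len`, minus the diagonal gap
    |dx - dy| between consecutive anchors in the chain.
    """
    anchors = sorted(anchors)
    best = []  # best[i] = best score of a chain ending at anchor i
    for i, (xi, yi) in enumerate(anchors):
        score = seed_len  # a chain consisting of this anchor alone
        # Look back at a bounded number of predecessors.
        for j in range(max(0, i - max_lookback), i):
            xj, yj = anchors[j]
            dx, dy = xi - xj, yi - yj
            if dx <= 0 or dy <= 0:
                continue  # predecessor must precede on both sequences
            gain = min(dx, dy, seed_len)
            gap = abs(dx - dy)
            # Depends on best[j]: a loop-carried dependency.
            score = max(score, best[j] + gain - gap)
        best.append(score)
    return max(best, default=0)


colinear = [(0, 0), (20, 20), (40, 40)]
print(chain_score(colinear))
print(chain_score(colinear + [(20, 100)]))
```

Under these assumptions, three colinear anchors chain together, while the off-diagonal anchor `(20, 100)` is left out because its gap penalty outweighs its gain.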
In recent years we have witnessed the emergence of FPGAs in many high-performance systems. This is due to the FPGA's high reconfigurability and improved, user-friendly programming environment. OpenCL, supported by major vendors, is a high-level platform that liberates hardware developers from having to deal with complex and error-prone HDL development. While OpenCL exposes a GPU-like model, which is well-suited for compute-intensive tasks, in state-of-the-art systems that deploy FPGAs we observe that the workloads are streaming-like,...
In conventional Hadoop MapReduce applications, I/O used to play a heavy role in the overall system performance. More recently, a study from the Apache Spark community (Spark being the state-of-the-art in-memory cluster computing framework) reports that I/O is no longer the bottleneck and has a marginal performance impact on applications like SQL processing. However, we observe that simply replacing HDDs with SSDs can yield over 10x improvement for certain stages of large-scale, production-quality genome processing. Therefore, one key question...
Limited by the small on-chip memory, hardware-based transport typically implements the go-back-N loss recovery mechanism, which costs very little memory but is well known to perform poorly even under a low packet loss ratio. We present MELO, an efficient selective retransmission mechanism for hardware-based transport, which consumes only a constant amount of memory regardless of the number of concurrent connections. Specifically, MELO employs an architectural separation between data and metadata storage and uses a shared bits pool allocation to reduce the memory footprint. By...
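The cost gap between go-back-N and selective retransmission can be illustrated with a small counting model. This is a sketch under stated assumptions, not MELO's design: packets are sent in fixed windows, a packet can be lost only on its first transmission, and the function names `go_back_n_sends` and `selective_sends` are hypothetical.

```python
def go_back_n_sends(n, window, lost_first_try):
    """Count transmissions needed to deliver packets 0..n-1 with go-back-N.

    Assumed model: the receiver accepts only in-order packets, so every
    packet sent after a loss in the same window is discarded and resent.
    """
    lost = set(lost_first_try)
    base = 0   # next sequence number the receiver expects
    sends = 0
    while base < n:
        next_base = base
        in_order = True
        for seq in range(base, min(base + window, n)):
            sends += 1
            dropped = seq in lost
            if dropped:
                lost.discard(seq)  # the retransmission will succeed
            if in_order and not dropped:
                next_base = seq + 1
            else:
                in_order = False   # later packets in this window are wasted
        base = next_base
    return sends


def selective_sends(n, lost_first_try):
    """Count transmissions when only the lost packets are resent."""
    return n + len({s for s in lost_first_try if 0 <= s < n})


print(go_back_n_sends(10, 4, [2]), selective_sends(10, [2]))
```

In this model, selective retransmission resends exactly the lost packets, while go-back-N also resends everything that followed a loss within the window. The trade-off is that the receiver must remember which out-of-order packets have arrived, which is the per-connection state the abstract says is constrained by on-chip memory.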
Today's clouds are inefficient: their utilization of resources like CPUs, GPUs, memory, and storage is low. This inefficiency occurs because applications consume resources at variable rates and ratios, while clouds offer resources at fixed rates and ratios. This mismatch between offering and consumption styles prevents fully realizing the utility computing vision.
Storage drive technology has made continuous improvements over the last decade, shifting the bottleneck of the data processing system from the storage drive to the host/drive interconnection. To overcome this “data movement wall,” people have proposed in-storage computing (ISC) architectures, which add a computing unit directly into the storage drive. Rather than moving data from drive to host, ISC offloads computation from host to drive, thereby alleviating the interconnection bottleneck. Though existing work shows the effectiveness of ISC under some specific workloads, they...