Guy E. Blelloch

ORCID: 0000-0003-0224-9187
Research Areas
  • Parallel Computing and Optimization Techniques
  • Distributed systems and fault tolerance
  • Algorithms and Data Compression
  • Distributed and Parallel Computing Systems
  • Complexity and Algorithms in Graphs
  • Advanced Data Storage Technologies
  • Graph Theory and Algorithms
  • Optimization and Search Problems
  • Computational Geometry and Mesh Generation
  • Interconnection Networks and Systems
  • Data Management and Algorithms
  • Logic, programming, and type systems
  • Advanced Graph Theory Research
  • Genomics and Phylogenetic Studies
  • Cloud Computing and Resource Management
  • Advanced Database Systems and Queries
  • Computability, Logic, AI Algorithms
  • Real-Time Systems Scheduling
  • Formal Methods in Verification
  • Embedded Systems Design Techniques
  • Scheduling and Optimization Algorithms
  • Caching and Content Delivery
  • Security and Verification in Computing
  • Genetic diversity and population structure
  • Advanced Graph Neural Networks

Carnegie Mellon University
2016-2025

Google (United States)
2025

University of Crete
2020

University of California, Riverside
2020

York University
2020

Kitware (United States)
2020

Massachusetts Institute of Technology
1986-2019

Technion – Israel Institute of Technology
2019

Laboratoire d'Informatique de Paris-Nord
1996-2010

Toyota Technological Institute at Chicago
2006

Current systems for graph computation require a distributed computing cluster to handle very large real-world problems, such as analysis on social networks or the web graph. While distributed computational resources have become more accessible, developing distributed graph algorithms still remains challenging, especially for non-experts. In this work, we present GraphChi, a disk-based system for computing efficiently on graphs with billions of edges. By using a well-known method to break large graphs into small parts, and a novel parallel sliding windows method, ...

10.5555/2387880.2387884 article EN Operating Systems Design and Implementation 2012-10-08

There has been significant recent interest in parallel frameworks for processing graphs due to their applicability in studying social networks, the Web graph, networks in biology, and unstructured meshes in scientific simulation. Due to the desire to process large graphs, these systems have emphasized the ability to run on distributed memory machines. Today, however, a single multicore server can support more than a terabyte of memory, which can fit graphs with tens or even hundreds of billions of edges. Furthermore, for graph algorithms, ...

10.1145/2442516.2442530 article EN 2013-02-23

A study of the effects of adding two scan primitives as unit-time primitives to PRAM (parallel random access machine) models is presented. It is shown that the scan primitives improve the asymptotic running time of many algorithms by an O(log n) factor, greatly simplify the description of many algorithms, and are significantly easier to implement than memory references. It is argued that the algorithm designer should feel free to use these operations as if they were as cheap as a memory reference. The author describes five algorithms that clearly illustrate how the scan primitives can be used in algorithm design: a radix-sort ...

10.1109/12.42122 article EN IEEE Transactions on Computers 1989-01-01

Programming parallel algorithms. Guy E. Blelloch (Carnegie Mellon University, Pittsburgh, PA). Communications of the ACM, Volume 39, Issue 3, March 1996, pp. 85–97.

10.1145/227234.227246 article EN Communications of the ACM 1996-03-01

Abstract: Experienced algorithm designers rely heavily on a set of building blocks and on the tools needed to put the blocks together into an algorithm. The understanding of these basic blocks is therefore critical to the understanding of algorithms. Many of the blocks for parallel algorithms extend from sequential algorithms, such as dynamic-programming and divide-and-conquer, but others are new. This paper introduces one of the simplest and most useful building blocks for parallel algorithms: the all-prefix-sums operation. The paper defines the operation, shows how to implement it on a P-RAM, and illustrates many ...

10.1184/r1/6608579.v1 article EN 2004-05-01
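The semantics of the all-prefix-sums (scan) operation described above can be illustrated with a short sequential sketch. This is only an illustration of what the operation computes, not the parallel P-RAM implementation the paper is about:

```python
def exclusive_scan(values, op=lambda a, b: a + b, identity=0):
    """Exclusive all-prefix-sums: out[i] combines values[0..i-1] under op.

    The first output is the identity element; the total is not included.
    """
    out = []
    acc = identity
    for v in values:
        out.append(acc)   # result excludes the current element
        acc = op(acc, v)
    return out

# Example: exclusive_scan([3, 1, 7, 0, 4]) -> [0, 3, 4, 11, 11]
```

In the parallel setting the same result is computed in O(log n) depth with an up-sweep/down-sweep over a balanced tree, which is what makes scan a unit-time-worthy primitive.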

A comparison of sorting algorithms for the connection machine CM-2. Guy E. Blelloch (Carnegie Mellon University, Pittsburgh, PA), Charles Leiserson (MIT, Cambridge, MA), Bruce M. Maggs (NEC Research Institute, Princeton, NJ), C. Greg Plaxton (University of Texas, Austin, TX), Stephen J. Smith (Thinking Machines Corp.), Marco Zagha. SPAA '91: Proceedings of the third annual ACM Symposium on Parallel Algorithms and Architectures, June 1991.

10.1145/113379.113380 article EN 1991-01-01

This paper studies the data locality of the work-stealing scheduling algorithm on hardware-controlled shared-memory machines. We present lower and upper bounds on the number of cache misses when using work stealing, and introduce a locality-guided work-stealing algorithm along with experimental validation.

10.1145/341800.341801 article EN 2000-07-09

We describe a parallel, real-time garbage collector and present experimental results that demonstrate good scalability and good bounds. The collector is designed for shared-memory multiprocessors and is based on an earlier algorithm [2], which provided fixed bounds on the time any thread must pause for collection. However, since our earlier algorithm was designed for simple analysis, it had some impractical features. This paper presents the extensions necessary for a practical implementation: reducing excessive interleaving, handling stacks and global variables, ...

10.1145/378795.378823 article EN Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation 2001-05-01

This announcement describes the problem based benchmark suite (PBBS). PBBS is a set of benchmarks designed for comparing parallel algorithmic approaches, programming language styles, and machine architectures across a broad set of problems. Each benchmark is defined concretely in terms of a problem specification and input distributions. No requirements are made on the algorithmic approach, programming language, or machine architecture. The goal is not only to compare runtimes, but also to be able to compare code and other aspects of an implementation (e.g., portability, robustness, ...

10.1145/2312005.2312018 article EN 2012-06-25

10.1145/2517327.2442530 article EN ACM SIGPLAN Notices 2013-02-23

In chip multiprocessors (CMPs), limiting the number of off-chip cache misses is crucial for good performance. Many multithreaded programs provide opportunities for constructive cache sharing, in which concurrently scheduled threads share a largely overlapping working set. In this paper, we compare the performance of two state-of-the-art schedulers proposed for fine-grained multithreaded programs: Parallel Depth First (PDF), which is specifically designed for constructive sharing, and Work Stealing (WS), a more traditional design. Our experimental results indicate ...

10.1145/1248377.1248396 article EN 2007-06-09

We study compression techniques for parallel in-memory graph algorithms, and show that we can achieve reduced space usage while obtaining competitive or improved performance compared to running the algorithms on uncompressed graphs. We integrate the compression techniques into Ligra, a recent shared-memory graph processing system. This new system, which we call Ligra+, is able to represent graphs using about half of the space of Ligra on average. Furthermore, Ligra+ is slightly faster than Ligra on average on a 40-core machine with hyper-threading. Our experimental ...

10.1109/dcc.2015.8 article EN Data Compression Conference 2015-04-01
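Graph compression of this kind typically difference-encodes each sorted adjacency list and stores the gaps with variable-length byte codes, so small gaps cost one byte. The sketch below illustrates the idea; it is a simplified stand-in, not Ligra+'s actual encoding (which, for instance, encodes the first neighbor relative to the source vertex):

```python
def encode_varint(n):
    """Byte code: 7 data bits per byte; high bit set means 'more bytes follow'."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)   # continuation byte
        else:
            out.append(b)          # final byte
            return bytes(out)

def compress_neighbors(neighbors):
    """Delta-encode a sorted adjacency list, then varint-encode the gaps."""
    out = bytearray()
    prev = 0
    for v in sorted(neighbors):
        out += encode_varint(v - prev)
        prev = v
    return bytes(out)

def decompress_neighbors(data):
    """Invert compress_neighbors: rebuild gaps, then prefix-sum them back."""
    out, prev, acc, shift = [], 0, 0, 0
    for b in data:
        acc |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            prev += acc
            out.append(prev)
            acc, shift = 0, 0
    return out
```

Because real-world graphs often exhibit locality (neighbor IDs close to each other), most gaps fit in one or two bytes, which is the source of the roughly 2x space savings reported above.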

The virtues of deterministic parallelism have been argued for decades and many forms have been described and analyzed. Here we are concerned with one of the strongest forms, requiring that for any input there is a unique dependence graph representing a trace of the computation, annotated with every operation and its value. This has been referred to as internal determinism, and it implies a sequential semantics---i.e., considering any sequential traversal of the dependence graph is sufficient for analyzing the correctness of the code. In addition to returning deterministic results, internal determinism has many advantages including ease ...

10.1145/2145816.2145840 article EN 2012-02-25

In this paper we explore a simple and general approach for developing parallel algorithms that lead to good cache complexity on machines with private or shared caches. The approach is to design nested-parallel algorithms that have low depth (span, critical path length) and for which the natural sequential evaluation order has low cache complexity in the cache-oblivious model. We describe several algorithms with optimal work, polylogarithmic depth, and cache complexities that match the best sequential algorithms, including the first such algorithms for sorting and for sparse-matrix vector multiply on matrices with good vertex ...

10.1145/1810479.1810519 article EN 2010-06-13

The greedy sequential algorithm for maximal independent set (MIS) loops over the vertices in an arbitrary order, adding a vertex to the resulting set if and only if no previous neighboring vertex has been added. In this loop, as in many sequential loops, each iterate will only depend on a subset of the previous iterates (i.e., knowing that any one of a vertex's earlier neighbors is in the MIS, or knowing that it has no earlier neighbors, is sufficient to decide its fate one way or the other). This leads to a dependence structure among the iterates. If this structure is shallow then running the iterates in parallel while respecting the dependencies can ...

10.1145/2312005.2312058 preprint EN 2012-06-25
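The greedy sequential loop being analyzed is only a few lines; here is a minimal sketch of it (sequential, with hypothetical helper names), which the paper shows can be parallelized while producing the identical result:

```python
def greedy_mis(n, edges):
    """Greedy maximal independent set over vertices 0..n-1 in index order.

    A vertex joins the MIS iff none of its earlier (already-processed)
    neighbors joined it -- the exact loop the paper parallelizes.
    """
    neighbors = [set() for _ in range(n)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    in_mis = [False] * n
    for v in range(n):
        if not any(in_mis[u] for u in neighbors[v] if u < v):
            in_mis[v] = True
    return [v for v in range(n) if in_mis[v]]

# Path graph 0-1-2-3: vertex 0 joins, 1 is blocked, 2 joins, 3 is blocked.
```

The paper's observation is that vertex v depends only on its earlier neighbors, so if that dependence structure is shallow, whole prefixes of the loop can be resolved in parallel rounds without changing the output.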

Existing graph-processing frameworks let users develop efficient implementations for many graph problems, but none of them support efficiently bucketing vertices, which is needed for bucketing-based algorithms such as Δ-stepping and approximate set-cover. Motivated by the lack of simple, scalable, and efficient bucketing in existing frameworks, we develop Julienne, a framework that extends a recent shared-memory graph processing framework called Ligra with an interface for maintaining a collection of buckets under vertex insertions and bucket deletions.

10.1145/3087556.3087580 article EN 2017-07-20
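A bucketing interface of this kind maps each vertex to a bucket and repeatedly extracts the lowest non-empty bucket. The sketch below is a simplified sequential illustration of the concept; the class and method names are ours, not Julienne's actual API, and the real structure is parallel and lazy:

```python
class Buckets:
    """Map vertices to bucket ids; extract non-empty buckets in increasing order."""

    def __init__(self, bucket_of):
        self.bucket_of = dict(bucket_of)   # vertex -> bucket id

    def update_bucket(self, v, b):
        """Move vertex v to bucket b (e.g., after relaxing an edge)."""
        self.bucket_of[v] = b

    def next_bucket(self):
        """Remove and return (bucket id, vertices) of the lowest non-empty
        bucket, or None when all buckets are empty."""
        if not self.bucket_of:
            return None
        b = min(self.bucket_of.values())
        vs = sorted(v for v, bid in self.bucket_of.items() if bid == b)
        for v in vs:
            del self.bucket_of[v]
        return b, vs
```

In Δ-stepping, for example, bucket i would hold vertices whose tentative distance falls in [iΔ, (i+1)Δ); relaxing an edge that improves a distance calls `update_bucket`, and the main loop processes whatever `next_bucket` returns until none remain.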

There has been a growing interest in the graph-streaming setting, where a continuous stream of graph updates is mixed with graph queries. In principle, purely-functional trees are an ideal fit for this setting, as they enable safe parallelism, lightweight snapshots, and strict serializability. However, directly using them for graph processing leads to significant space overhead and poor cache locality.

10.1145/3314221.3314598 preprint EN 2019-06-07

10.1016/0743-7315(90)90087-6 article EN Journal of Parallel and Distributed Computing 1990-02-01

In this paper we prove time and space bounds for the implementation of the programming language NESL on various parallel machine models. NESL is a sugared typed λ-calculus with a set of array primitives and an explicit parallel map over arrays. Our results extend previous work on provable implementation bounds for functional languages by considering space. For modeling cost we augment a standard call-by-value operational semantics to return two measures: a DAG representing the sequential dependences in the computation, and a measure of the space taken by a sequential implementation. We show ...

10.1145/232627.232650 article EN 1996-06-15

We present techniques for incremental computing by introducing adaptive functional programming. As an adaptive program executes, the underlying system represents the data and control dependences in the execution in the form of a dynamic dependence graph. When the input to the program changes, a change propagation algorithm updates the output by propagating the changes through the graph and re-executing code where necessary. Adaptive programs adapt their output to any change in the input, small or large. We show that adaptivity techniques are practical by giving an efficient implementation as an ML ...

10.1145/1186632.1186634 article EN ACM Transactions on Programming Languages and Systems 2006-11-01
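A toy version of the dynamic-dependence-graph idea can be sketched in a few lines: input cells record which computations read them, and writing a cell propagates the change by re-running those readers. This is far simpler than the paper's ML library (no memoization, no fine-grained subgraph reuse), and the names are ours:

```python
class Cell:
    """A modifiable input; readers are recorded so changes can propagate."""

    def __init__(self, value):
        self.value = value
        self.dependents = []   # computations that have read this cell

    def read(self, computation):
        if computation not in self.dependents:
            self.dependents.append(computation)
        return self.value

    def write(self, value):
        """Change propagation: update the value, then re-run each reader."""
        if value != self.value:
            self.value = value
            for comp in list(self.dependents):
                comp.run()

class Computation:
    """A re-runnable computation; its reads are tracked on the cells it touches."""

    def __init__(self, fn):
        self.fn = fn
        self.result = None
        self.run()

    def run(self):
        self.result = self.fn(self)

# Usage: the sum adapts automatically when an input cell changes.
a, b = Cell(1), Cell(2)
total = Computation(lambda c: a.read(c) + b.read(c))
# total.result == 3
a.write(10)
# total.result == 12
```

The real system replaces this eager re-execution with a dynamic dependence graph ordered by execution time, so only the affected portion of the computation is re-run, and re-runs are spliced into the trace rather than restarted from scratch.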