- Parallel Computing and Optimization Techniques
- Interconnection Networks and Systems
- Algorithms and Data Compression
- Embedded Systems Design Techniques
- Advanced Data Storage Technologies
- Cellular Automata and Applications
- Complexity and Algorithms in Graphs
- VLSI and FPGA Design Techniques
- Low-power high-performance VLSI design
- Distributed and Parallel Computing Systems
- Optimization and Search Problems
- semigroups and automata theory
- Distributed systems and fault tolerance
- Machine Learning and Algorithms
- Computability, Logic, AI Algorithms
- DNA and Biological Computing
- Formal Methods in Verification
- Logic, programming, and type systems
- Software Testing and Debugging Techniques
- Advanced Graph Theory Research
- Stochastic Gradient Optimization Techniques
- VLSI and Analog Circuit Testing
- Analog and Mixed-Signal Circuit Design
- Software System Performance and Reliability
- Matrix Theory and Algorithms
IBM Research - Thomas J. Watson Research Center
2000-2023
University of Padua
2009-2019
Barro Colorado Island
2008
Cornell University
1985-2005
University of Illinois Chicago
1996-1999
University of Geneva
1997
Hong Kong Association of Registered Tour Co-ordinators
1997
Paderborn University
1997
HeidelbergCement (United States)
1997
École Normale Supérieure de Lyon
1997
A parallel algorithm, called adaptive bitonic sorting, that runs on a PRAC (parallel random access computer), shared-memory multiprocessor where fetch and store conflicts are disallowed, is proposed. On P processors PRAC, the algorithm presented here achieves optimal performance $TP = O(N\log N)$, for any computation time T in range $\Omega (\log ^2 N) \leqq N)$. Adaptive sorting also has small constant factor, since it performs less than $2N\log N$ comparisons, only handful of operations...
Article Free Access Share on BSP vs LogP Authors: Gianfranco Bilardi Dip. di Elettronica e Informatica, Università Padova, Italy and Dept. of Electrical Engineering Computer Science, University Illinois at Chicago, IL ILView Profile , Kieran T. Herley College Cork, Ireland IrelandView Andrea Pietracaprina Matematica Pura Applicata, ItalyView Geppino Pucci Paul Spirakis Technology Institute, Patras, Greece GreeceView Authors Info & Claims SPAA '96: Proceedings the eighth annual ACM symposium...
This paper deals with the spectral analysis of digital processes obtained through memoryless functions stationary Markov chains. Under assumption that chain is ergodic, not necessarily acyclic, closed form formulas are derived for both discrete part (spectral lines) and continuous density. The results applied at output a finite state sequential machine driven by chain, which represents general model several situations engineering interest. Finally, theory used to evaluate spectrum encoded...
Evaluates various proposed VLSI models of computation. While there is a consensus on the appraisal chip area, controversy remains with regard to computation time. Thus, authors have analyzed in detail propagation signals disperse lines. The results are expressed terms adimensional parameters characteristic any given fabrication technology. conclusion that both current and projected silicon technologies fall within realm capacitive model where dispersive line can be replaced by capacitance...
A generalization of a known class parallel sorting algorithms is presented, together with new interconnection to execute them. VLSI implementation also proposed, and its area-time performance discussed. It shown that an algorithm in the executable O(log n) time by chip occupying O(n <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> ) area. The design typical instance ``hybrid architecture,'' resulting from combination well-known networks...
The control dependence relation plays a fundamental role in program restructuring and optimization. usual representation of this is the graph (CDG), but size CDG can grow quadratically with input programs, even for structured programs. In article, we introduce augmented postdominator tree (APT) , data structure which be constructed space time proportional to supports enumeration number useful sets their size. Therefore, APT provides an optimal dependence. Specifically, set cd(e), statements...
The Static Single Assignment (SSA) form is a program representation used in many optimizing compilers. key step converting to SSA called ϕ-placement. Many algorithms for ϕ-placement have been proposed the literature, but relationships between these are not well understood.In this article, we propose framework within which systematically derive (i) properties of and (ii) algorithms. This based on new relation merge captures succinctly structure program's control flow graph that relevant its...
The problem of simulating a PRAM with n processors and memory size $m \geqslant n$ on an n-node bounded degree network is considered. A deterministic algorithm presented that simulates arbitrary step in $O(({{\log n\log m)} / {\log \log n)}}$ time the worst case expander-based network. By extending previously established lower bound, it shown proposed simulation optimal whenever $\Omega (n^{1 + \epsilon } ) \leqslant m O(2^{(\log n)^\alpha )$ for some positive real constants $ $\alpha $.
We generalize the notion of dominance by defining a generalized relation with respect to set paths in control flow graph G = (V, E). This new definition leads dependence, which includes standard dependence and weak as special cases.If underlying satisfies some natural closure conditions, that is tree-structured. Given this tree, corresponding can be computed optimally reduction Roman Chariots Problem, we have developed previously for computing dependence. More precisely, given linear...
Abstract Solving fully coupled non‐linear hygro‐thermo‐mechanical problems relative to the behaviour of concrete at high temperatures using monolithic models is nowadays a very interesting and challenging computational problem. These require an extensive use resources, such as main memory time, due great number variables numerical characteristics coefficients linear systems involved. In this paper, different variants frontal solver used within HITECOSP, application developed BRITE Euram III...
We evaluate IBM's Enhanced Cell Broadband Engine (BE) as a possible building block of new generation lattice QCD machines.The BE will provide full support doubleprecision floating-point arithmetics, including IEEE-compliant rounding.We have developed performance model and applied it to relevant kernels.The estimates are supported by micro-and application-benchmarks that been obtained on currently available BE-based computers, such IBM QS20 blades PlayStation 3. The results encouraging show...