NFDI4DS | UHH-SEMS - Publication Details

Peter M. Kogge

ORCID: 0000-0002-3329-547X

About

Contact & Profiles

A5029212199

Research Areas

Parallel Computing and Optimization Techniques
Advanced Data Storage Technologies
Interconnection Networks and Systems
Distributed and Parallel Computing Systems
Quantum-Dot Cellular Automata
Embedded Systems Design Techniques
Advanced Memory and Neural Computing
Low-power high-performance VLSI design
Quantum and electron transport phenomena
Cloud Computing and Resource Management
Distributed systems and fault tolerance
Advancements in Semiconductor Devices and Circuit Design
Cellular Automata and Applications
Semiconductor materials and devices
Graph Theory and Algorithms
Ferroelectric and Negative Capacitance Devices
Logic, programming, and type systems
Algorithms and Data Compression
Scientific Computing and Data Management
Data Management and Algorithms
Big Data and Business Intelligence
Quantum Computing Algorithms and Architecture
Computability, Logic, AI Algorithms
Quantum Information and Cryptography
Software System Performance and Reliability

University of Notre Dame
2015-2024

The Aerospace Corporation
2021

DELL (United States)
2016

Notre Dame University
2007

Sandia National Laboratories
2004

Rho (United States)
1994-2002

IBM (United States)
1973-1992

Stanford University
1973

A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence Equations

OPENALEX - Publications

Peter M. Kogge Harold S. Stone

An $m$th-order recurrence problem is defined as the computation of the series $x_1, x_2, \ldots, x_N$, where $x_i = f(x_{i-1}, \ldots, x_{i-m})$ for some function $f$. This paper uses...

10.1109/tc.1973.5009159 article EN IEEE Transactions on Computers 1973-08-01

Mapping irregular applications to DIVA, a PIM-based data-intensive architecture

OPENALEX - Publications

Mary Hall Peter M. Kogge J.G. Koller Pedro C. Diniz Jacqueline Chame and 9 more

Authors: Mary Hall (USC Information Sciences Institute, Marina del Rey, CA), Peter Kogge (University of Notre Dame, IN), Jeff Koller, Pedro Diniz, Jacqueline Chame, Jeff Draper, Jeff LaCoss, John Granacki, Jay Brockman, Apoorv Srivastava, William Athas, Vincent Freeh, Jaewook Shin, Joonseok Park. SC '99: Proceedings of the 1999 ACM/IEEE Conference on Supercomputing, January...

10.1145/331532.331589 article EN 1999-01-01

EXECUBE-A New Architecture for Scaleable MPPs

OPENALEX - Publications

Peter M. Kogge

The EXECUBE chip is a new single part type building block for MPP systems that scales seamlessly from a few chips (with a few hundred MIPS) to thousands of chips with petaop potential. Further, the architecture directly supports both SIMD and MIMD modes of processing, permitting not only the best of current parallel computing but also possibly more conventional designs. This paper discusses the overall chip, the computational model it represents, some comparisons against the state of the art, how it might be used for real applications,...

10.1109/icpp.1994.108 article EN 1994-08-01

The energy complexity of register files

OPENALEX - Publications

Victor Zyuban Peter M. Kogge

Register files (RF) represent a substantial portion of the energy budget in modern processors, and are growing rapidly with the trend towards wider instruction issue. The actual access energy costs depend greatly on the register file circuitry used. This paper compares various RF circuitry techniques for their energy efficiencies, as a function of architectural parameters such as the number of registers and the number of ports. The Port Priority Selection technique was found to be the most energy efficient. The dependence of energy upon technology scaling is also studied. However,...

10.1145/280756.280943 article EN 1998-01-01

Inherently lower-power high-performance superscalar architectures

OPENALEX - Publications

Victor Zyuban Peter M. Kogge

In recent years, reducing power has become an important design goal for high-performance microprocessors. This work attempts to bring the power issue to the earliest phases of microprocessor development, in particular, the stage of defining a chip microarchitecture. We investigate power-optimization techniques for superscalar microprocessors at the microarchitecture level that do not compromise performance. First, major targets for power reduction are identified within the microarchitecture, where power is heavily consumed or will be...

10.1109/12.910816 article EN IEEE Transactions on Computers 2001-03-01

Problems in designing with QCAs: Layout = Timing

OPENALEX - Publications

Michael Niemier Peter M. Kogge

The quantum cellular automata (QCA) is currently being investigated as an alternative to CMOS VLSI. While some simple logical circuits and devices have been studied, little if any work has been done in considering the architecture for systems of QCA devices. This work discusses the progress of one of the first such efforts. Namely, the design of the dataflow components of a microprocessor designed exclusively in QCA is discussed. Problems associated with the initial designs are enumerated and solutions to these problems (usually stemming from...

10.1002/1097-007x(200101/02)29:1<49::aid-cta132>3.0.co;2-1 article EN International Journal of Circuit Theory and Applications 2001-01-01

Parallel Solution of Recurrence Problems

OPENALEX - Publications

Peter M. Kogge

An $m$th-order recurrence problem is defined as the computation of the sequence $x_1, \cdots, x_N$, where $x_i = f(a_i, x_{i-1}, \ldots, x_{i-m})$ and $a_i$ is some vector of parameters. This paper investigates general algorithms for solving...

10.1147/rd.182.0138 article EN IBM Journal of Research and Development 1974-03-01

Exascale Computing Trends: Adjusting to the "New Normal" for Computer Architecture

OPENALEX - Publications

Peter M. Kogge John Shalf

We now have 20 years of data under our belt about the performance of supercomputers against at least a single floating-point benchmark from dense linear algebra. Until about 2004, a single model of parallel programming, bulk synchronous using the MPI model, was sufficient to permit translation into reasonable parallel programs for more complex applications. Starting in 2004, however, a confluence of events changed forever the architectural landscape that underpinned MPI. The first half of this article goes into the underlying reasons for these changes,...

10.1109/mcse.2013.95 article EN Computing in Science & Engineering 2013-10-16

On the Memory Access Patterns of Supercomputer Applications: Benchmark Selection and Its Implications

OPENALEX - Publications

Richard C. Murphy Peter M. Kogge

This paper compares the System Performance Evaluation Cooperative (SPEC) Integer and Floating-Point suites to a set of real-world applications for high-performance computing at Sandia National Laboratories. These applications focus on the high-end scientific and engineering domains; however, the techniques presented in this paper are applicable to any application domain. The applications are compared in terms of three memory properties: 1) temporal locality (or reuse over time), 2) spatial locality (or the use of data "near" data that has already been accessed), and 3)...

10.1109/tc.2007.1039 article EN IEEE Transactions on Computers 2007-06-01

The tops in flops

OPENALEX - Publications

Peter M. Kogge

Supercomputers are now running our search engines and social networks. Modern supercomputers are based on groups of tightly interconnected microprocessors. In recent years, supercomputers have shaped our daily lives more directly.

10.1109/mspec.2011.5693074 article EN IEEE Spectrum 2011-01-21

Highly Scalable Near Memory Processing with Migrating Threads on the Emu System Architecture

OPENALEX - Publications

Timothy J. Dysart Peter M. Kogge Martin M. Deneroff Eric Bovell Preston Briggs and 12 more

There is growing evidence that current architectures do not handle cache-unfriendly applications well, such as sparse math operations, data analytics, and graph algorithms. This is due, in part, to the irregular memory access patterns demonstrated by these applications, and to how remote memory accesses are handled. This paper introduces a new, highly-scalable PGAS memory-centric system architecture where migrating threads travel to the data they access. Scaling both memory capacities and the number of cores can be largely invisible...

10.1109/ia3.2016.007 article EN 2016-11-01

Highly scalable near memory processing with migrating threads on the emu system architecture

OPENALEX - Publications

Timothy J. Dysart Peter M. Kogge Martin M. Deneroff Eric Bovell Preston Briggs and 12 more

10.5555/3018843.3018845 article EN Irregular Applications: Architectures and Algorithms 2016-11-13

A design of and design tools for a novel quantum dot based microprocessor

OPENALEX - Publications

Michael Niemier Michael J. Kontz Peter M. Kogge

Despite the seemingly endless upwards spiral of modern VLSI technology, many experts are predicting a hard wall for CMOS in about a decade. Given this, researchers continue to look at alternative technologies, one of which is based on quantum dots, called quantum cellular automata (QCA). While the first such devices have been fabricated, little is known about how to design complete systems of them. This paper summarizes one of the first such studies, namely an attempt to design a complete, albeit simple, CPU in this technology. To theoretical QCA...

10.1145/337292.337398 article EN Proceedings of the 37th Design Automation Conference - DAC '00 2000-01-01

Exploring and exploiting wire-level pipelining in emerging technologies

OPENALEX - Publications

Michael Niemier Peter M. Kogge

Pipelining is a technique that has long been considered fundamental by computer architects. However, the world of nanoelectronics is pushing the idea of pipelining to new and lower levels, particularly the device level. How this affects circuits and the relationship between their timing, architecture, and design will be studied in the context of an inherently self-latching nanotechnology termed Quantum Cellular Automata (QCA). Results indicate that QCA offers the potential for “free” multi-threading and “processing-in-wire”. All...

10.1145/379240.379261 article EN 2001-01-01

Quantum-Dot Cellular Automata (QCA) circuit partitioning

OPENALEX - Publications

Dominic Antonelli Danny Z. Chen Timothy J. Dysart Xiaobo Sharon Hu Andrew B. Kahng and 3 more

This paper presents the Quantum-Dot Cellular Automata (QCA) physical design problem, in the context of the VLSI physical design problem. The problem is divided into three subproblems: partitioning, placement, and routing of QCA circuits. This paper presents an ILP formulation and a heuristic solution to the partitioning problem, and compares the two sets of results. Additionally, we compare a human-generated circuit to the ILP and heuristic solutions. The results demonstrate that the heuristic is a practical method of reducing run time while providing a result that is close to optimal for a given circuit.

10.1145/996566.996671 article EN 2004-06-07

Optimization of high-performance superscalar architectures for energy efficiency

OPENALEX - Publications

Victor Zyuban Peter M. Kogge

In recent years, reducing power has become a critical design goal for high-performance microprocessors. This work attempts to bring the power issue to the earliest phase of microprocessor development. We propose a methodology for power-optimization at the micro-architectural level. First, major targets for power reduction are identified within the superscalar microarchitecture, then an optimization of the micro-architecture is performed that generates a set of energy-efficient configurations forming a convex hull in the power-performance space....

10.1145/344166.344522 article EN 2000-01-01

Combined DRAM and logic chip for massively parallel systems

OPENALEX - Publications

Peter M. Kogge T. Sunaga H. Miyataka Koji Kitamura E. Retter

A new 5 V 0.8 μm CMOS technology merges 100 K custom circuits and 4.5 Mb of DRAM onto a single die that supports both high density memory and significant computing logic. One of the first chips built with this technology implements a unique Processor-In-Memory (PIM) computer architecture termed EXECUBE and has 8 separate 25 MHz CPU macros and 16 separate 32 K × 9 b DRAM macros on a single die. These macros are organized together to provide a single part type for scaleable massively parallel processing applications, particularly embedded ones where...

10.1109/arvlsi.1995.515607 article EN 2002-11-19

Using the TOP500 to trace and project technology and architecture trends

OPENALEX - Publications

Peter M. Kogge Timothy J. Dysart

The TOP500 is a treasure trove of information on the leading edge of high performance computing. It was used in the 2008 DARPA Exascale technology report to isolate out the effects of architecture and technology on high performance computing, and to lay the groundwork to project how current systems might mature through the coming years. Two particular classes of architectures were identified: "heavy-weight" (based on high end commodity microprocessors) and "lightweight" (primarily BlueGene variants), and projections were made for performance, concurrency, memory capacity, and power....

10.1145/2063384.2063421 article EN 2011-11-08

Partially Reversible Pipelined QCA Circuits: Combining Low Power With High Throughput

OPENALEX - Publications

Marco Ottavi Salvatore Pontarelli Erik P. DeBenedictis A. Salsano Sarah Frost-Murphy and 2 more

This paper introduces an architecture for quantum-dot cellular automata circuits with the potential for high throughput and low power dissipation. The combination of regions with Bennett clocking and memory storage combines the low power advantage of reversible computing with the high throughput advantage of pipelining. Two case studies are initially presented to evaluate the proposed pipelined architecture in terms of throughput and power consumption due to information dissipation. A general model for assessing power consumption is also proposed. This paper shows that the advantages possible by using a Bennett clocking scheme depend on circuit topology, thus...

10.1109/tnano.2011.2147796 article EN IEEE Transactions on Nanotechnology 2011-05-09

Logic in wire: using quantum dots to implement a microprocessor

OPENALEX - Publications

Michael Niemier Peter M. Kogge

Despite the seemingly endless upwards spiral of modern VLSI technology, many experts are predicting a hard wall for CMOS in about a decade. Given this, researchers continue to look at alternative technologies, one of which is based on quantum dots, called quantum cellular automata. While the first such devices have been fabricated, little is known about how to design complete systems of them. This paper summarizes one of the first such studies, namely an attempt to design a complete, albeit simple, CPU in this technology. The projections are striking: projected 10 1...

10.1109/glsv.1999.757390 article EN 2003-01-20

Pursuing a petaflop: point designs for 100 TF computers using PIM technologies

OPENALEX - Publications

Peter M. Kogge S. Bass Jay Brockman D.Z. Chen Edwin H.‐M. Sha

This paper is a summary of a proposal submitted to the NSF 100 Tera Flops Point Design Study. Its main thesis is that the use of Processing-In-Memory (PIM) technology can provide an extremely dense and highly efficient base on which such computing systems can be constructed. The paper describes a strawman organization of one potential PIM chip, along with how multiple such chips might be organized into a real system, what the software supporting such a system might look like, and several applications which we will be attempting to place onto such a system.

10.1109/fmpc.1996.558065 article EN 2002-12-23
