James E. Smith

ORCID: 0000-0001-7908-1859
Research Areas
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Evolutionary Algorithms and Applications
  • Distributed and Parallel Computing Systems
  • Embedded Systems Design Techniques
  • Metaheuristic Optimization Algorithms Research
  • Distributed systems and fault tolerance
  • Cloud Computing and Resource Management
  • Botany, Ecology, and Taxonomy Studies
  • Botany and Plant Ecology Studies
  • Interconnection Networks and Systems
  • Mediterranean and Iberian flora and fauna
  • Low-power high-performance VLSI design
  • Advanced Multi-Objective Optimization Algorithms
  • Advanced Memory and Neural Computing
  • Neural Networks and Applications
  • Radiation Effects in Electronics
  • Aerodynamics and Fluid Dynamics Research
  • Mechanical Engineering and Vibrations Research
  • Neural dynamics and brain function
  • Advanced Database Systems and Queries
  • Plant Ecology and Taxonomy Studies
  • Solar and Space Plasma Dynamics
  • Plant and animal studies
  • Plant Diversity and Evolution

University of the West of England
2015-2024

University of Bath
2024

Franciscan Health
2024

Creative Technologies (United States)
2022-2023

The Francis Crick Institute
2022

University of Wisconsin–Madison
2005-2021

Environmental Protection Agency
2001-2021

Northern Research Station
2008-2021

University of Arizona
1997-2021

Carnegie Mellon University
2021

The performance tradeoff between hardware complexity and clock speed is studied. First, a generic superscalar pipeline is defined. Then the specific areas of register renaming, instruction window wakeup and selection logic, and operand bypassing are analyzed. Each is modeled and Spice simulated for feature sizes of 0.8 µm, 0.35 µm, and 0.18 µm. Performance results and trends are expressed in terms of issue width and window size. Our analysis indicates that window wakeup and selection logic as well as operand bypass logic are likely to be the most critical in the future. A microarchitecture that simplifies...

10.1145/264107.264201 article EN 1997-05-01

A study of branch prediction strategies. Author: James E. Smith, Control Data Corporation, Arden Hills, Minnesota. ISCA '98: 25 years of the international symposia on Computer architecture (selected papers), August 1998, pages 202–215.

10.1145/285930.285980 article EN 1998-08-01

BACKGROUND AND PURPOSE. Peer norms influence the adoption of behavior changes to reduce risk for HIV (human immunodeficiency virus) infection. By experimentally intervening at a community level to modify risk behavior norms, it may be possible to promote generalized reductions in HIV risk practices within a population. METHODS. We trained persons reliably identified as popular opinion leaders among gay men in a small city to serve as behavior change endorsers to their peers. The opinion leaders acquired social skills for making these endorsements and complied...

10.2105/ajph.81.2.168 article EN American Journal of Public Health 1991-02-01

The combination of evolutionary algorithms with local search was named "memetic algorithms" (MAs) (Moscato, 1989). These methods are inspired by models of natural systems that combine the evolutionary adaptation of a population with individual learning within the lifetimes of its members. Additionally, MAs are inspired by Richard Dawkins' concept of a meme, which represents a unit of cultural evolution that can exhibit local refinement (Dawkins, 1976). In the case of MAs, "memes" refer to the strategies (e.g., local refinement, perturbation, or constructive methods, etc.)...

10.1109/tevc.2005.850260 article EN IEEE Transactions on Evolutionary Computation 2005-10-01
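
The combination described in this abstract, population-level evolution plus individual learning, can be sketched as follows. This is a minimal illustration and not the method of the cited paper: the OneMax fitness function, the single bit-flip local search standing in for the "meme", the truncation selection, and all parameter names and defaults are assumptions chosen for brevity.

```python
import random

def fitness(bits):
    # Toy objective (assumption): count of 1 bits, "OneMax".
    return sum(bits)

def local_search(bits, steps, rng):
    # The "meme" in this sketch: try a few random bit flips and keep
    # only those that strictly improve fitness (individual learning).
    best = list(bits)
    for _ in range(steps):
        candidate = best[:]
        candidate[rng.randrange(len(candidate))] ^= 1
        if fitness(candidate) > fitness(best):
            best = candidate
    return best

def memetic_search(n_bits=20, pop_size=10, generations=30, ls_steps=3, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Lifetime learning: refine every individual, writing the
        # improvement back into the genome (a Lamarckian choice, assumed here).
        pop = [local_search(ind, ls_steps, rng) for ind in pop]
        # Evolutionary step: keep the better half, refill with offspring.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)          # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n_bits)] ^= 1       # single-bit mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

Calling `memetic_search()` returns the best bit string found. Swapping `local_search` for a different refinement or perturbation strategy changes the meme without touching the evolutionary loop.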

A virtual machine can support individual processes or a complete system depending on the abstraction level where virtualization occurs. Some VMs support flexible hardware usage and software isolation, while others translate from one instruction set to another. Virtualizing a system or component (such as a processor, memory, or an I/O device) at a given abstraction level maps its interface and visible resources onto the interface and resources of an underlying, possibly different, real system. Consequently, the real system appears as a different virtual system or even as multiple virtual systems. Interjecting...

10.1109/mc.2005.173 article EN Computer 2005-05-01

As the issue width of superscalar processors is increased, instruction fetch bandwidth requirements will also increase. It will become necessary to fetch multiple basic blocks per cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. We propose supplementing the conventional instruction cache with a trace cache. This structure caches traces of the dynamic instruction stream, so instructions that are otherwise noncontiguous appear contiguous. For the Instruction Benchmark Suite (IBS) and SPEC92...

10.5555/243846.243854 article EN 1996-12-02

The predictability of data values is studied at a fundamental level. Two basic predictor models are defined: Computational predictors perform an operation on previous values to yield predicted next values. Examples we study are stride value prediction (which adds a delta to a previous value) and last value prediction (which performs the trivial identity operation on the previous value); Context Based predictors match recent value history (context) with previous value history and predict values based entirely on previously observed patterns. To understand the potential of value prediction we perform simulations with unbounded prediction tables that are immediately updated...

10.5555/266800.266824 article EN International Symposium on Microarchitecture 1997-12-01
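
The two predictor models named in this abstract can be illustrated with a short sketch. It is a simplified reading of the abstract, not the paper's simulator: the class names, the single-stream interface (`predict` / `update`), the context order of 2, and the `accuracy` helper are all assumptions, and tables are unbounded dictionaries updated immediately after each value.

```python
class LastValuePredictor:
    """Computational predictor: identity operation on the previous value."""
    def __init__(self):
        self.last = None

    def predict(self):
        return self.last

    def update(self, value):
        self.last = value


class StridePredictor:
    """Computational predictor: previous value plus the last observed delta."""
    def __init__(self):
        self.last = None
        self.stride = 0

    def predict(self):
        return None if self.last is None else self.last + self.stride

    def update(self, value):
        if self.last is not None:
            self.stride = value - self.last
        self.last = value


class ContextPredictor:
    """Context-based predictor: looks up what followed the same recent history."""
    def __init__(self, order=2):
        self.order = order
        self.history = ()
        self.table = {}  # unbounded: history tuple -> value that followed it

    def predict(self):
        return self.table.get(self.history)

    def update(self, value):
        if len(self.history) == self.order:
            self.table[self.history] = value
        self.history = (self.history + (value,))[-self.order:]


def accuracy(predictor, values):
    # Fraction of values predicted correctly, updating after every value.
    correct = 0
    for v in values:
        if predictor.predict() == v:
            correct += 1
        predictor.update(v)
    return correct / len(values)
```

On an arithmetic sequence such as `[0, 2, 4, 6, 8]` the stride predictor locks on after two values while the last-value predictor never hits. On a repeating non-arithmetic pattern such as `[1, 5, 3, 1, 5, 3, 1, 5, 3]` only the context predictor learns the pattern.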

We propose and evaluate a multi-thread memory scheduler that targets high performance CMPs. The proposed memory scheduler is based on concepts originally developed for network fair queuing scheduling algorithms. The memory scheduler provides quality of service (QoS) while improving system performance. On a four processor CMP running workloads containing a mix of applications with a range of memory bandwidth demands, the memory scheduler provides QoS to all of the threads in all of the workloads, improves system performance by an average of 14% (41% in the best case), and reduces the variance in the threads' target memory bandwidth utilization from .2 to .0058.

10.1109/micro.2006.24 article EN 2006-12-01

Superscalar processing is the latest in a long series of innovations aimed at producing ever-faster microprocessors. By exploiting instruction-level parallelism, superscalar processors are capable of executing more than one instruction in a clock cycle. This paper discusses the microarchitecture of superscalar processors. We begin with a discussion of the general problem solved by superscalar processors: converting an ostensibly sequential program into a more parallel one. The principles underlying this process, and the constraints that must be...

10.1109/5.476078 article EN Proceedings of the IEEE 1995-01-01

Traces are dynamic instruction sequences constructed and cached by hardware. A microarchitecture organized around traces is presented as a means for efficiently executing many instructions per cycle. Trace processors exploit both control flow and data flow hierarchy to overcome complexity and architectural limitations of conventional superscalar processors by (1) distributing execution resources based on trace boundaries and (2) applying control and data prediction at the trace level rather than individual branches or instructions. Three sets...

10.5555/266800.266814 article EN International Symposium on Microarchitecture 1997-12-01

A new structure for implementing data cache prefetching is proposed and analyzed via simulation. The structure is based on a Global History Buffer that holds the most recent miss addresses in FIFO order. Linked lists within this global history buffer connect addresses that have some common property, e.g. they were all generated by the same load instruction. The Global History Buffer can be used for implementing a number of previously proposed prefetch methods, as well as new ones. Prefetching with the Global History Buffer has two significant advantages over conventional table prefetching methods. First, the use of a FIFO history buffer can improve...

10.1109/hpca.2004.10030 article EN 2005-03-31
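
The structure described in this abstract, a FIFO of recent miss addresses with linked lists joining entries that share a property, might be modeled roughly as below. This is a software sketch under assumptions, not the hardware design from the paper: the class and method names, the use of the load PC as the linking key, the sequence-number scheme for FIFO eviction, and the `stride_prefetch` helper are all illustrative.

```python
class GlobalHistoryBuffer:
    def __init__(self, size=8):
        self.size = size
        self.entries = {}   # sequence number -> (miss address, previous seq for same key)
        self.index = {}     # key (e.g. load PC) -> seq of that key's most recent miss
        self.next_seq = 0

    def record_miss(self, key, address):
        # Link the new entry to the previous miss with the same key.
        prev = self.index.get(key)
        self.entries[self.next_seq] = (address, prev)
        self.index[key] = self.next_seq
        # FIFO behavior: the entry that is `size` misses old falls out.
        self.entries.pop(self.next_seq - self.size, None)
        self.next_seq += 1

    def history(self, key):
        # Walk the linked list, newest first, stopping at an evicted entry.
        addresses = []
        seq = self.index.get(key)
        while seq is not None and seq in self.entries:
            address, seq = self.entries[seq]
            addresses.append(address)
        return addresses


def stride_prefetch(ghb, key, degree=2):
    # One previously known method (stride prefetching) layered on the buffer:
    # if the last three misses for this key are evenly spaced, run ahead.
    h = ghb.history(key)
    if len(h) >= 3 and h[0] - h[1] == h[1] - h[2] and h[0] != h[1]:
        stride = h[0] - h[1]
        return [h[0] + stride * (i + 1) for i in range(degree)]
    return []
```

Because stale entries simply age out of the FIFO, a walk never returns history older than the last `size` misses, which is the kind of property the abstract contrasts with conventional table-based prefetchers.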

Microprocessors are designed to provide good average performance over a variety of workloads. This can lead to inefficiencies both in power and performance for individual programs and during individual phases within the same program. Microarchitectures with multi-configuration units (e.g. caches, predictors, instruction windows) are able to adapt dynamically to program behavior and enable/disable resources as needed. A key element of existing configuration algorithms is adjusting to program phase changes. This is typically done by "tuning" when a phase change...

10.1145/545214.545241 article EN ACM SIGARCH Computer Architecture News 2002-05-01

Five solutions to the precise interrupt problem in pipelined processors are described and evaluated. An interrupt is precise if the saved process state corresponds to a sequential model of program execution in which one instruction completes before the next begins. In a pipelined processor, precise interrupts are difficult to implement because an instruction may be initiated before its predecessors have completed. The first solution forces instructions to complete and modify the architectural state in order. The other four solutions allow instructions to complete in any order, but additional hardware is used, so that a precise state can be restored...

10.1109/12.4607 article EN IEEE Transactions on Computers 1988-05-01

An architecture for improving computer performance is presented and discussed. The main feature of the architecture is a high degree of decoupling between operand access and execution. This results in an implementation which has two separate instruction streams that communicate via queues. A similar architecture has been previously proposed for array processors, but in that context the software is called on to do most of the coordination and synchronization between the instruction streams. This paper emphasizes implementation features that remove this burden from the programmer. Performance comparisons with...

10.1145/1067649.801719 article EN ACM SIGARCH Computer Architecture News 1982-04-01

Dependences among loads and stores whose addresses are unknown hinder the extraction of instruction level parallelism during the execution of a sequential program. Such ambiguous memory dependences can be overcome by memory dependence speculation, which enables a load or store to be speculatively executed before the addresses of all preceding loads and stores are known. Furthermore, multiple speculative stores to a memory location create multiple speculative versions of the location. Program order among the speculative versions must be tracked to maintain sequential semantics. A previously proposed approach, the address resolution buffer (ARB)...

10.1109/hpca.1998.650559 article EN 2002-11-27

A proposed performance model for superscalar processors consists of 1) a component that models the relationship between instructions issued per cycle and the size of the instruction window under ideal conditions, and 2) methods for calculating transient performance penalties due to branch mispredictions, instruction cache misses, and data cache misses. Using trace-derived data dependence information, data and instruction cache miss rates, and branch miss-prediction rates as inputs, the model can arrive at performance estimates for a typical superscalar processor that are within 5.8% of detailed simulation on average and within 13%...

10.1145/1028176.1006729 article EN ACM SIGARCH Computer Architecture News 2004-03-02

Many high performance processors predict conditional branches and consume processor resources based on the prediction. In some situations, resource allocation can be better optimized if a confidence level is assigned to a branch prediction; i.e. if the quantity of resources allocated is a function of the confidence level. To support such optimizations, we consider hardware mechanisms that partition conditional branch predictions into two sets: those which are accurate a relatively high percentage of the time, and those which are accurate a relatively low percentage of the time. The objective is to concentrate as many...

10.5555/243846.243880 article EN International Symposium on Microarchitecture 1996-12-02
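
One way to picture the partition into high- and low-confidence predictions described in this abstract is a per-branch counter that tracks recent prediction correctness. The sketch below is an assumption made for illustration, not necessarily a mechanism evaluated in the paper: the class name, the reset-on-misprediction policy, the dictionary keyed by branch address, and the threshold and ceiling values are all hypothetical.

```python
class ResettingConfidence:
    def __init__(self, threshold=4, max_count=15):
        self.threshold = threshold   # correct-in-a-row needed for "high confidence"
        self.max_count = max_count   # saturation ceiling, as a small hardware counter would have
        self.counters = {}           # branch address -> consecutive correct predictions

    def is_high_confidence(self, pc):
        return self.counters.get(pc, 0) >= self.threshold

    def update(self, pc, prediction_correct):
        if prediction_correct:
            # Count up, saturating at the ceiling.
            self.counters[pc] = min(self.counters.get(pc, 0) + 1, self.max_count)
        else:
            # A single misprediction drops the branch back to low confidence.
            self.counters[pc] = 0
```

A consumer would query `is_high_confidence(pc)` alongside the branch prediction itself and commit fewer resources to the predicted path when it returns `False`.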