- Parallel Computing and Optimization Techniques
- Distributed and Parallel Computing Systems
- Distributed systems and fault tolerance
- Cloud Computing and Resource Management
- Advanced Data Storage Technologies
- Matrix Theory and Algorithms
- Radiation Effects in Electronics
- Simulation Techniques and Applications
- Software System Performance and Reliability
- Scientific Computing and Data Management
- Economic Theory and Policy
- Low-power high-performance VLSI design
- Embedded Systems Design Techniques
- Quantum Computing Algorithms and Architecture
- Advanced Numerical Methods in Computational Mathematics
- Interconnection Networks and Systems
- Banking stability, regulation, efficiency
- Simulation and Modeling Applications
- Numerical Methods and Algorithms
- Advanced Memory and Neural Computing
- Monetary Policy and Economic Impact
- Economic theories and models
- VLSI and FPGA Design Techniques
- Neural Networks and Applications
- scientometrics and bibliometrics research
Ansys (United States)
2020-2024
Lutheran School of Theology at Chicago
2019
University of Southern California
2009-2018
Southern California University for Professional Studies
2012-2016
Sandia National Laboratories California
2015
Marina Del Rey Hospital
2007
Integrated Systems Incorporation (United States)
2007
University of Saskatchewan
1982-2003
Intel (United States)
2003
University of Cagliari
2002
In 2003, DARPA's High Productivity Computing Systems program released the HPCC suite. It examines the performance of HPC architectures using well-known computational kernels with various memory access patterns. Consequently, the results bound the performance of real applications as a function of their memory access characteristics and define performance boundaries of architectures. The suite was intended to augment the TOP500 list, and results are by now publicly available for 6 out of the 10 world's fastest computers. Implementations exist in most major high-end programming...
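As a rough illustration of what "kernels with various memory access patterns" means (this sketch is not part of the HPCC suite itself; sizes and the random-update kernel are assumptions for illustration), the following Python fragment contrasts a streaming, bandwidth-bound update with a latency-bound random-access update:

```python
import random

N = 1_000_000
a = [0.0] * N
b = [1.0] * N
c = [2.0] * N

# Streaming pattern (STREAM-triad style): contiguous reads and writes,
# performance is bounded by memory bandwidth.
alpha = 3.0
for i in range(N):
    a[i] = b[i] + alpha * c[i]

# Random-access pattern: scattered updates, performance is bounded by
# memory latency rather than bandwidth.
t = [0] * N
for _ in range(N):
    j = random.randrange(N)
    t[j] ^= j
```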
Exascale systems will provide an unprecedented opportunity for science, one that will make it possible to use computation not only as a critical tool, along with theory and experiment, in understanding the behavior of the fundamental components of nature, but also in advancing the nation's energy and security needs. To create exascale software that enables the US Department of Energy (DOE) to meet its science goals in energy, ecological sustainability, and global security, we must focus on major architecture, software, algorithm, and data...
This paper presents a new distributed multifrontal sparse matrix decomposition algorithm suitable for message-passing parallel processors. The algorithm uses nested dissection ordering and distribution of the matrix to minimize interprocessor data dependencies and overcome the communication bottleneck previously reported [1]. Distributed forward elimination and back substitution algorithms are also provided. Results from an implementation on the Intel iPSC are presented. Up to 16 processors were used to solve systems with as many as 7225...
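For context, the serial forms of the forward elimination and back substitution steps that the paper distributes look roughly as follows. This is a minimal dense sketch in plain Python, assuming the triangular factors L and U have already been computed; the distributed, multifrontal version described in the abstract is considerably more involved.

```python
def forward_elimination(L, b):
    # Solve L y = b, where L is lower triangular with nonzero diagonal.
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - s) / L[i][i]
    return y

def back_substitution(U, y):
    # Solve U x = y, where U is upper triangular with nonzero diagonal.
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x
```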
As the scale and complexity of future High Performance Computing systems continue to grow, the rising frequency of faults and errors and their impact on HPC applications will make it increasingly difficult to accomplish useful computation. Traditional means of fault detection and correction are either hardware based or use software redundancy. Redundancy approaches usually entail complete replication of program state and computation, and therefore incur substantial overhead on application performance. Therefore, wide-scale full...
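A minimal sketch of the kind of software redundancy the abstract refers to: executing the same computation several times and voting on the result. The function names and the majority-vote rule are illustrative assumptions, not the paper's mechanism; the point is that the whole computation is replicated, which is exactly the overhead argued to be too costly at scale.

```python
from collections import Counter

def run_redundantly(computation, *args, copies=3):
    # Execute the same computation several times and accept the majority
    # result; this detects (and here masks) a single erroneous run.
    results = [computation(*args) for _ in range(copies)]
    value, count = Counter(results).most_common(1)[0]
    if count <= copies // 2:
        raise RuntimeError("no majority: uncorrectable error detected")
    return value

# Usage example with a trivial computation.
print(run_redundantly(lambda x: x * x, 12))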
This paper provides measures of average publication rates (and frequency distributions) of articles and pages in all journals indexed by the Index of Economics Articles over the 1980s. The study covers 733 economists holding tenured or probationary appointments at Canadian universities in the 1989-90 academic year. All article and page counts are converted to single-author-equivalent (SAE) units by dividing by the number of authors. Over the decade, the average economist published about one SAE article every 2.5 years. The study also generates publication-quality measures for the profession...
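The single-author-equivalent (SAE) conversion is simple arithmetic: an article with k authors contributes 1/k of an article to each author's count. A small sketch of that bookkeeping (the records are made up for illustration):

```python
# Hypothetical records: (author, total number of authors on the article)
articles = [
    ("Smith", 1),   # solo article counts as 1.0 SAE
    ("Smith", 2),   # two-author article counts as 0.5 SAE
    ("Jones", 3),   # three-author article counts as 1/3 SAE
]

sae = {}
for author, n_authors in articles:
    sae[author] = sae.get(author, 0.0) + 1.0 / n_authors

print(sae)  # {'Smith': 1.5, 'Jones': 0.333...}
```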
The continued growth in the processing power of FPGAs, coupled with high bandwidth memories (HBM), makes systems like the Xilinx U280 credible platforms for linear solvers, which often dominate the run time of scientific and engineering applications. In this paper, we present Callipepla, an accelerator for a preconditioned conjugate gradient (CG) solver. FPGA acceleration of CG faces three challenges: (1) how to support arbitrary problem sizes and terminate on the fly, (2) how to coordinate long-vector data flow among modules, and (3)...
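For reference, a textbook Jacobi-preconditioned CG iteration is sketched below in plain Python/NumPy. The diagonal preconditioner and termination test are assumptions for illustration; Callipepla's FPGA dataflow implementation is of course organized very differently.

```python
import numpy as np

def pcg(A, b, tol=1e-8, max_iter=1000):
    # Jacobi (diagonal) preconditioner: M^{-1} = 1 / diag(A).
    m_inv = 1.0 / np.diag(A)
    x = np.zeros_like(b)
    r = b - A @ x
    z = m_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:   # on-the-fly termination test
            break
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Small symmetric positive definite test system.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(pcg(A, b))
```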
The challenge of resilience for High Performance Computing applications is significant for future extreme-scale systems. These systems will experience unprecedented rates of faults and errors as they will be constructed from massive numbers of components that are inherently less reliable than those available today. While the use of redundant computing can provide detection and possible correction of errors, its system-wide use in extreme-scale HPC would incur considerable overheads to application performance. In this paper,...
This paper describes several challenges facing programmers of future edge computing systems, the diverse many-core devices that will soon exemplify commodity mainstream systems. To call attention to the programming challenges ahead, this paper focuses on the most complex of such architectures: integrated, power-conserving systems that are inherently parallel and heterogeneous, with distributed address spaces. When programming such systems, new concerns arise: computation partitioning across functional units, data movement and synchronization, and managing a diversity...
System resilience is an important challenge that needs to be addressed in the era of extreme-scale computing. Exascale supercomputers will be architected using millions of processor cores and memory modules. As process technology scales, the reliability of such systems is challenged by the inherent unreliability of individual components due to extremely small transistor geometries, variability in silicon manufacturing processes, device aging, etc. Therefore, errors and failures will increasingly be the norm rather than the exception. Not...
In this paper, we describe a compilation system that automates much of the process of performance tuning that is currently done manually by application programmers interested in high performance. Due to the growing complexity of accurate performance prediction, our system incorporates empirical techniques that execute variants of code segments with representative data on the target architecture. We discuss how empirical search and modeling can be effectively combined. We also discuss the role of historical information from prior runs and of programmer specifications supporting...
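The core of such empirical tuning is a search loop that times candidate code variants on representative data and keeps the fastest one. A minimal sketch follows; the variant set and timing harness are assumptions for illustration, not the system described in the abstract.

```python
import timeit

def variant_rowwise(a, b):
    return [x + y for x, y in zip(a, b)]

def variant_mapped(a, b):
    return list(map(lambda xy: xy[0] + xy[1], zip(a, b)))

def autotune(variants, data, repeats=5):
    # Empirically time each variant on representative inputs and
    # return the one with the best observed runtime.
    best_name, best_time = None, float("inf")
    for name, fn in variants.items():
        t = min(timeit.repeat(lambda: fn(*data), number=10, repeat=repeats))
        if t < best_time:
            best_name, best_time = name, t
    return best_name, best_time

data = ([float(i) for i in range(10_000)], [1.0] * 10_000)
print(autotune({"rowwise": variant_rowwise, "mapped": variant_mapped}, data))
```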
A 3-D one-carrier device solver has been developed on an Intel iPSC/2 hypercube multiprocessor which can handle over 130 K nodes. CPU time averages 20 min per bias point for a 50 K-node MOSFET example. Slotboom variables are used in conjunction with the Scharfetter-Gummel current discretization scheme. A scaling scheme is proposed that produces the carrier concentrations n and p from the Slotboom variables. An improved damped-Newton scheme, which maintains iteration counts at below fifteen for high gate biases, is used in solving Poisson's equation. The...
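As a generic illustration of damping in a Newton iteration (not the paper's specific scheme for Poisson's equation), the scalar sketch below scales the step back whenever the full Newton step would fail to reduce the residual:

```python
def damped_newton(f, df, x0, tol=1e-10, max_iter=50):
    # Newton's method with simple step-halving damping: if the full step
    # does not reduce |f|, the step is halved until it does.
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        step = fx / df(x)
        damping = 1.0
        while abs(f(x - damping * step)) >= abs(fx) and damping > 1e-6:
            damping *= 0.5
        x -= damping * step
    return x

# Example: root of f(x) = x^3 - 2x - 5, starting from x0 = 3.
print(damped_newton(lambda x: x**3 - 2*x - 5, lambda x: 3*x**2 - 2, x0=3.0))
```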
A parallel circuit simulator was implemented on the iPSC system. Concurrent model evaluation, hierarchical BBDF (bordered block diagonal form) reordering, and distributed multifrontal decomposition for solving the sparse matrix are used. A speedup of six times has been achieved on an eight-processor hypercube system.
We deal with the problem of modeling production systems where inventory management is predominant with respect to other aspects of the production cycle. The model we use, called first-order hybrid Petri nets, is a formalism that combines fluid and discrete event dynamics and enables us to simulate the dynamic concurrent activities of inventory management systems (IMS). It also provides a modular representation of an IMS, thus making it useful even when dealing with large-dimension systems. Finally, a real application case of a cheese factory is considered: all numerical data are relative...
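As a toy illustration of the discrete-event half of such a model (first-order hybrid Petri nets additionally include continuous "fluid" places and flow rates, which this sketch omits), a minimal place/transition firing rule is shown below; the net structure is invented for illustration.

```python
# Minimal discrete Petri net: a transition fires when every input place
# holds enough tokens; firing consumes input tokens and produces output tokens.
marking = {"raw_material": 3, "machine_free": 1, "finished": 0}

transitions = {
    "start_job":  {"in": {"raw_material": 1, "machine_free": 1}, "out": {"in_process": 1}},
    "finish_job": {"in": {"in_process": 1}, "out": {"finished": 1, "machine_free": 1}},
}

def enabled(t):
    return all(marking.get(p, 0) >= n for p, n in transitions[t]["in"].items())

def fire(t):
    assert enabled(t), f"transition {t} is not enabled"
    for p, n in transitions[t]["in"].items():
        marking[p] = marking.get(p, 0) - n
    for p, n in transitions[t]["out"].items():
        marking[p] = marking.get(p, 0) + n

fire("start_job")
fire("finish_job")
print(marking)  # one item finished, machine free again
```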