- Radiation Effects in Electronics
- Parallel Computing and Optimization Techniques
- Low-power high-performance VLSI design
- Software Reliability and Analysis Research
- VLSI and Analog Circuit Testing
- Reliability and Maintenance Optimization
- Maritime Navigation and Safety
- Semiconductor materials and devices
- Embedded Systems Design Techniques
- Software System Performance and Reliability
- Interconnection Networks and Systems
- Data Quality and Management
- Cloud Computing and Resource Management
- Ferroelectric and Negative Capacitance Devices
- Maritime Transport Emissions and Efficiency
- Electrostatic Discharge in Electronics
- Advancements in Semiconductor Devices and Circuit Design
- Marine and fisheries research
- Big Data and Business Intelligence
- Distributed and Parallel Computing Systems
- Caching and Content Delivery
- Advanced Memory and Neural Computing
- Advanced Data Storage Technologies
- Marine and Coastal Research
- Ship Hydrodynamics and Maneuverability
Hellenic Centre for Marine Research
2023
National and Kapodistrian University of Athens
2013-2018
Fault injection on microarchitectural structures modeled in performance simulators is an effective method for assessing microprocessor reliability in early design stages. Compared to lower-level fault injection approaches, it is orders of magnitude faster and allows the execution of large portions of workloads to study the effect of faults on the final program output. Moreover, for many important hardware components it delivers accurate estimates compared to analytical methods, which are fast but are known to significantly over-estimate a...
In this paper, we present the first automated system-level analysis of multicore CPUs based on the ARMv8 64-bit architecture (8-core, 28nm X-Gene 2 micro-server by AppliedMicro) when pushed to operate in scaled voltage conditions. We report detailed effects including SDCs, corrected/uncorrected errors, and application/system crashes. Our study reveals large margins (that can be harnessed for energy savings) and also Vmin variation among the 8 cores of the CPU chip and among 3 different chips (a nominal rated and two sigma...
Early reliability assessment of hardware structures using microarchitecture-level simulators can effectively guide major error protection decisions in microprocessor design. Statistical fault injection on microarchitectural structures modeled in performance simulators is an accurate method to measure their Architectural Vulnerability Factor (AVF) but requires excessively long campaigns to obtain high statistical significance.
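The trade-off described above, between the accuracy of statistical fault injection and the length of the campaign, can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: `estimate_avf` and `toy_structure` are hypothetical names, the "simulator run" is replaced by a random toy model, and the confidence margin uses the standard normal approximation for a proportion.

```python
import math
import random


def estimate_avf(inject_fault, num_injections, z=1.96, seed=0):
    """Estimate AVF as the fraction of injected faults that corrupt the run.

    inject_fault: callable taking a random.Random and returning True when the
    injected fault affects program output (hypothetical stand-in for one
    simulator run with a single injected fault).
    Returns (avf, margin), where margin is the half-width of the confidence
    interval under the normal approximation (z=1.96 is about 95%).
    """
    rng = random.Random(seed)
    failures = sum(1 for _ in range(num_injections) if inject_fault(rng))
    avf = failures / num_injections
    margin = z * math.sqrt(avf * (1.0 - avf) / num_injections)
    return avf, margin


def toy_structure(rng, live_fraction=0.3):
    # Assumption for illustration only: a fault matters when it lands in a
    # "live" entry, and 30% of entries are live at any time.
    return rng.random() < live_fraction


avf, margin = estimate_avf(toy_structure, num_injections=2000)
print(f"AVF estimate: {avf:.3f} +/- {margin:.3f}")
```

Because the margin shrinks only with the square root of the number of injections, halving it requires roughly four times as many simulator runs, which is why such campaigns become long.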
Cross-layer reliability is becoming the preferred solution when reliability is a concern in the design of a microprocessor-based system. Nevertheless, deciding how to distribute error management across the different layers of the system is a very complex task that requires the support of dedicated frameworks for cross-layer analysis. This paper proposes SyRA, a system-level early reliability analysis framework for radiation-induced soft errors in memory arrays of microprocessor-based systems. The framework exploits a multi-level hybrid Bayesian model to describe the target system and takes advantage...
In this paper, we explore the pessimistic voltage guardbands of two multicore x86-64 microprocessor chips that belong to different microarchitectures (one ultra-low power and one high-performance microprocessor), when programs are executed on individual cores of the CPU chips. We also examine the energy and temperature gains as positive effects of lowering the voltage in both chips while preserving the functional correctness of the programs. The behavior was examined by executing 8 workloads from the SPEC CPU2006 suite. Our differential...
System reliability estimation during early design phases facilitates informed decisions for the integration of effective protection mechanisms against different classes of hardware faults. When not all system abstraction layers (technology, circuit, microarchitecture, software) are factored into such an estimation model, the delivered reports must be excessively pessimistic and thus lead to unacceptably expensive, over-designed systems. We propose a scalable, cross-layer methodology and supporting suite of tools...
Reliability assessment has always been a major concern in the design of computing systems. The results highlight and guide enhancements, which trigger redesign cycles; thus early and accurate reliability assessment is of profound importance. For the purposes of early analysis, abstract models (which are available in early stages) are typically used. These models, however, may not be completely accurate compared to the actual final design. Existing literature has quantified this inaccuracy through comparison between Register-Transfer-Level (RTL)...
Designers try to reduce the voltage margins of CPU chips to gain energy without sacrificing reliable operation. Statistical analysis methods are appealing for predicting the safe operational margins at the system level, as they do not induce area overheads and can be applied during manufacturing or after the chips' release to the market. In this study, we present a comprehensive statistical analysis of the behavior of ARMv8 64-bit cores that are part of the enterprise 8-core X-Gene 2 micro-server family when they operate in scaled voltage conditions. Our prediction...
In this paper, we present the results of our comprehensive measurement study of the timing and voltage guardbands in the memories and cores of a commodity ARMv8-based micro-server. Using various synthetic micro-benchmarks, we reveal how the adopted margins vary among the 8 cores of the CPU chip and among 3 different sigma chips, and show how prone they are to worst-case noise. In addition, we characterize the variation of 'weak' DRAM cells in terms of their retention time across 72 DRAM chips and evaluate the error mitigation efficacy of the available error-correcting codes in case of operation under...
Forthcoming many-core processors are expected to be highly unreliable due to their high design complexity and aggressive manufacturing technology scaling. Online functional testing is an attractive low-cost error detection solution. A functional error detection scheme for many-core architectures can easily employ existing techniques from single-core microprocessors and exploit the available massive parallelism to reduce the total test execution time. However, straightforward execution of test programs on such parallel architectures does not achieve the maximum theoretical...
In this paper, we propose the employment of fast targeted programs (diagnostic micro-viruses) that aim to individually stress the main hardware components of a multicore CPU architecture which most likely determine the limits of voltage scaling, i.e., the safe Vmin values. We describe in detail the complex development process for the diagnostic micro-viruses and their comprehensive validation on modern hardware. The combined execution takes a very short time compared to regular execution, and can quickly reveal cores chips at...
The explosive growth of Internet-connected devices will soon result in a flood of generated data, which will increase the demand for network bandwidth as well as the compute power to process the data. Consequently, there is a need for more energy-efficient servers to empower traditional centralized Cloud data-centers as well as emerging decentralized data-centers at the Edges of the Cloud. In this paper, we present our approach, which aims at developing a new class of micro-servers - UniServer - that exceed the conservative energy and performance scaling boundaries by introducing...
Statistical Fault Injection on microarchitectural simulators can provide early and accurate reliability characterization for array-based hardware components. Besides, microarchitectural fault injectors are easily configurable (facilitating many studies) and orders of magnitude faster than RTL injectors, rendering them appropriate tools for reliability estimation using large, realistic benchmarks. However, the throughput of injection campaigns remains a bottleneck when batch must run processor (different characteristics, different...
Forthcoming technologies hold the promise of a significant increase in integration density, performance, and functionality. However, a dramatic change in microprocessor reliability is also expected. Developing mechanisms for early and accurate reliability estimation will save design effort and resources, and consequently positively impact the product's time-to-market (TTM). In this paper, we propose a versatile architecture-level fault injection framework, built on top of a state-of-the-art x86 microprocessor simulator, for thorough...
Technology evolution has raised serious reliability considerations, as transistor dimensions shrink and modern microprocessors become denser and more vulnerable to faults. Reliability studies have proposed a plethora of methodologies for assessing system vulnerability which, however, rely heavily on traditional metrics that solely express failure rate over time. Although Failures In Time (FIT) is a very strong and representative metric, it may fail to offer an objective comparison of diverse systems, such...
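For readers unfamiliar with the FIT metric referenced above, a short worked example may help. By convention, one FIT is one failure per 10^9 device-hours, and under a constant failure rate the mean time to failure is the reciprocal of that rate. The function names and sample numbers below are hypothetical and are not taken from the paper.

```python
BILLION_HOURS = 1e9  # FIT is defined per 10^9 device-hours


def fit_rate(failures, devices, hours_each):
    """Failures In Time: observed failures scaled to 10^9 device-hours."""
    device_hours = devices * hours_each
    return failures * BILLION_HOURS / device_hours


def mttf_hours(fit):
    """Mean time to failure in hours, assuming a constant failure rate."""
    return BILLION_HOURS / fit


# Hypothetical field data: 2 failures across 1000 devices, 10,000 hours each.
fit = fit_rate(failures=2, devices=1000, hours_each=10_000)
print(f"FIT = {fit:.1f}, MTTF = {mttf_hours(fit):.0f} hours")
```

Since FIT expresses only failures per unit of time, two systems with the same FIT but different performance complete different amounts of work between failures, which is the kind of comparison the abstract says the metric may not capture.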
Multicore architectures are employed in the majority of computing domains (general-purpose microprocessors as well as specialized high-performance architectures such as network processors). Online error detection in multicore chips can employ effective techniques from single-core microprocessors; however, test scheduling should be designed to minimize the overall chip test execution time, which can significantly increase due to congestion on common hardware resources used by the cores. In this paper, we analyze the most important aspects of online and...
Supply voltage scaling is one of the most effective techniques to reduce the power consumption of microprocessors. However, technology limitations such as aging and process variability force microprocessor designers to apply pessimistic guardbands to guarantee correct operation in the field for any foreseeable workload. This worst-case design practice makes energy efficiency hard to scale with technology evolution. Improving energy efficiency requires the identification of the chip margins through time-consuming and comprehensive...
Early reliability assessment of hardware structures using microarchitecture-level simulators can effectively guide major error protection decisions in microprocessor design. Statistical fault injection on microarchitectural structures modeled in performance simulators is an accurate method to measure their Architectural Vulnerability Factor (AVF) but requires excessively long campaigns to obtain high statistical significance. We propose MeRLiN, a methodology to boost injection-based reliability assessment by several orders of magnitude and keep...
Nowadays, the scientific community is looking for ways to understand the effect of software execution on the reliability of a complex system when the hardware layer is unreliable. This paper proposes a statistical reliability analysis model able to estimate system reliability considering both the hardware and the software layer of a system. Bayesian Networks are employed to model the hardware resources of the processor and the instructions of program traces. They are exploited to investigate the probability that input errors alter both the correct behavior of the system and the output of the program. Experimental results show that Bayesian networks prove to be a promising model,...
Early decisions in microprocessor design require a careful consideration of the corresponding performance and reliability implications of transient faults. The size and organization of important on-chip hardware components such as caches, register files, and buffers have a direct impact on both the resilience to soft errors and the execution time of applications. In this paper, we employ a state-of-the-art x86-64 full-system micro-architectural simulator and a comprehensive fault injection framework built on top of it to deliver detailed...
A key enabler of real applications on approximate computing systems is the availability of instruments to analyze system reliability early in the design cycle. Accurately measuring the impact on system reliability of any change in technology, circuits, microarchitecture, and software is most of the time a multi-team, multi-objective problem, and reliability must be traded off against other crucial attributes (or objectives) such as power, performance, and cost. Unfortunately, tools and models for cross-layer reliability analysis are still at their early stages compared...
Analyzing the impact of software execution on the reliability of a complex digital system is an increasingly challenging task. Current approaches mainly rely on time-consuming fault injection experiments that prevent their usage in the early stages of the design process, when fast estimations are required in order to take design decisions. To cope with these limitations, this paper proposes a statistical reliability analysis model based on Bayesian Networks. The proposed approach is able to estimate system reliability considering both the hardware and the software layer of a system,...
Monitoring vessel traffic on a global scale is a complex and challenging task. The large number of moving vessels and the complexity of monitoring their position and forecasting their route in real time require novel, advanced, and highly scalable big-data mechanisms. In this work, a digital twin for constant maritime situational awareness is presented. The described multi-layered system is able to visualize vessels in real time, based on data from the Automatic Identification System (AIS), while also providing forecasts of their future movement using machine...