- Parallel Computing and Optimization Techniques
- Embedded Systems Design Techniques
- Interconnection Networks and Systems
- Distributed and Parallel Computing Systems
- Advanced Data Storage Technologies
- Cloud Computing and Resource Management
- Real-Time Systems Scheduling
- Distributed Systems and Fault Tolerance
- Radiation Effects in Electronics
- Low-Power High-Performance VLSI Design
- Cryptography and Residue Arithmetic
- Experimental Learning in Engineering
- Cryptography and Data Security
- Cryptographic Implementations and Security
- Coding Theory and Cryptography
- VLSI and Analog Circuit Testing
- Cellular Automata and Applications
- CCD and CMOS Imaging Sensors
- Graph Theory and Algorithms
- Digital Filter Design and Implementation
- IoT and Edge/Fog Computing
- Network Packet Processing and Optimization
- Semiconductor Materials and Devices
- Context-Aware Activity Recognition Systems
- Video Coding and Compression Technologies
University of Siena
2016-2025
University of Cyprus
2016
An-Najah National University
2016
University of Pisa
1997-2005
University of Aizu
2005
Association for Computing Machinery
1999-2000
In this paper, the scheduled dataflow (SDF) architecture, a decoupled memory/execution, multithreaded architecture using nonblocking threads, is presented in detail and evaluated against a superscalar architecture. The recent focus of new processor architectures is mainly on VLIW (e.g., IA-64), superscalar, and superspeculative designs. This trend allows for better performance, but at the expense of increased hardware complexity and, possibly, higher power expenditures resulting from dynamic instruction...
Current computing systems are mostly focused on achieving performance, programmability, energy efficiency, and resiliency, essentially by trying to replicate the uni-core execution model n times in parallel in a multi/many-core system. This choice has heavily conditioned the way both software and hardware are designed nowadays. However, as old as computer architecture itself is the concept of dataflow, that is, "initiating an activity in the presence of the data it needs to perform its function" [J. Dennis]. Dataflow had been historically...
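The dataflow principle quoted above (an operation fires as soon as all of its input data are present) can be illustrated in a few lines. This is only a minimal sketch of the firing rule; the node names, token passing, and scheduler loop are illustrative, not taken from any of the papers listed here.

```python
# Minimal sketch of the dataflow firing rule: a node executes ("fires")
# only when all of its input operands (tokens) have arrived.
class DataflowNode:
    def __init__(self, name, op, num_inputs):
        self.name = name
        self.op = op
        self.inputs = [None] * num_inputs
        self.consumers = []  # (node, input_index) pairs fed by this node

    def receive(self, index, value, ready_queue):
        self.inputs[index] = value
        if all(v is not None for v in self.inputs):
            ready_queue.append(self)  # all operands present: node is fireable

    def fire(self, ready_queue):
        result = self.op(*self.inputs)
        for node, idx in self.consumers:
            node.receive(idx, result, ready_queue)
        return result

# Build the graph for (a + b) * (a - b)
add = DataflowNode("add", lambda x, y: x + y, 2)
sub = DataflowNode("sub", lambda x, y: x - y, 2)
mul = DataflowNode("mul", lambda x, y: x * y, 2)
add.consumers = [(mul, 0)]
sub.consumers = [(mul, 1)]

ready = []
for node in (add, sub):
    node.receive(0, 7, ready)   # a = 7
    node.receive(1, 3, ready)   # b = 3

result = None
while ready:                    # scheduler loop: fire any ready node
    result = ready.pop(0).fire(ready)
print(result)                   # (7+3)*(7-3) = 40
```

Note that no program counter orders the nodes: execution order emerges from data availability alone, which is exactly what makes the model attractive for exposing parallelism.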
Thanks to the improvements in semiconductor technologies, extreme-scale systems such as teradevices (i.e., composed of 1000 billion transistors) will enable chips with 1000+ general-purpose cores, probably by 2020. Three major challenges have been identified: programmability, manageable architecture design, and reliability. TERAFLUX is a Future and Emerging Technologies (FET) large-scale project funded by the European Union, which addresses these challenges at once by leveraging dataflow principles. This paper describes...
The TERAFLUX project is a Future and Emerging Technologies (FET) Large-Scale Project funded by the European Union. It is at the forefront of major research challenges such as programmability, manageable architecture design, and reliability of many-core or 1000+ core chips. In the near future, new computing systems will consist of a huge number of transistors, probably 1 Tera (1000 billion) by 2020: we name these "Teradevices". In this project, we aim to solve these three challenges at once by using dataflow principles wherever they are applicable and make sense...
WebRISC-V is a web-based, server-side RISC-V assembly language pipelined datapath simulation environment, which aims at easing the students' learning and the instructors' teaching experience. RISC-V is an open-source Instruction Set Architecture (ISA) that is highly flexible, modular, extensible, and royalty free. For these reasons, there is exploding interest in both industry and academia for RISC-V. Here, we present the main features of this simulator and how it can be used for a simple exercise in the classroom. This permits the execution...
One way to exploit Thread Level Parallelism (TLP) is to use architectures that implement novel multithreaded execution models, like Scheduled Data-Flow (SDF). This latter model promises an elegant decoupled and non-blocking execution of threads. Here we extend it so that it can be used in future scalable CMP systems, where wire delay imposes a partitioning of the design. In this paper we describe our approach and experiment with different distributed schedulers, numbers of clusters, and processors per cluster, showing good scalability...
We have implemented a MIPS simulation environment called WebMIPS. Our simulator is accessible from the Web and has been successfully used in an introductory computer architecture course at the Faculty of Information Engineering in Siena, Italy. The advantages of this approach are immediate access to the simulator, without installation, and possible centralized monitoring of the students' activity. WebMIPS is capable of uploading and assembling code provided by the user, simulating the five-stage pipeline step by step or completely, and displaying the values...
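The step-by-step pipeline view that such a simulator displays can be modeled very compactly. Below is a toy sketch, not the WebMIPS implementation: stage names follow the classic five-stage MIPS pipeline, hazards are ignored, and the sample instructions are made up for the example.

```python
# Toy model of a five-stage MIPS pipeline: each cycle, every in-flight
# instruction advances one stage (IF, ID, EX, MEM, WB); hazards ignored.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_trace(instructions):
    """Return, per cycle, which instruction occupies each stage."""
    n_cycles = len(instructions) + len(STAGES) - 1
    trace = []
    for cycle in range(n_cycles):
        row = {}
        for s, stage in enumerate(STAGES):
            i = cycle - s               # instruction i entered IF at cycle i
            if 0 <= i < len(instructions):
                row[stage] = instructions[i]
        trace.append(row)
    return trace

prog = ["lw $t0, 0($a0)", "add $t1, $t0, $t0", "sw $t1, 4($a0)"]
for cycle, row in enumerate(pipeline_trace(prog), start=1):
    print(f"cycle {cycle}: {row}")
```

With three instructions the trace spans 3 + 5 - 1 = 7 cycles, which is the overlap a pipeline diagram makes visible to students.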
The continuous improvements offered by silicon technology enable the integration of an ever-increasing number of cores on a single chip. Following this trend, microprocessor architectures composed of thousands of cores (i.e., kilo-core architectures) are expected in the near future. To cope with the demand for high-performance systems, many-core designs rely on integrated networks-on-chip to deliver the correct bandwidth and latency for inter-core communications. In this context, simulation tools represent a crucial...
Embedded system toolchains are highly customized for a specific System-on-Chip (SoC). When the application needs more performance, the designer is typically forced to adopt a new SoC and possibly another toolchain. The rationale for not scaling performance by using, e.g., two SoCs, is that maintaining most of the operations on-chip may allow higher energy efficiency. We are exploring the feasibility and trade-offs of designing and manufacturing a Single Board Computer (SBC) that could serve flexibly a number of current and future applications,...
The number of cores per chip keeps increasing in order to improve performance while controlling the power. According to semiconductor roadmaps, future computing systems will reach the scale of 1 Tera devices in a single package. Firstly, such a Tera-device will expose a large amount of parallelism that cannot be easily and efficiently exploited by current applications and programming models. Secondly, reliability will become a critical issue. Finally, we need to simplify the design of such systems. TERAFLUX aims at providing a framework based...
The trend to develop increasingly intelligent systems leads directly to a considerable demand for computational power. Programming models that help exploit application parallelism on current multi-cores exist, but they have limitations. From this perspective, new execution models are arising to surpass those limitations and scale up the number of processing elements, while dedicated hardware can help with scheduling threads in many-core systems. This paper depicts a data-flow based execution model that exposes to the x86-64 architecture millions...
Future computing systems (Teradevices) will probably contain more than 1000 cores on a single die. To exploit this parallelism, threaded dataflow execution models are promising, since they provide side-effect-free execution and reduced synchronization overhead. But the terascale transistor integration of such chips makes them orders of magnitude more vulnerable to voltage fluctuations, radiation, and process variations. This means that reliability techniques will have to be an essential part of future systems, too. In this paper, we
This paper describes the development and testing of a vehicle recognition prototype based on magnetic sensors. The aim of this research is to design a low-cost, low-power-consumption, simple hardware platform for vehicle recognition. The goal is to recognize four types of vehicles (car, bus, mini-bus or camper) as they run over a set of sensors. We describe all the steps for correct presence detection, pattern preprocessing, and speed and length detection, using a combination of an empirical and an analytical method for signal alignment. We collected data regarding...
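The presence-detection step mentioned above can be sketched as a threshold detector with hysteresis on the magnetometer signal. This is an illustrative sketch only: the baseline value, thresholds, and synthetic signal below are invented for the example, not the paper's calibration.

```python
# Sketch of magnetic presence detection with hysteresis: a vehicle is
# "present" while the signal deviates from the baseline by more than an
# upper threshold, and "absent" again once the deviation falls below a
# lower one (hysteresis avoids chattering near a single threshold).
def detect_vehicles(samples, baseline, t_high, t_low):
    """Return (start, end) sample indices of detected vehicle passages."""
    events, start, present = [], None, False
    for i, s in enumerate(samples):
        dev = abs(s - baseline)
        if not present and dev > t_high:      # rising edge: vehicle arrives
            present, start = True, i
        elif present and dev < t_low:         # falling edge: vehicle leaves
            present = False
            events.append((start, i))
    return events

# Synthetic signal: flat baseline with one bump (a passing vehicle)
signal = [500] * 5 + [540, 580, 590, 560, 520] + [500] * 5
print(detect_vehicles(signal, baseline=500, t_high=30, t_low=10))
```

The length of each detected interval, combined with a speed estimate from two sensors a known distance apart, is what allows vehicle length (and hence class) to be derived.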
People and objects will soon share the same digital network for information exchange, in a world named the age of cyber-physical systems. The general expectation is that people and systems will interact in real-time. This puts pressure on designs to support increasing demands on computational power, while keeping a low power envelope. Additionally, modular scaling and easy programmability are also important to ensure these systems become widespread. This whole set of expectations imposes scientific and technological challenges that need...
The AXIOM project (Agile, eXtensible, fast I/O Module) aims at researching new software/hardware architectures for the future Cyber-Physical Systems (CPSs). These systems are expected to react in real-time, provide enough computational power for the assigned tasks, consume the least possible energy for such tasks (energy efficiency), scale up through modularity, allow an easy programmability across performance scaling, and exploit at best the existing standards at minimal costs.
Elliptic-Curve Cryptography (ECC) is promising for enabling information security in constrained embedded devices. In order to be efficient on a target architecture, ECC implementations require an accurate choice/tuning of the algorithms that perform the underlying mathematical operations. This paper contributes with a cycle-level analysis of the dependencies of ECC performance on the interaction between algorithm features and the actual architectural and microarchitectural features of an ARM-based Intel XScale processor. Another contribution is a modified ARM...
The ambitious challenges posed by next exascale computing systems may require a critical re-examination of both architecture design and consolidated wisdom in terms of programming style and execution model, because such systems are expected to be constituted of thousands of processors with many cores per chip. But how to build such architectures remains an open question. This paper presents a novel system based on a configurable static dataflow model. We assume that the basic computational unit is a graph. Each processing node ad...
Elliptic Curve Cryptography (ECC) is emerging as an attractive public-key system for constrained environments, because of its small key sizes and computational efficiency, while preserving the same security level as standard methods. We have developed a set of benchmarks to compare the corresponding elliptic curve methods. An embedded device based on the Intel XScale architecture, which utilizes an ARM processor core, was modeled and used for studying benchmark performance. Different possible variations of the memory hierarchy...
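The underlying mathematical operations that such ECC benchmarks exercise reduce to point addition and scalar multiplication on a curve y² = x³ + ax + b over a prime field. A toy version over a tiny prime looks like the following; the curve parameters are illustrative and far too small for any real security.

```python
# Toy elliptic-curve arithmetic over GF(p) for y^2 = x^3 + ax + b.
# Parameters are illustrative only: real ECC uses ~256-bit primes.
P = 97          # field prime
A, B = 2, 3     # curve coefficients
O = None        # point at infinity (group identity)

def point_add(p1, p2):
    if p1 is O: return p2
    if p2 is O: return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return O                                   # p1 == -p2
    if p1 == p2:                                   # doubling: m = (3x^2 + a) / 2y
        m = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P
    else:                                          # addition: m = (y2 - y1) / (x2 - x1)
        m = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (m * m - x1 - x2) % P
    return (x3, (m * (x1 - x3) - y1) % P)

def scalar_mult(k, p):
    """Double-and-add: compute k*p in O(log k) point operations."""
    result, addend = O, p
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

G = (3, 6)      # on the curve: 6^2 = 36 = 3^3 + 2*3 + 3 (mod 97)
print(scalar_mult(5, G))
```

The cycle counts of exactly these field operations (modular multiplication and inversion above) are what dominate ECC performance on an embedded core, which is why their interaction with the memory hierarchy and microarchitecture is worth benchmarking.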