Julio Sahuquillo

ORCID: 0000-0001-8630-4846
Research Areas
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Interconnection Networks and Systems
  • Caching and Content Delivery
  • Cloud Computing and Resource Management
  • Peer-to-Peer Network Technologies
  • Embedded Systems Design Techniques
  • Distributed and Parallel Computing Systems
  • Low-power high-performance VLSI design
  • Distributed systems and fault tolerance
  • Real-Time Systems Scheduling
  • Advanced Memory and Neural Computing
  • Semiconductor materials and devices
  • Web Data Mining and Analysis
  • Recommender Systems and Techniques
  • Radiation Effects in Electronics
  • Green IT and Sustainability
  • Experimental Learning in Engineering
  • Photonic and Optical Devices
  • Optical Network Technologies
  • Software System Performance and Reliability
  • IoT and Edge/Fog Computing
  • Advanced Optical Network Technologies
  • Network Packet Processing and Optimization
  • Engineering and Information Technology

Universitat Politècnica de València
2015-2024

Universitat Politècnica de Catalunya
2002-2024

Vall d'Hebron Hospital Universitari
2018

Universitat de València
2007

Current microprocessors are based on complex designs, integrating different components on a single chip, such as hardware threads, processor cores, the memory hierarchy, or interconnection networks. The permanent need of evaluating new designs motivates the development of tools which simulate the system working as a whole. In this paper, we present the Multi2Sim simulation framework, which models the major components of incoming systems and is intended to cover the limitations of existing simulators. A set of examples is also included...

10.1109/sbac-pad.2007.17 article EN 2007-10-01

ExaNeSt is one of three European projects that support a ground-breaking computing architecture for exascale-class systems built upon power-efficient 64-bit ARM processors. This group shares an "everything-close" and "share-anything" paradigm, which trims down the power consumption -- by shortening the distance signals travel in most data transfers -- as well as the cost, footprint area, and installation effort, by reducing the number of devices needed to meet performance targets. In ExaNeSt, we will design and implement: (i) a physical rack...

10.1109/dsd.2016.106 article EN 2016-08-01

Achieving system fairness is a major design concern in current multicore processors. Unfairness arises due to contention in the shared resources of the system, such as the LLC and main memory. To address this problem, many research works have proposed novel cache partitioning policies aimed at addressing fairness without harming performance. Unfortunately, existing proposals targeting fairness require extra hardware, which makes them impractical for commercial processors. Recent Intel Xeon processors feature Cache Allocation...

10.1109/pact.2017.19 article EN 2017-09-01
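Intel's Cache Allocation Technology partitions the LLC by assigning each class of service a capacity bitmask over the cache ways, and the hardware requires each mask to be a contiguous run of set bits. As a minimal illustrative sketch (the function name and partitioning scheme are my own, not from the paper; real systems program these masks through MSRs or Linux's resctrl filesystem), the masks for a way-based partition can be built like this:

```python
def cat_masks(total_ways, alloc):
    """Build contiguous capacity bitmasks, one per class of service,
    packing each allocation into the next free run of LLC ways
    (CAT requires every mask to be a contiguous run of set bits)."""
    if sum(alloc) > total_ways:
        raise ValueError("allocation exceeds available cache ways")
    masks, base = [], 0
    for ways in alloc:
        masks.append(hex(((1 << ways) - 1) << base))
        base += ways
    return masks
```

For a 20-way LLC split 12/8 between two classes, this yields `['0xfff', '0xff000']`.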

High-performance microprocessors, e.g., multithreaded and multicore processors, are being implemented in embedded real-time systems because of the increasing computational requirements. These complex microprocessors have two major drawbacks when they are used for real-time purposes. First, their complexity makes it difficult to calculate the WCET (worst case execution time). Second, their power consumption requirements are much larger, which is a concern in these systems. In this paper we propose a novel soft real-time power-aware scheduler...

10.1109/ipdps.2008.4536220 article EN Proceedings - IEEE International Parallel and Distributed Processing Symposium 2008-04-01

Technology projections indicate that static power will become a major concern in future generations of high-performance microprocessors. Caches represent a significant percentage of the overall microprocessor die area. Therefore, recent research has concentrated on reducing the leakage current dissipated by caches. The variety of techniques to control leakage can be classified as non-state preserving or state preserving. Non-state preserving techniques turn off selected cache lines, while state preserving techniques place them into a low-power state. Drowsy caches are...

10.1145/1062261.1062321 article EN 2005-05-04
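A drowsy cache keeps lines in a low-voltage state that preserves their contents but requires a short wake-up before access. The simplest published policy periodically puts every line to sleep and wakes lines on demand; a toy behavioral sketch of that policy (policy parameters and the one-cycle penalty are illustrative assumptions, not figures from the paper) is:

```python
def simulate_drowsy(trace, num_lines, window):
    """Simple-policy drowsy cache: every `window` cycles all lines are
    put into the low-voltage (drowsy) state; accessing a drowsy line
    pays a one-cycle wake-up penalty. Returns the total penalty cycles,
    one access per cycle."""
    drowsy = [False] * num_lines
    extra_cycles = 0
    for cycle, line in enumerate(trace):
        if cycle % window == 0:          # periodic sleep of all lines
            drowsy = [True] * num_lines
        if drowsy[line]:
            extra_cycles += 1            # wake-up penalty
            drowsy[line] = False
    return extra_cycles
```

The attraction of the scheme is that the penalty is small whenever the working set between sleep intervals is small, while all untouched lines sit in the low-leakage state.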

Clustering is an effective microarchitectural technique for reducing the impact of wire delays, complexity, and power requirements of microprocessors. In this work, we investigate the design of on-chip interconnection networks for clustered microarchitectures. This new class of interconnects has different demands and characteristics than traditional multiprocessor networks. In a clustered microarchitecture, low inter-cluster communication latency is essential for high performance. We propose point-to-point interconnects together with...

10.5555/645989.674314 article EN 2002-09-22

Nowadays, high performance multicore processors implement multithreading capabilities. The processes running concurrently on these processors are continuously competing for the shared resources, not only among cores, but also within each core. While resource sharing increases utilization, the interference when accessing shared resources can strongly affect the performance of individual processes and its predictability. In this scenario, process scheduling plays a key role to deal with performance and fairness. In this work we present a process scheduler for SMT multicores that...

10.1109/tc.2016.2620977 article EN IEEE Transactions on Computers 2016-10-25

The increasing popularity of cloud computing has forced providers to build economies of scale to meet the growing demand. Nowadays, data-centers include thousands of physical machines, each hosting many virtual machines (VMs), which share the main system resources, causing interference that can significantly impact performance. Frequently, these machines run latency-critical workloads, whose performance is determined by the tail latency, which is very sensitive to co-running workloads. To prevent QoS violations, providers adopt...

10.1016/j.future.2022.08.012 article EN cc-by Future Generation Computer Systems 2022-08-17

SRAM and DRAM cells have been the predominant technologies used to implement memory in computer systems, each one having its advantages and shortcomings. SRAM cells are faster and require no refresh, since reads are not destructive. In contrast, DRAM cells provide higher density and minimal leakage energy, since there are no paths within the cell from Vdd to ground. Recently, DRAM cells have been embedded in logic-based technology, thus overcoming the speed limit of typical DRAM cells.

10.1145/1669112.1669140 article EN 2009-12-12

A major design issue in embedded systems is reducing the power consumption because batteries have a limited energy budget. For this purpose, several techniques such as dynamic voltage and frequency scaling (DVFS) or task migration are being used. DVFS allows reducing power by selecting the optimal voltage supply, whereas task migration achieves this effect by balancing the workload among cores. This paper focuses on power-aware scheduling allowing to reduce energy consumption in multicores implementing DVFS capabilities. To address energy savings, the devised schedulers...

10.1002/cpe.2899 article EN Concurrency and Computation Practice and Experience 2012-07-16
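The reason DVFS couples voltage and frequency can be seen from the classic first-order model of dynamic power. A worked sketch (idealized model only; it ignores static power and the fact that lower V forces a lower attainable f):

```python
def dynamic_energy(cycles, cap, volt, freq):
    """Dynamic energy of a task under the first-order CMOS model:
    P_dyn ~ C * V^2 * f and execution time t = cycles / f, so
    E = P * t = C * V^2 * cycles. Frequency cancels out, which is
    why scaling f alone saves no dynamic energy: the savings come
    from the V^2 term that a lower frequency permits."""
    power = cap * volt ** 2 * freq
    time = cycles / freq
    return power * time
```

For example, dropping the supply from 1.2 V to 0.9 V cuts dynamic energy by (0.9/1.2)^2, about 44%, regardless of the frequency chosen.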

To improve chip multiprocessor (CMP) performance, recent research has focused on scheduling strategies to mitigate main memory bandwidth contention. Nowadays, commercial CMPs implement multilevel cache hierarchies that are shared by several multithreaded cores. In this microprocessor design, contention points may appear along the whole hierarchy. Moreover, the problem is expected to aggravate in future technologies, since the number of cores and hardware threads, and consequently the size of the caches, increases with...

10.1109/tpds.2013.61 article EN IEEE Transactions on Parallel and Distributed Systems 2013-10-28

Current SMT (simultaneous multithreading) processors co-schedule jobs on the same core, thus sharing core resources like L1 caches. In multicores, threads also compete among themselves for uncore resources like the LLC (last level cache) and DRAM modules. Per-process performance degradation over isolated execution mainly depends on resource requirements and the contention induced by co-runners. Consequently, running processes progress at a different pace. If schedulers are not contention-aware, the unpredictable execution time caused by unfairness...

10.1109/ipdps.2015.48 article EN 2015-05-01
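Fairness studies of this kind typically quantify how unevenly co-runners are slowed down relative to isolated execution. A minimal sketch of one common unfairness metric, the ratio of the largest to the smallest per-process slowdown (the metric choice here is illustrative, not necessarily the one used in the paper):

```python
def unfairness(isolated_ipc, shared_ipc):
    """Per-process slowdown is isolated IPC over co-scheduled IPC;
    unfairness is max slowdown / min slowdown (1.0 = perfectly fair,
    larger values mean some process is disproportionately hurt)."""
    slowdowns = [i / s for i, s in zip(isolated_ipc, shared_ipc)]
    return max(slowdowns) / min(slowdowns)
```

For two processes with isolated IPCs of 2.0 and 1.0 that drop to 1.0 and 0.8 when co-scheduled, the slowdowns are 2.0x and 1.25x, giving an unfairness of 1.6.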

Web prefetching is a technique that has been researched for years to reduce the latency perceived by users. For this purpose, several architectures have been used, but no comparative study has been performed to identify the best architecture for dealing with prefetching. This paper analyzes the impact of the architecture, focusing on its limits for reducing the user's latency. To this end, the factors that constrain the predictive power of each architecture are analyzed and these theoretical limits quantified. Experimental results show that the best element to locate a single prediction engine is the proxy, whose...

10.1109/wi.2006.166 article EN 2006-12-01

Web prefetching is one of the techniques proposed to reduce the user's perceived latency in the World Wide Web. The spatial locality shown by user accesses makes it possible to predict future accesses based on previous ones. A prefetching engine uses these predictions to prefetch web objects before the user demands them. The existing prediction algorithms achieved an acceptable performance when they were proposed, but the high increase in the amount of embedded objects per page has reduced their effectiveness in the current web. In this paper we show that most of the predictions made are useless...

10.1109/hotweb.2006.355260 article EN 2006-11-01
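Prediction engines of this family learn transition patterns between accessed pages and prefetch the most likely successor. As a minimal sketch of the idea (a first-order Markov predictor; the class is my own illustration, not the specific algorithm evaluated in the paper):

```python
from collections import defaultdict, Counter

class MarkovPredictor:
    """First-order Markov web-access predictor: counts observed
    prev -> next page transitions and predicts the most frequent
    successor of the current page as the prefetch candidate."""

    def __init__(self):
        self.trans = defaultdict(Counter)  # page -> successor counts
        self.prev = None

    def access(self, page):
        """Record one access in the user's navigation stream."""
        if self.prev is not None:
            self.trans[self.prev][page] += 1
        self.prev = page

    def predict(self, page):
        """Most frequent successor seen after `page`, or None."""
        succ = self.trans[page]
        return succ.most_common(1)[0][0] if succ else None
```

After observing the stream A, B, A, B, A, C, the predictor prefetches B whenever A is requested, since A→B has been seen twice against A→C once.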

Power consumption is a major concern in today's processor design. As technology shrinks, leakage power dominates the overall power consumption, although it is expected that dynamic power gains relevance in future semiconductor technology. This is particularly relevant for the cache hierarchy, which contains an important percentage of the microprocessor transistors. In this work we propose the use of a phase adaptive cache design to reduce both dynamic and leakage power with very little impact on performance. We take advantage of the overwhelming preference of memory accesses...

10.1109/igcc.2013.6604475 article EN 2013-06-01

Simultaneous multithreading (SMT) processors share most of the microarchitectural core components among co-running applications. The competition for shared resources causes performance interference between them. Therefore, the benefits of SMT heavily depend on the complementarity of the co-running applications. Symbiotic job scheduling, i.e., scheduling applications that co-run well together on a core, can have a considerable impact on processor performance with SMT cores. Prior work uses sampling or novel hardware support to perform symbiotic scheduling, which has either...

10.1109/hpca.2016.7446103 article EN 2016-03-01
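Symbiotic scheduling chooses which applications to co-run on each SMT core so that the workload mix interferes as little as possible. For a small number of applications the selection can be made exact; a toy brute-force sketch (the scoring dictionary and exhaustive search are my own illustration, not the model-based scheduler of the paper) looks like this:

```python
from itertools import permutations

def best_pairing(symbiosis):
    """Exhaustively assign the applications appearing in `symbiosis`
    to SMT cores in pairs, maximizing total symbiosis score, where
    symbiosis[(a, b)] (keys sorted) scores co-running a and b on one
    core. Feasible only for small app counts: (2n)!/(2^n n!) pairings."""
    apps = sorted({a for pair in symbiosis for a in pair})
    best, best_score = None, float("-inf")
    for p in permutations(apps):
        pairs = [tuple(sorted(p[i:i + 2])) for i in range(0, len(p), 2)]
        score = sum(symbiosis[q] for q in pairs)
        if score > best_score:
            best, best_score = pairs, score
    return best, best_score
```

The sampling and hardware-assisted approaches the abstract mentions exist precisely because this symbiosis information is expensive to obtain online for real workloads.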

In order to improve CMP performance, recent research has focused on scheduling to mitigate the contention produced by the limited memory bandwidth. Nowadays, commercial CMPs implement multi-level cache hierarchies where last level caches are shared by at least two cache structures located at the immediately lower level. In turn, these structures can be shared by several multithreaded cores. In this microprocessor design, contention points may appear along the whole hierarchy. Moreover, the problem is expected to aggravate in future technologies, since the number of...

10.1109/ipdps.2012.54 article EN 2012-05-01

Improving the utilization of shared resources is a key issue to increase performance in SMT processors. Recent work has focused on resource sharing policies to enhance processor performance, but these proposals mainly concentrate on novel hardware mechanisms that adapt to the dynamic resource requirements of the running threads. This work addresses the L1 cache bandwidth problem in SMT processors experimentally on real hardware. Unlike previous work, this paper concentrates on thread allocation, by selecting the proper pair of co-runners to be...

10.5555/2523721.2523741 article EN 2013-10-07

The memory hierarchy plays a critical role on the performance of current chip multiprocessors. Main memory is shared by all the running processes, which can cause important bandwidth contention. In addition, when the processor implements SMT cores, the L1 bandwidth becomes shared among the threads running in each core. In such a case, bandwidth-aware schedulers emerge as an interesting approach to mitigate this contention. This work investigates the performance degradation that processes suffer due to bandwidth constraints. Experiments show that main memory and L1 bandwidth contention negatively impact process...

10.1109/tc.2015.2428694 article EN IEEE Transactions on Computers 2015-05-01