Jordà Polo

ORCID: 0000-0001-5422-7890
Research Areas
  • Cloud Computing and Resource Management
  • Advanced Data Storage Technologies
  • Algorithms and Data Compression
  • Advanced Neural Network Applications
  • IoT and Edge/Fog Computing
  • Video Surveillance and Tracking Methods
  • Parallel Computing and Optimization Techniques
  • Genomics and Phylogenetic Studies
  • Distributed and Parallel Computing Systems
  • DNA and Biological Computing
  • Advanced Image and Video Retrieval Techniques
  • Evolutionary Algorithms and Applications
  • Software-Defined Networks and 5G
  • Machine Learning in Materials Science
  • Graph Theory and Algorithms
  • Gene expression and cancer classification
  • Visual Attention and Saliency Detection
  • Distributed systems and fault tolerance
  • Software System Performance and Reliability
  • Topic Modeling
  • Nuclear Materials and Properties
  • Clinical Laboratory Practices and Quality Control
  • CRISPR and Genetic Engineering
  • Image Enhancement Techniques
  • Generative Adversarial Networks and Image Synthesis

Universitat Politècnica de Catalunya
2013-2022

Barcelona Supercomputing Center
2010-2022

MapReduce is a data-driven programming model proposed by Google in 2004 which is especially well suited for distributed data analytics applications. We consider the management of MapReduce applications in an environment where multiple applications share the same physical resources. Such sharing is in line with recent trends in data center management which aim to consolidate workloads in order to achieve cost and energy savings. In a shared environment, it is necessary to predict and manage the performance of applications given the set of goals defined for them. In this paper, we address this problem...

10.1109/noms.2010.5488494 article EN 2010-01-01
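
As a rough illustration of the kind of performance prediction the abstract refers to, the sketch below estimates a MapReduce job's completion time from an observed mean task duration and finds the smallest slot allocation that keeps the job within a deadline. The wave-based model and all names are illustrative assumptions, not the paper's actual formulation.

```python
# Illustrative sketch (not the paper's model): deadline-driven slot allocation
# for a MapReduce job, assuming tasks run in waves of at most `slots` tasks.

def estimate_completion(remaining_tasks, slots, mean_task_time):
    """Rough completion-time estimate: remaining work divided by capacity."""
    waves = -(-remaining_tasks // slots)  # ceiling division
    return waves * mean_task_time

def slots_needed(remaining_tasks, mean_task_time, time_to_deadline, max_slots):
    """Smallest slot allocation whose estimate stays within the deadline."""
    for slots in range(1, max_slots + 1):
        if estimate_completion(remaining_tasks, slots, mean_task_time) <= time_to_deadline:
            return slots
    return max_slots  # goal cannot be met; allocate everything available

print(slots_needed(remaining_tasks=120, mean_task_time=30,
                   time_to_deadline=600, max_slots=16))  # -> 6
```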

Microservices architecture has started a new trend for application development for a number of reasons: (1) to reduce complexity by using tiny services; (2) to scale, remove and deploy parts of the system easily; (3) to improve flexibility to use different frameworks and tools; (4) to increase overall scalability; and (5) to improve the resilience of the system. Containers have empowered the usage of microservices architectures by being lightweight, providing fast start-up times, and having low overhead. Containers can be used to develop applications based on...

10.1109/nca.2015.49 preprint EN 2015-09-01

Autoscaling methods are used for cloud-hosted applications to dynamically scale the allocated resources for guaranteeing Quality-of-Service (QoS). A public-facing application serves dynamic workloads, which contain bursts and pose challenges for autoscaling methods to ensure application performance. Existing state-of-the-art autoscaling methods are burst-oblivious when determining and provisioning the appropriate resources. For dynamic workloads, it is hard to detect and handle bursts online while maintaining application performance. In this article, we propose a novel burst-aware autoscaling method which detects bursts in dynamic workloads using...

10.1109/tsc.2020.2995937 article EN IEEE Transactions on Services Computing 2020-05-20
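
A minimal sketch of burst detection over a dynamic workload, assuming a simple moving-average deviation test; the paper's actual detection mechanism is not described in the truncated abstract, so the window size and threshold here are placeholders.

```python
from collections import deque
from statistics import mean, stdev

class BurstDetector:
    """Flags a burst when the current request rate deviates sharply from the
    recent moving average (a stand-in for the paper's detection method)."""

    def __init__(self, window=60, threshold=3.0):
        self.history = deque(maxlen=window)  # recent requests/sec samples
        self.threshold = threshold           # how many std devs count as a burst

    def observe(self, requests_per_sec):
        """Record one sample and report whether it looks like a burst."""
        is_burst = False
        if len(self.history) >= 10:  # wait for enough history to be meaningful
            mu, sigma = mean(self.history), stdev(self.history)
            is_burst = requests_per_sec > mu + self.threshold * max(sigma, 1e-9)
        self.history.append(requests_per_sec)
        return is_burst
```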

Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and the Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments.

10.1145/3126908.3126933 article EN 2017-11-08
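
To illustrate topology-aware allocation, the sketch below scores candidate GPU sets by their slowest pairwise link, so a communication-heavy job avoids straddling slow interconnects. The bandwidth matrix and scoring rule are hypothetical examples, not the scheduler described in the paper.

```python
from itertools import combinations

# Hypothetical bandwidth matrix (GB/s) for 4 GPUs: NVLink-style pairs are
# faster than PCIe paths. Values are illustrative only.
BW = {
    (0, 1): 50, (2, 3): 50,   # fast directly-linked pairs
    (0, 2): 16, (0, 3): 16,
    (1, 2): 16, (1, 3): 16,   # slower PCIe paths
}

def pair_bw(a, b):
    return BW[tuple(sorted((a, b)))]

def best_placement(free_gpus, n):
    """Pick the n free GPUs whose slowest mutual link is fastest."""
    return max(combinations(free_gpus, n),
               key=lambda gpus: min((pair_bw(a, b)
                                     for a, b in combinations(gpus, 2)),
                                    default=float("inf")))

print(best_placement([0, 1, 2, 3], 2))  # -> (0, 1): a fast-linked pair
```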

Next generation data centers will be composed of thousands of hybrid systems in an attempt to increase overall cluster performance and to minimize energy consumption. New programming models, such as MapReduce, specifically designed to make the most of very large infrastructures, will be leveraged to develop massively distributed services. At the same time, data centers will bring an unprecedented degree of workload consolidation, hosting in the same infrastructure services from many different users. In this paper we present our advancements in leveraging...

10.1109/icpp.2010.73 article EN 2010-09-01

This paper presents a scheduling technique for multi-job MapReduce workloads that is able to dynamically build performance models of the executing workloads, and then use these models for scheduling purposes. This ability is leveraged to adaptively manage workload performance while observing and taking advantage of the particulars of the execution environment of modern data analytics applications, such as hardware heterogeneity and distributed storage. The technique targets a highly dynamic environment in which new jobs can be submitted at any time, and in which MapReduce workloads share physical resources with...

10.1109/tnsm.2012.122112.110163 article EN IEEE Transactions on Network and Service Management 2013-01-09
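
A minimal sketch of the "dynamically build performance models" idea: an online per-job estimate of task duration that is refined as tasks complete and can feed a completion-time prediction. The exponentially weighted update is an illustrative assumption, not the paper's model.

```python
class OnlineTaskModel:
    """Per-job performance model built online from observed task durations."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha          # weight given to the newest observation
        self.mean_duration = None   # seconds; unknown until first completion

    def record(self, duration):
        """Update the model each time a task of this job completes."""
        if self.mean_duration is None:
            self.mean_duration = duration
        else:
            self.mean_duration = (self.alpha * duration
                                  + (1 - self.alpha) * self.mean_duration)

    def estimate_remaining(self, pending_tasks, slots):
        """Predicted time to drain the pending tasks on `slots` slots."""
        waves = -(-pending_tasks // slots)  # ceiling division
        return waves * (self.mean_duration or 0.0)
```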

The recent upsurge in the available amount of health data and the advances in next-generation sequencing are setting the ground for the long-awaited precision medicine. To process this deluge of data, bioinformatics workloads are becoming more complex and computationally demanding. For these reasons they have been extended to support different computing architectures, such as GPUs and FPGAs, to leverage the form of parallelism typical of each of these architectures. This paper describes how a genomic workload such as k-mer frequency counting that takes...

10.1016/j.future.2018.11.028 article EN cc-by Future Generation Computer Systems 2018-11-22
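
k-mer frequency counting itself is compact to express; a minimal single-threaded Python version is shown below for reference. The paper's contribution concerns accelerating this kind of workload on architectures such as GPUs and FPGAs, which this sketch does not attempt.

```python
from collections import Counter

def kmer_counts(sequence, k):
    """Count occurrences of every length-k substring (k-mer) in a DNA read."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

# Each 3-mer in this toy read occurs once per repeat of "GATTACA".
print(kmer_counts("GATTACAGATTACA", 3).most_common(3))
```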

Modern applications demand resources at an unprecedented level. In this sense, data-centers are required to scale efficiently to cope with such demand. Resource disaggregation has the potential to improve resource-efficiency by allowing the deployment of workloads in more flexible ways. Therefore, the industry is shifting towards disaggregated architectures, which enables new ways to structure hardware resources in data centers. However, determining the best performing resource provisioning is a complicated task. The...

10.1186/s13677-021-00238-6 article EN cc-by Journal of Cloud Computing: Advances, Systems and Applications 2021-03-06

The emergence of Next Generation Sequencing (NGS) platforms has increased the throughput of genomic sequencing and in turn the amount of data that needs to be processed, requiring highly efficient computation for its analysis. In this context, modern architectures including accelerators and non-volatile memory are essential to enable the mass exploitation of these bioinformatics workloads. This paper presents a redesign of the main component of a state-of-the-art reference-free method for variant calling, SMUFIN, which has been...

10.1109/hpcc-smartcity-dss.2017.57 article EN 2017-12-01

We present our work on developing and training scalable graph foundation models (GFM) using HydraGNN, a multi-headed graph convolutional neural network architecture. HydraGNN expands the boundaries of graph neural networks (GNN) in both training scale and data diversity. It abstracts over message passing algorithms, allowing reproduction of and comparison across algorithmic innovations that define convolution in GNNs. This work discusses a series of optimizations that have allowed scaling up the GFM training to tens of thousands of GPUs on datasets that consist of hundreds of millions...

10.48550/arxiv.2406.12909 preprint EN arXiv (Cornell University) 2024-06-12

Conditional Restricted Boltzmann Machine (CRBM) is a promising candidate for multidimensional system modeling that can learn a probability distribution over a set of data. It is a specific type of artificial neural network with one input (visible) and one output (hidden) layer. Recently published works demonstrate that CRBM is a suitable mechanism for modeling time series such as human motion, workload characterization, and city traffic analysis. The process of learning and inference in these systems relies on linear algebra functions like...

10.1016/j.future.2019.10.025 article EN cc-by Future Generation Computer Systems 2019-11-02
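
To make the linear-algebra dependence concrete, here is a sketch of one CRBM-style inference step, where hidden-unit probabilities are conditioned on the current visible vector and a window of past frames; the matrix products dominate the cost. Shapes and symbols are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def crbm_hidden_probabilities(v_t, history, W, U, b):
    """One CRBM-like inference step: hidden unit activation probabilities.

    v_t:     (n_visible,)           current frame
    history: (n_history,)           concatenated past frames (the condition)
    W:       (n_visible, n_hidden)  visible-to-hidden weights
    U:       (n_history, n_hidden)  conditioning weights on the history
    b:       (n_hidden,)            hidden biases

    The two matrix-vector products below are the linear algebra kernels
    that dominate learning and inference cost.
    """
    return sigmoid(v_t @ W + history @ U + b)
```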

In this paper we present a MapReduce task scheduler for shared environments in which MapReduce is executed along with other resource-consuming workloads, such as transactional applications. All workloads may potentially share the same data store, some of them consuming data for analytics purposes while others act as data generators. This kind of scenario is becoming increasingly important in data centers where improved resource utilization can be achieved through workload consolidation, and is specially challenging due to...

10.1109/ccgrid.2014.65 article EN 2014-05-01

As the adoption of Big Data technologies becomes the norm in an increasing number of scenarios, there is also a growing need to optimize them for modern processors. Spark has gained momentum over the last few years among companies looking for high performance solutions that can scale out across different cluster sizes. At the same time, modern processors can be connected to large amounts of physical memory, in the range of up to terabytes. This opens an enormous range of opportunities for runtimes and applications that aim to improve their performance by leveraging low...

10.1109/bigdataservice.2018.00015 article EN 2018-03-01
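
As an example of the kind of large-memory tuning this line of work targets, the snippet below enables Spark's off-heap memory (`spark.memory.offHeap.enabled` and `spark.memory.offHeap.size` are standard Spark settings). The sizes are placeholders, not the configuration evaluated in the paper.

```python
from pyspark.sql import SparkSession

# Illustrative configuration only: sizes are placeholders. Off-heap storage
# keeps large working sets outside the JVM heap, reducing garbage-collection
# pressure on machines with very large physical memory.
spark = (SparkSession.builder
         .appName("large-memory-sketch")
         .config("spark.executor.memory", "64g")
         .config("spark.memory.offHeap.enabled", "true")
         .config("spark.memory.offHeap.size", "256g")
         .getOrCreate())
```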

The Smith-Waterman algorithm is primarily used in DNA and protein sequencing, where it helps by performing a local sequence alignment to determine similarities between biomolecule sequences. However, the inefficient performance of this algorithm limits its applications in the real world. In this perspective, this work presents two-fold contributions. It develops and evaluates a mathematical model for the algorithm targeting a distributed processing system. This model can be helpful to estimate how larger size sequences can be aligned with thread level parallelism, using a large set...

10.1109/ibcast.2016.7429876 article EN International Bhurban Conference on Applied Sciences and Technology (IBCAST) 2016-01-01
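
For reference, a textbook Smith-Waterman scoring pass is sketched below; its O(m·n) dynamic-programming cost is the work the paper's mathematical model distributes across threads. The scoring parameters are illustrative.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Textbook Smith-Waterman local alignment: fill the DP matrix and
    return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
            best = max(best, H[i][j])
    return best

print(smith_waterman("GATTACA", "GCATGCU"))
```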

Powered by deep learning, video analytics applications process millions of camera feeds in real-time to extract meaningful information from their surroundings. And this number grows by the minute. To avoid saturating the backhaul network and to provide lower latencies, a distributed heterogeneous edge cloud is postulated as a key enabler for widespread video analytics. This article provides a complete characterization of end-to-end video analytics across a set of hardware platforms and different neural network architectures. Each...

10.1002/cpe.6317 article EN cc-by-nc Concurrency and Computation: Practice and Experience 2021-05-07

Serverless computing is a cloud-based execution paradigm that allows provisioning resources on-demand, freeing developers from infrastructure management and operational concerns. It typically involves deploying workloads as stateless functions that take no resources when not in use, and that are meant to scale transparently. To make serverless effective, providers impose limits at the per-function level, such as a maximum execution duration, a fixed amount of memory, and no persistent local storage. These constraints make it challenging for...

10.1109/cloud53861.2021.00064 article EN 2021-09-01

Current distributed key-value stores generally provide greater scalability at the expense of weaker consistency and isolation. However, additional isolation support is becoming increasingly important in the environments in which these stores are deployed, where different kinds of applications with different needs are executed, from transactional workloads to data analytics. While fully-fledged ACID support may not be feasible, it is still possible to take advantage of the design of these stores, which often include the notion of multiversion concurrency control,...

10.1109/nca.2013.42 article EN 2013-08-01
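
A toy sketch of the multiversion concurrency control the abstract mentions: each write is kept under a commit timestamp, and a read at a snapshot sees the latest version no newer than that snapshot. The API is hypothetical and far simpler than a real store.

```python
import bisect

class MultiVersionStore:
    """Toy multiversion key-value store illustrating snapshot reads."""

    def __init__(self):
        self.versions = {}  # key -> sorted list of (timestamp, value)
        self.clock = 0      # monotonically increasing commit timestamp

    def put(self, key, value):
        """Commit a new version of `key` and return its timestamp."""
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def get(self, key, snapshot_ts):
        """Return the latest version of `key` visible at `snapshot_ts`."""
        chain = self.versions.get(key, [])
        i = bisect.bisect_right(chain, (snapshot_ts, chr(0x10FFFF)))
        return chain[i - 1][1] if i else None

store = MultiVersionStore()
store.put("x", "v1")
ts = store.put("x", "v2")
store.put("x", "v3")
print(store.get("x", ts))  # -> "v2": the snapshot ignores later versions
```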

Disaggregation of resources is a datacenter strategy that aims to decouple the physical location of resources from the place where they are accessed, as opposed to physically attached devices connected to the Peripheral Component Interconnect Express (PCIe) bus. By attaching and detaching resources through a fast interconnection network, it is possible to increase the flexibility to manage datacenter infrastructures while keeping the performance of pooled and disaggregated devices. This article introduces workload scheduling and placement policies for such environments...

10.1109/jsyst.2021.3090306 article EN IEEE Systems Journal 2021-07-07
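
To illustrate what a placement policy for disaggregated devices might look like, the sketch below prefers node-local GPUs and borrows pooled (remote) ones only when local ones run out. This locality-first rule is an illustrative assumption, not the article's actual policies.

```python
def place(job, nodes):
    """Greedy locality-first placement for disaggregated accelerators.

    job:   dict with the number of 'gpus' required.
    nodes: list of dicts with 'free_local' and 'free_pooled' GPU counts.
    Returns (node_index, local_used, pooled_used), or None if unplaceable.
    """
    best = None
    for idx, node in enumerate(nodes):
        local = min(node["free_local"], job["gpus"])
        pooled_needed = job["gpus"] - local
        if pooled_needed <= node["free_pooled"]:
            # Fewer pooled (remote) GPUs means less interconnect traffic.
            candidate = (pooled_needed, idx, local)
            if best is None or candidate < best:
                best = candidate
    if best is None:
        return None
    pooled, idx, local = best
    return idx, local, pooled

nodes = [{"free_local": 1, "free_pooled": 4},
         {"free_local": 2, "free_pooled": 4}]
print(place({"gpus": 2}, nodes))  # -> (1, 2, 0): the all-local option wins
```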