- Scientific Computing and Data Management
- Distributed and Parallel Computing Systems
- Advanced Data Storage Technologies
- Research Data Management Practices
- Parallel Computing and Optimization Techniques
- Cloud Computing and Resource Management
- Geological Modeling and Analysis
- Seismology and Earthquake Studies
- Scottish History and National Identity
- Seismic Imaging and Inversion Techniques
- Data Quality and Management
- Historical Studies of British Isles
- Web Data Mining and Analysis
- Software Engineering Research
- Semantic Web and Ontologies
- Advanced Database Systems and Queries
- Business Process Modeling and Analysis
- Big Data and Business Intelligence
- Service-Oriented Architecture and Web Services
- Geographic Information Systems Studies
- Reservoir Engineering and Simulation Methods
- Computational Physics and Python Applications
- Topic Modeling
- Cultural Industries and Urban Development
- Natural Language Processing Techniques
University of Edinburgh (2013-2024)
University of St Andrews (2022-2024)
El Paso Community College (2024)
Universidad Autónoma de Madrid (2022)
Universidad Complutense de Madrid (2022)
Edinburgh College (2019-2021)
Heriot-Watt University (2021)
British Geological Survey (2016-2019)
Engineering and Physical Sciences Research Council (2018)
Universidad Carlos III de Madrid (2005-2011)
Background: Cloud computing is a new paradigm that is changing how enterprises, institutions and people understand, perceive and use current software systems. With this paradigm, organizations no longer need to maintain their own servers or host their own software. Instead, everything is moved to the cloud and provided on demand, saving energy, physical space and technical staff. Cloud-based system architectures provide many advantages in terms of scalability, maintainability and massive data processing. Methods: We...
The landscape of workflow systems for scientific applications is notoriously convoluted, with hundreds of seemingly equivalent systems, many isolated research claims, and a steep learning curve. To address some of these challenges and lay the groundwork for transforming workflows development, the WorkflowsRI and ExaWorks projects partnered to bring the international community together. This paper reports on discussions and findings from two virtual "Workflows Community Summits" (January and April, 2021). The overarching goals...
Scientific workflows have been used almost universally across scientific domains and have underpinned some of the most significant discoveries of the past several decades. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale high-performance computing (HPC) platforms. These executions must be managed using software infrastructure. Due to the popularity of workflows, workflow management systems (WMSs)...
We present Asterism, an open-source data-intensive framework, which combines the strengths of traditional workflow management systems with new parallel stream-based dataflow systems to run applications across multiple heterogeneous resources, without users having to: re-formulate their methods according to different enactment engines; manage the distribution of data across systems; parallelize their methods; co-place and schedule methods with computing resources; or store and transfer large/small volumes of data. We also present the Data-Intensive workflows...
We present dispel4py, a versatile data-intensive kit presented as a standard Python library. It empowers scientists to experiment and test ideas using their familiar rapid-prototyping environment. It delivers mappings to diverse computing infrastructures, including cloud technologies, HPC architectures and specialised machines, to move seamlessly into production with large-scale data loads. The mappings are fully automated, so that the encoded analyses and data handling are completely unchanged. The underpinning model is...
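To illustrate the stream-based dataflow idea behind this abstract, the sketch below chains two "processing elements" in plain Python. It is a conceptual illustration only, not dispel4py's actual API; the function names and the final composition step are assumptions made for this example.

```python
# Conceptual sketch (not dispel4py's actual API): a tiny stream-based
# dataflow where each processing element consumes items from its input
# stream and emits items downstream.

def words(lines):
    """Producer element: split each incoming line into lower-cased words."""
    for line in lines:
        for word in line.split():
            yield word.lower()

def count(word_stream):
    """Consumer element: aggregate a running word count."""
    counts = {}
    for word in word_stream:
        counts[word] = counts.get(word, 0) + 1
    return counts

if __name__ == "__main__":
    stream = ["the quick brown fox", "the lazy dog"]
    # Composing the elements mirrors connecting nodes in a dataflow graph;
    # a library such as dispel4py maps such a graph onto different back-ends
    # (e.g. MPI or multiprocessing) without changing the analysis code.
    print(count(words(stream)))
```

The point of the abstraction is the last comment: the analysis code stays fixed while the mapping to an execution platform changes underneath it.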
Computational environmental science applications have evolved and become more complex over the last decade. In order to cope with the needs of such applications, computational methods and technologies have emerged to support the execution of these applications on heterogeneous, distributed systems. Among them are workflow management systems such as Pegasus. Pegasus is being used by researchers to model seismic wave propagation, discover new celestial objects, study RNA critical to human brain development, and investigate other important...
Scientific workflows are a cornerstone of modern scientific computing, and they have underpinned some of the most significant discoveries of the last decade. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale HPC platforms. Workflows will play a crucial role in the data-oriented and post-Moore's computing landscape as they democratize the application of cutting-edge research techniques, computationally intensive...
This paper presents an optimization of MPI communication, called Adaptive-CoMPI, based on runtime compression of the messages exchanged by applications. The technique developed can be used for any application, because its implementation is transparent to the user, and it integrates different compression algorithms for both collective and point-to-point primitives. Furthermore, compression can be turned off and the most appropriate algorithms are selected at runtime, depending on the characteristics of each message and the network behavior, with the algorithm following an adaptive...
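The general idea of runtime-adaptive message compression can be sketched in a few lines: probe whether a message is worth compressing before paying the cost of compressing all of it. The sketch below is illustrative only and is not Adaptive-CoMPI's implementation; the probe size and threshold are arbitrary assumptions.

```python
import os
import zlib

def maybe_compress(payload: bytes, probe_size: int = 4096, threshold: float = 0.9):
    """Illustrative adaptive compression: compress a small prefix of the message
    and only compress the full buffer if the probe shrinks well enough.
    Returns (data, was_compressed) so the receiver knows how to decode it."""
    probe = payload[:probe_size]
    ratio = len(zlib.compress(probe)) / max(len(probe), 1)
    if ratio < threshold:                  # probe compressed well: worth doing
        return zlib.compress(payload), True
    return payload, False                  # incompressible: send as-is

def decode(data: bytes, was_compressed: bool) -> bytes:
    return zlib.decompress(data) if was_compressed else data

if __name__ == "__main__":
    text_like = b"abc" * 50_000        # highly redundant, compresses well
    random_like = os.urandom(50_000)   # effectively incompressible
    for msg in (text_like, random_like):
        data, flag = maybe_compress(msg)
        assert decode(data, flag) == msg
        print(f"compressed={flag}: {len(msg)} -> {len(data)} bytes")
```

In a real MPI setting the (data, flag) pair would be sent over the wire, and the per-message decision would also weigh observed network and CPU costs, as the abstract describes.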
The VERCE project has pioneered an e-Infrastructure to support researchers using established simulation codes on high-performance computers in conjunction with multiple sources of observational data. This is accessed and organised via a science gateway that makes it convenient for seismologists to use these resources from any location via the Internet. Their data handling is made flexible and scalable by two Python libraries, ObsPy and dispel4py, and by services delivered by ORFEUS and EUDAT. Provenance-driven tools enable...
In the last 20 years quite a few mature workflow engines and editors have been developed to support communities in managing workflows. While there is a trend among providers towards easing the creation of workflows tailored to their specific system, workflow management tools still often require a deep understanding of concepts and languages. This paper describes an approach targeting various workflow systems by building a single user interface for editing and monitoring workflows, taking into consideration aspects such as optimization and provenance data. The...
This paper introduces Laminar, a novel serverless framework based on dispel4py, a parallel stream-based dataflow library. Laminar efficiently manages streaming workflows and components through a dedicated registry, offering a seamless experience. Leveraging large language models, Laminar enhances the framework with semantic code search, summarization, and completion. This work contributes to serverless computing by simplifying the execution of computations, managing data streams more efficiently, and providing a valuable tool for both researchers and practitioners.
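Semantic code search over a component registry, as mentioned in this abstract, typically amounts to embedding both the query and each component description and ranking by similarity. The sketch below shows that pattern only; the bag-of-words "embedding" is a toy stand-in for an LLM embedding model, and the registry entries are hypothetical, not Laminar's.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an LLM embedding: a bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical registry of component descriptions.
registry = {
    "sentiment_pe": "classify the sentiment of streaming text records",
    "fft_pe": "apply a fast Fourier transform to seismic traces",
    "resample_pe": "resample time series data to a fixed rate",
}

def search(query: str, top_k: int = 2):
    """Rank registry components by similarity to the query."""
    q = embed(query)
    ranked = sorted(registry, key=lambda name: cosine(q, embed(registry[name])), reverse=True)
    return ranked[:top_k]

if __name__ == "__main__":
    print(search("transform seismic signal data"))  # fft_pe ranks first
```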
The effects of the Covid-19 pandemic on Creative and Cultural Industries can be difficult to quantify. Metadata about events (theatre productions, music and comedy gigs, sporting fixtures, days out, and more) are an untapped resource for cultural analytics that can be used as a proxy metric for financial and social impact. This article uses a sample of large-scale data from the UK industry provider Data Thistle to ask: how can we quantify, at scale, the sector in a particular region? We analysed changes in event provision in Edinburgh in August...