- Data Management and Algorithms
- Advanced Database Systems and Queries
- DNA and Biological Computing
- Distributed and Parallel Computing Systems
- Algorithms and Data Compression
- Advanced Data Storage Technologies
- Scientific Computing and Data Management
- Advanced biosensing and bioanalysis techniques
- Advanced Image and Video Retrieval Techniques
- Ovarian function and disorders
- Reproductive Biology and Fertility
- Service-Oriented Architecture and Web Services
- Data Mining Algorithms and Applications
- Time Series Analysis and Forecasting
- Graph Theory and Algorithms
- Parallel Computing and Optimization Techniques
- Business Process Modeling and Analysis
- Visual Attention and Saliency Detection
- Virtual Reality Applications and Impacts
- Advanced Clustering Algorithms Research
- Genomics and Phylogenetic Studies
- Data Visualization and Analytics
- Advanced Memory and Neural Computing
- Advanced Software Engineering Methodologies
- Environmental DNA in Biodiversity Studies
Imperial College London
2016-2025
University College London
2022
École Polytechnique Fédérale de Lausanne
2010-2016
École Normale Supérieure - PSL
2012
ETH Zurich
2006-2008
Syngenta (Switzerland)
2001
Infertility affects 1-in-6 couples, with repeated intensive cycles of assisted reproductive technology (ART) required by many to achieve a desired live birth. In ART, clinicians and laboratory staff typically consider patient characteristics, previous treatment responses, and ongoing monitoring to determine treatment decisions. However, the reproducibility, weighting, and interpretation of these characteristics are contentious and highly operator-dependent, resulting in considerable reliance on clinical...
Infertility affects one-in-six couples, often necessitating in vitro fertilization (IVF) treatment. IVF generates complex data, which can challenge the utilization of the full richness of the data during decision-making, leading to reliance on simple ‘rules-of-thumb’. Machine learning techniques are well-suited to analyzing complex data to provide data-driven recommendations that improve decision-making. In this multi-center study (n = 19,082 treatment-naive female patients), including 11 European centers, we...
Data lineage and data provenance are key to the management of scientific data. Not knowing the exact processing pipeline used to produce a derived data set often renders the data set useless from a scientific point of view. On the positive side, capturing provenance information is facilitated by the widespread use of workflow tools for processing scientific data. The workflow process describes all the steps involved in producing a given data set and, hence, captures its lineage. On the negative side, efficiently storing and querying workflow-based data lineage is not trivial. All existing solutions use recursive queries and even recursive tables to represent...
In this paper we present the design and evaluate the performance of an autonomic workflow execution engine. Although there exist many distributed workflow engines, in practice, it remains a difficult problem to deploy such systems in an optimal configuration. Furthermore, when facing an unpredictable workload with high variability, manual reconfiguration is not an option. Thanks to its autonomic controller, the engine features self-configuration, self-tuning and self-healing properties. The engine runs on a cluster of computers using a tuple space...
Efficient spatial joins are pivotal for many applications and particularly important for geographical information systems or for the simulation sciences, where scientists work with spatial models. Past research has primarily focused on disk-based spatial joins; efficient in-memory approaches, however, are important for two reasons: a) main memory has grown so large that many datasets fit in it and b) the in-memory join is a very time-consuming part of all disk-based spatial joins.
Semantic trajectory pattern mining is becoming more and more important with the rapidly growing volumes of semantically rich trajectory data. Extracting sequential patterns in semantic trajectories plays a key role in understanding the semantic behaviour of human movement, which can be widely used in many applications such as location-based advertising, road capacity optimisation, and urban planning. However, most existing works on semantic trajectory pattern mining focus on the entire spatial area, leading to missing some locally significant patterns within a region. Based on this...
In recent years, Oxford Nanopore Technologies (ONT) has gained substantial attention across various domains of nucleic acids’ research, owing to its unique advantages over other sequencing platforms. Originally developed for long-read sequencing, ONT technology has evolved, with recent advancements enhancing its applicability beyond long reads to include short, synthetic DNA-based applications. However, sequencing short DNA fragments with nanopore technology often results in lower data quality, likely due to a lack of protocols optimised...
Learned Index Structures (LIS) have significantly advanced data management by leveraging machine learning models to optimize data indexing. However, designing these structures often involves critical trade-offs, making it challenging for both designers and end-users to find an optimal balance tailored to specific workloads and scenarios. While some indexes offer adjustable parameters that demand intensive manual tuning, others rely on fixed configurations based on heuristic auto-tuners or expert knowledge,...
Phase imaging is gaining importance due to its applications in fields like biomedical imaging and material characterization. In biomedical applications, it can provide quantitative information missing in label-free microscopy modalities. One of the most prominent methods in phase quantification is the Transport-of-Intensity Equation (TIE). TIE often requires multiple acquisitions at different defocus distances, which is not always feasible in a clinical setting due to hardware constraints. To address this issue, we propose to use...
In recent years, Oxford Nanopore Technologies (ONT) has gained substantial attention across various domains of nucleic acid research, owing to its unique advantages over other sequencing platforms. Originally developed for long-read sequencing, ONT technology has evolved, with recent advancements enhancing its applicability beyond long reads to include short, synthetic DNA-based applications. However, sequencing short DNA fragments with nanopore technology often results in lower data quality, likely due to the absence of protocols optimised...
Scientific WorkFlows (SWFs) need to utilize components and applications in order to satisfy the requirements of specific workflow tasks. Technology trends in software development signify a move from a component-based to a service-oriented approach; therefore, SWFs will inevitably need appropriate tools to discover and integrate heterogeneous services. In this paper we present the SODIUM platform, consisting of a set of languages and tools, as well as related middleware, for the development and execution of scientific workflows composed
Neuroscientists increasingly use computational tools in building and simulating models of the brain. The amounts of data involved in these simulations are immense, and efficiently managing this data is key. One particular problem in analyzing this data is the scalable execution of range queries on spatial models of the brain. Known indexing approaches do not perform well even on today's small models, which represent a small fraction of the brain, containing only a few million densely packed spatial elements. The problem of current approaches is that with the increasing level of detail in the models, also the overlap in the tree...
An increasing number of Web services are being implemented using process management tools and languages (BPML, BPEL, etc.). The main advantage of processes is that designers can express complex business conversations at a high level of abstraction, even reusing standardized business protocols. The downside is that the infrastructure behind the Web service becomes more complex. This is particularly critical for Web services that may be subjected to high variability in demand and suffer from unpredictable peaks of heavy load. In this paper we present a flexible...
Today's scientists are quickly moving from in vitro to in silico experimentation: they no longer analyze natural phenomena in a petri dish, but instead build models and simulate them. Managing and analyzing the massive amounts of data involved in simulations is a major task. Yet, they lack the tools to efficiently work with data of this size. One problem many scientists share is the analysis of the massive spatial models they build. For several types of analysis they need to interactively follow the structures in the spatial model, e.g., the arterial tree, neuron fibers, etc., and issue range queries along the way. Each...
Machine learned models have recently been suggested as a rival for index structures such as B-trees and hash tables. An optimized learned index potentially has a significantly smaller memory footprint compared to its algorithmic counterparts, which alleviates the relatively high computational complexity of ML models. One unexplored aspect of learned index structures, however, is handling updates to the data and hence the model. In this paper we therefore discuss updates to the data and their implications for the model. Moreover, we suggest a method for eliminating the drift - the error caused by...
Mutual awareness of visual attention is crucial for successful collaboration. Previous research has explored various ways to represent visual attention, such as field-of-view visualizations and cursor visualizations based on eye-tracking, but these methods have limitations. Verbal communication is often utilized as a complementary strategy to overcome such disadvantages. This paper proposes a novel method that combines verbal communication with the Cone of Vision to improve gaze inference and mutual awareness in VR. We conducted a within-group study with pairs...
One of the most challenging tasks for the database administrator is to physically design the database to attain optimal performance for a given workload. Physical design is hard because it requires the selection of an optimal set of design features from a vast search space. There have been many commercial tools available to automatically suggest a physical design for a given set of queries. These tools are, however, based on greedy heuristic pruning, which reduces their usefulness. Furthermore, they are not interactive, as the APIs to simulate indexes and tables are product specific and hidden...
Mutual awareness of visual attention is essential for collaborative work. In the field of collaborative virtual environments (CVE), it has been proposed to use Field-of-View (FoV) frustum visualisations as a cue to support mutual awareness during collaboration. Recent studies on FoV frustum visualisations focus on asymmetric collaboration with AR/VR hardware setups and 3D reconstructed environments. In contrast, we focus on general-purpose CVEs (i.e., VR shared offices), whose popularity is increasing due to the availability of low-cost headsets, and the restrictions imposed by...
Efficiently querying multiple spatial data sets is a growing challenge for scientists. Astronomers query data sets that contain different types of stars (e.g., dwarfs, giants, stragglers) while neuroscientists query different data sets that model different aspects of the brain in the same space (e.g., neurons, synapses, blood vessels). The results of each query determine the combination of data sets to be queried next. Not knowing this combination a priori makes it hard to choose an efficient indexing strategy.