- Scientific Computing and Data Management
- Advanced Data Storage Technologies
- Distributed and Parallel Computing Systems
- Medical Imaging Techniques and Applications
- Cloud Computing and Resource Management
- Advanced X-ray Imaging Techniques
- Advanced X-ray and CT Imaging
- Healthcare Operations and Scheduling Optimization
- Meteorological Phenomena and Simulations
- Peer-to-Peer Network Technologies
- Machine Learning in Materials Science
- Emergency and Acute Care Studies
- Remote Sensing in Agriculture
- Machine Learning and Data Classification
- Advanced Database Systems and Queries
- Radiomics and Machine Learning in Medical Imaging
- Advanced Neural Network Applications
- Software System Performance and Reliability
- Cryospheric Studies and Observations
- Advanced MRI Techniques and Applications
- Distributed Systems and Fault Tolerance
- Climate Variability and Models
- Stochastic Gradient Optimization Techniques
- Plant Water Relations and Carbon Dynamics
- Research Data Management Practices
- Argonne National Laboratory (2017-2024)
- University of Chicago (2019-2024)
- Amazon (United States) (2024)
- University of Nevada, Reno (2024)
- University of Illinois Chicago (2020-2024)
- University of Houston (2024)
- Weifang Medical University (2024)
- Division of Cancer Epidemiology and Genetics (2023)
- Shanxi Agricultural University (2021-2023)
- National Cancer Institute (2023)
Synchrotron-based x-ray tomography is a noninvasive imaging technique that allows for reconstructing the internal structure of materials at high spatial resolutions, from tens of micrometers to a few nanometers. In order to resolve sample features at smaller length scales, however, a higher radiation dose is required. Therefore, the limitation on achievable resolution is set primarily by noise at these scales. We present TomoGAN, a denoising technique based on generative adversarial networks, for improving the quality of reconstructed images...
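A minimal sketch of the adversarial-denoising idea, assuming a PyTorch setup with paired low-dose (noisy) and normal-dose (clean) reconstructions; the network sizes, losses, and hyperparameters below are illustrative placeholders, not the published TomoGAN architecture.

```python
# Sketch of GAN-based denoising for tomographic reconstructions (hypothetical
# shapes and hyperparameters; the published model uses a U-Net generator).
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    """Small convolutional generator: noisy reconstruction -> denoised image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

class Critic(nn.Module):
    """Patch discriminator scoring how 'clean' an image looks."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )
    def forward(self, x):
        return self.net(x)

gen, critic = Denoiser(), Critic()
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(critic.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(noisy, clean, adv_weight=0.01):
    # Discriminator step: distinguish clean targets from denoised outputs.
    with torch.no_grad():
        fake = gen(noisy)
    d_real, d_fake = critic(clean), critic(fake)
    d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: pixel-wise fidelity plus a small adversarial term.
    fake = gen(noisy)
    g_adv = critic(fake)
    g_loss = nn.functional.mse_loss(fake, clean) + adv_weight * bce(g_adv, torch.ones_like(g_adv))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return float(d_loss), float(g_loss)

# Random tensors stand in for paired low-dose and normal-dose reconstructions.
noisy = torch.randn(4, 1, 64, 64)
clean = torch.randn(4, 1, 64, 64)
print(train_step(noisy, clean))
```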
Abstract. This study develops a neural-network-based approach for emulating high-resolution modeled precipitation data with comparable statistical properties but at greatly reduced computational cost. The key idea is to use a combination of low- and high-resolution simulations (that differ not only in spatial resolution but also in geospatial patterns) to train a neural network to map from the former to the latter. Specifically, we define two types of CNNs, one that stacks variables directly and one that encodes each variable before stacking, and the CNN...
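A minimal sketch of the "stack variables directly" variant, assuming PyTorch: a small CNN that maps a stack of coarse-resolution predictor fields to a finer precipitation grid. The variable count, 4x upsampling factor, and layer sizes are assumptions for illustration, not the paper's configuration.

```python
# Sketch of a low-resolution-to-high-resolution precipitation emulator.
import torch
import torch.nn as nn

class PrecipEmulator(nn.Module):
    def __init__(self, n_vars=5, upscale=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_vars, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=upscale, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, 3, padding=1),
            nn.Softplus(),  # precipitation is non-negative
        )

    def forward(self, coarse_fields):
        # coarse_fields: (batch, n_vars, H, W) stacked low-resolution predictors
        return self.net(coarse_fields)

model = PrecipEmulator()
coarse = torch.randn(8, 5, 32, 32)   # 8 samples of 5 stacked coarse variables
fine_precip = model(coarse)          # -> (8, 1, 128, 128) emulated field
print(fine_precip.shape)
```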
Understanding the spatio-temporal changes of vegetation and its climatic control factors can provide an important theoretical basis for the protection and restoration of eco-environments. In this study, we analyzed the normalized difference vegetation index (NDVI) in the Chinese Loess Plateau (CLP) from 2002 to 2018 via trend analysis, stability analysis, and the Mann-Kendall mutation test to investigate changes in vegetation. In addition, we also used skewness analysis and correlation analysis to explore the contributions of climate and human activities to regional changes. The...
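One building block of this kind of analysis is the Mann-Kendall trend test; a minimal version for a single pixel's annual NDVI series might look like the sketch below (no tie correction, synthetic values rather than the study's data).

```python
# Minimal Mann-Kendall trend test for a 1-D NDVI time series.
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    """Return (S, Z, p) for a 1-D series x; positive Z indicates an upward trend."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # S counts concordant minus discordant pairs.
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0   # variance without tie correction
    if s > 0:
        z = (s - 1) / np.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / np.sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - norm.cdf(abs(z)))             # two-sided p-value
    return s, z, p

# Annual mean NDVI for one pixel, 2002-2018 (synthetic, for illustration only).
ndvi = 0.35 + 0.004 * np.arange(17) + np.random.default_rng(0).normal(0, 0.01, 17)
print(mann_kendall(ndvi))
```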
Powerful detectors at modern experimental facilities routinely collect data at multiple GB/s. Online analysis methods are needed to enable the collection of only interesting subsets of such massive data streams, for example by explicitly discarding some data elements or by directing instruments to relevant areas of experimental space. Thus, methods are required for configuring and running distributed computing pipelines—what we call flows—that link instruments, computers (e.g., for analysis, simulation, artificial intelligence [AI] model training), edge...
Abstract A concise and measurable set of FAIR (Findable, Accessible, Interoperable, and Reusable) principles for scientific data is transforming the state of practice for data management and stewardship, supporting and enabling discovery and innovation. Learning from this initiative, and acknowledging the impact of artificial intelligence (AI) in the practice of science and engineering, we introduce a practical, concise, and measurable set of FAIR principles for AI models. We showcase how to create and share FAIR data and AI models within a unified computational framework combining the following elements:...
Coherent imaging techniques provide an unparalleled multi-scale view of materials across scientific and technological fields, from structural materials to quantum devices, integrated circuits, and biological cells. Driven by the construction of brighter sources and high-rate detectors, coherent imaging methods like ptychography are poised to revolutionize nanoscale characterization. However, these advancements are accompanied by significant increases in data and compute needs, which preclude real-time imaging, feedback, and decision-making...
In recent years, thanks to the low cost of deploying and maintaining an Unmanned Aerial Vehicle (UAV) system and the possibility of operating them in areas that are inaccessible or dangerous for human pilots, UAVs have attracted much research attention in both the military field and civilian applications. In order to deal with more sophisticated tasks, such as searching for survival points and multiple-target monitoring and tracking, the application of UAV swarms is foreseen. This requires complex control, communication, and coordination mechanisms....
Disk-to-disk wide-area file transfers involve many subsystems and tunable application parameters that pose significant challenges for bottleneck detection, system optimization, and performance prediction. Performance models can be used to address these challenges but have not proved generally usable, because of a need for extensive online experiments to characterize subsystems. We show here how to overcome the need for such experiments by applying machine learning methods to historical data to estimate predictive models. Starting with log...
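The approach could be sketched as fitting a regression model to features extracted from historical transfer logs rather than running new experiments; the feature names, synthetic data, and gradient-boosting choice below are illustrative assumptions, not the paper's exact pipeline or log schema.

```python
# Sketch: learn a transfer-performance model from (synthetic) historical logs.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
n = 2000
# Hypothetical per-transfer features: total bytes, file count, concurrency,
# parallel streams per file, and round-trip time.
X = np.column_stack([
    rng.lognormal(22, 2, n),        # bytes
    rng.integers(1, 10000, n),      # files
    rng.integers(1, 8, n),          # concurrency
    rng.integers(1, 16, n),         # parallelism
    rng.uniform(1, 200, n),         # RTT (ms)
])
# Synthetic "achieved throughput" standing in for logged values.
y = (0.5 * np.log(X[:, 0]) + 0.2 * X[:, 2] + 0.1 * X[:, 3] - 0.01 * X[:, 4]
     + rng.normal(0, 0.5, n))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, y_tr)
print("MAE:", mean_absolute_error(y_te, model.predict(X_te)))
```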
HPC workload analysis and resource consumption characteristics are the key to driving better operational practices, system procurement decisions, and the design of effective management techniques. Unfortunately, the community does not have easy access to long-term introspective workload characterizations for production-scale systems. This study bridges this gap by providing a detailed quantification and characterization of job behavior on two supercomputers: Intrepid and Mira. It is one of the largest studies of its kind - covering trends...
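In miniature, this kind of job-log quantification might look like the pandas sketch below; the column names and synthetic log are hypothetical stand-ins for the Intrepid/Mira workload traces.

```python
# Sketch: characterize a (synthetic) job log by job size class.
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
log = pd.DataFrame({
    "nodes": rng.choice([512, 1024, 2048, 4096, 8192], size=1000),
    "runtime_s": rng.exponential(3600, size=1000),
    "queue_wait_s": rng.exponential(1800, size=1000),
})
log["node_hours"] = log["nodes"] * log["runtime_s"] / 3600.0

# Job-size distribution, delivered node-hours, and median queue wait per class.
by_size = log.groupby("nodes").agg(
    jobs=("nodes", "size"),
    node_hours=("node_hours", "sum"),
    median_wait_s=("queue_wait_s", "median"),
)
by_size["node_hour_share"] = by_size["node_hours"] / by_size["node_hours"].sum()
print(by_size)
```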
X-ray diffraction based microscopy techniques such as high-energy diffraction microscopy (HEDM) rely on knowledge of the positions of diffraction peaks with high precision. These positions are typically computed by fitting the observed intensities in detector data to a theoretical peak shape such as pseudo-Voigt. As experiments become more complex and detector technologies evolve, the computational cost of peak-shape fitting becomes the biggest hurdle to the rapid analysis required for real-time feedback during experiments. To this end, we propose BraggNN, a deep-learning based method that...
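For context, the conventional baseline that BraggNN aims to replace is iterative peak-shape fitting; a 1-D pseudo-Voigt fit (real HEDM peaks are fit in 2-D on detector patches) can be sketched as follows, with synthetic intensities.

```python
# Baseline illustration: locate a peak center by fitting a pseudo-Voigt profile.
import numpy as np
from scipy.optimize import curve_fit

def pseudo_voigt(x, amp, center, width, eta, offset):
    """Linear mix of Gaussian and Lorentzian with mixing parameter eta."""
    gauss = np.exp(-((x - center) ** 2) / (2 * width ** 2))
    lorentz = 1.0 / (1.0 + ((x - center) / width) ** 2)
    return offset + amp * (eta * lorentz + (1 - eta) * gauss)

x = np.arange(64, dtype=float)
true = pseudo_voigt(x, amp=900, center=31.4, width=2.3, eta=0.4, offset=50)
observed = np.random.default_rng(2).poisson(true).astype(float)  # counting noise

p0 = [observed.max(), x[np.argmax(observed)], 2.0, 0.5, observed.min()]
popt, _ = curve_fit(pseudo_voigt, x, observed, p0=p0)
print("fitted center:", popt[1])   # the quantity BraggNN predicts directly
```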
Query performance (e.g., execution time) prediction is a critical component of modern DBMSes. As a pioneering cloud data warehouse, Amazon Redshift relies on accurate execution-time prediction for many downstream tasks, ranging from high-level optimizations, such as automatically creating materialized views, to low-level tasks on the critical path of query execution, such as admission, scheduling, and resource control. Unfortunately, existing techniques, including those used in Redshift, suffer from cold-start issues, inaccurate...
Abstract Enabled by multiple modalities of smart materials-based actuation and sensing, full waveform inversion (FWI) nowadays is an advanced ultrasound computed tomography technique that utilizes waveform data to generate high-resolution images of scanned regions. This technology offers promise for defect/damage detection and disease diagnosis, showing potential in nondestructive testing, structural health monitoring, and medical imaging. To reduce the lengthy computational time caused by time-domain FWI, modern...
Abstract Ultrasound computed tomography (USCT) shows great promise in nondestructive evaluation and medical imaging due to its ability to quickly scan and collect data from a region of interest. However, existing approaches are a tradeoff between the accuracy of the prediction and the speed at which the data can be analyzed, and processing the collected data into a meaningful image requires both time and computational resources. We propose to develop convolutional neural networks (CNNs) to accelerate and enhance the inversion results and reveal underlying...
Cloud-based data warehouses are built to be easy to use, requiring minimal intervention from customers as their workloads scale. However, there are still many dimensions of a workload that they do not scale with automatically. For example, in cloud-managed clusters, large ad-hoc queries and ETL jobs must use the same cluster size provisioned for the rest of the workload, and the warehouse does not automatically grow as the underlying data grows in size, causing queries to slow down. In this paper, we describe RAIS, the latest collection of AI-powered scaling...
Wide-area data transfers play an important role in many science applications but rely on expensive infrastructure that often delivers disappointing performance in practice. In response, we present a systematic examination of a large set of data transfer logs to characterize transfer characteristics, including the nature of the datasets transferred, achieved throughput, user behavior, and resource usage. This analysis yields new insights that can help design better data transfer tools, optimize the networking and edge resources used for transfers,...
Assimilating Sentinel-2 images with the CERES-Wheat model can improve the precision of winter wheat yield estimates at a regional scale. To verify this method, we applied an ensemble Kalman filter (EnKF) to assimilate the leaf area index (LAI) derived from Sentinel-2 data and simulated by the model. From this, we obtained the assimilated daily LAI during the growth stage across three counties located in the southeast Loess Plateau of China: Xiangfen, Xinjiang, and Wenxi. We assigned weights to different growth stages by comparing the improved analytic...
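A minimal perturbed-observation EnKF update for a scalar LAI state illustrates the assimilation step in isolation; the ensemble values and observation-error variance below are synthetic, not the study's CERES-Wheat or Sentinel-2 numbers.

```python
# Sketch: one ensemble Kalman filter update of modeled LAI with an observed LAI.
import numpy as np

def enkf_update(ensemble, obs, obs_var, rng):
    """Update a 1-D ensemble of model LAI values with one observed LAI."""
    ens = np.asarray(ensemble, dtype=float)
    p = ens.var(ddof=1)                      # forecast (background) variance
    k = p / (p + obs_var)                    # Kalman gain for a direct observation
    # Perturbed-observation EnKF: each member sees a noisy copy of the observation.
    perturbed = obs + rng.normal(0.0, np.sqrt(obs_var), size=ens.shape)
    return ens + k * (perturbed - ens)

rng = np.random.default_rng(3)
model_lai = rng.normal(3.2, 0.4, size=50)    # ensemble of model-simulated LAI
sentinel_lai = 3.8                           # LAI retrieved from imagery
analysis = enkf_update(model_lai, sentinel_lai, obs_var=0.1, rng=rng)
print(model_lai.mean(), "->", analysis.mean())
```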
Extremely high data rates at modern synchrotron and X-ray free-electron laser light source beamlines motivate the use of machine learning methods for data reduction, feature detection, and other purposes. Regardless of the application, the basic concept is the same: data collected in the early stages of an experiment, from past similar experiments, and/or simulated for an upcoming experiment are used to train models that, in effect, learn specific characteristics of those data; these models are then used to process subsequent data more efficiently than would...
Wide-area file transfers play an important role in many science applications. File transfer tools typically deliver the highest performance for datasets with a small number of large files, but many science datasets consist of many small files. Thus it is important to understand the factors that contribute to decreased wide-area data transfer performance. To this end, we (i) benchmark the subsystems involved in end-to-end transfers between two HPC facilities with a many-file dataset representative of production transfers; (ii) characterize the per-file overhead introduced by different subsystems; (iii)...
Experimental protocols at synchrotron light sources typically process and validate data only after an experiment has completed, which can lead to undetected errors and cannot enable online steering. Real-time analysis can enable both the detection of, and recovery from, errors, as well as optimization of data acquisition. However, modern scientific instruments, such as detectors at light sources, generate data at GB/sec rates. Data processing methods such as the widely used computational tomography usually require considerable computational resources, and yield poor...