- Cloud Computing and Resource Management
- Scientific Computing and Data Management
- Distributed and Parallel Computing Systems
- Gene Expression and Cancer Classification
- Geochemistry and Geologic Mapping
- Cancer Genomics and Diagnostics
- Remote-Sensing Image Classification
- Advanced Image and Video Retrieval Techniques
- Open Source Software Innovations
- Muscle Activation and Electromyography Studies
- EEG and Brain-Computer Interfaces
- Motor Control and Adaptation
- Bioinformatics and Genomic Networks
- Remote Sensing in Agriculture
- Insect Resistance and Genetics
- Digital Innovation in Industries
- Genetic Factors in Colorectal Cancer
- IoT and Edge/Fog Computing
- Software Engineering Research
- Software-Defined Networks and 5G
- Argonne National Laboratory (2021-2024)
- University of Chicago (2012-2021)
- University of Houston (2012)
- University of Illinois Chicago (2012)
As large genomics and phenotypic datasets become more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with the tools and resources to analyze it.
In this paper we describe the design, implementation, and operation of the Open Science Data Cloud, or OSDC. The goal of the OSDC is to provide petabyte-scale data cloud infrastructure and related services for scientists working with large quantities of data. Currently, the OSDC consists of more than 2000 cores and 2 PB of storage distributed across four data centers connected by 10G networks. We discuss some of the lessons learned during the past three years of operation, the software stacks used in the OSDC, and some of the research projects in biology, the earth sciences, and the social sciences that the OSDC has enabled.
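The OSDC abstract does not name its cloud stack, though the related Project Matsu work below runs on OpenStack. As a hedged illustration only, here is a minimal sketch of provisioning a compute instance on an OpenStack-style science cloud with the openstacksdk library; the cloud name "osdc" and the image, flavor, and key pair names are hypothetical placeholders, not actual OSDC identifiers.

```python
# Minimal sketch: provisioning a VM on an OpenStack-style science cloud.
# Assumes openstacksdk (pip install openstacksdk) and a configured
# clouds.yaml entry; the cloud name "osdc" and resource names below are
# hypothetical placeholders, not actual OSDC identifiers.
import openstack

conn = openstack.connect(cloud="osdc")  # reads credentials from clouds.yaml

# Launch a worker instance for an analysis pipeline.
server = conn.create_server(
    name="analysis-worker-1",
    image="ubuntu-22.04",   # hypothetical image name
    flavor="m1.large",      # hypothetical flavor name
    key_name="my-keypair",  # hypothetical key pair
    wait=True,              # block until the server is ACTIVE
)
print(server.status, server.id)
```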
The Exascale Computing Project (ECP) software deployment effort developed and advanced DevOps capabilities. One goal was to enable robust continuous integration (CI) workflows that span the protected high performance computing (HPC) environments found within many of the Department of Energy's (DOE) national laboratories. This article highlights several challenges encountered in enabling this automation, such as charging models for CI jobs and meeting the individualized security requirements that revolve around...
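As a hedged sketch of the general pattern the excerpt describes, not ECP's actual runner tooling, the snippet below shows how a CI test step might be translated into a Slurm batch submission so that compute time is charged to a project allocation; the account and partition names are invented placeholders.

```python
# Hedged sketch: running a CI test step as a Slurm batch job so that the
# compute time is charged to a project allocation. This illustrates the
# general pattern only; ECP's actual runner tooling is not shown here.
import subprocess

def submit_ci_job(script: str, account: str, partition: str) -> str:
    """Submit a CI step via sbatch and return the scheduler's job id."""
    result = subprocess.run(
        [
            "sbatch",
            "--parsable",               # print just the job id
            f"--account={account}",     # charge the CI job to an allocation
            f"--partition={partition}", # hypothetical partition name
            "--time=00:30:00",
            f"--wrap={script}",         # wrap the shell command in a job
        ],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

job_id = submit_ci_job("ctest --output-on-failure", "ci-alloc", "debug")
print(f"submitted CI job {job_id}")
```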
Hadoop has emerged as an important platform for data intensive computing. The shuffle and sort phases of a MapReduce computation often saturate the top-of-rack switches, as well as the switches that aggregate multiple racks. In addition, computations can have "hot spots" in which completion time is lengthened due to inadequate bandwidth to some nodes. In principle, OpenFlow enables an application to adjust the network topology as required by the computation, providing additional bandwidth to those resources requiring it. We describe Hadoop-OFE, an OpenFlow-enabled...
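To make the shuffle bottleneck concrete, here is a small self-contained toy model (not code from the paper): it estimates the shuffle traffic crossing each rack's uplink from a mapper-to-reducer transfer matrix and flags links that a fixed topology would saturate, which is the kind of hot spot Hadoop-OFE targets. All rack assignments, transfer sizes, and capacities are made-up numbers.

```python
# Toy model (not from the paper): estimate shuffle traffic crossing each
# rack's uplink and flag links that would saturate during shuffle/sort.
# Rack assignments, transfer sizes, and capacities are made-up numbers.
from collections import defaultdict

rack_of = {"m1": "rackA", "m2": "rackA", "r1": "rackB", "r2": "rackB"}
# Bytes each mapper sends to each reducer during the shuffle phase.
shuffle_bytes = {
    ("m1", "r1"): 40e9, ("m1", "r2"): 5e9,
    ("m2", "r1"): 35e9, ("m2", "r2"): 5e9,
}
uplink_capacity_bytes = 50e9  # what one uplink can carry in the window

uplink_load = defaultdict(float)
for (mapper, reducer), nbytes in shuffle_bytes.items():
    if rack_of[mapper] != rack_of[reducer]:
        # Cross-rack transfers traverse both racks' uplinks.
        uplink_load[rack_of[mapper]] += nbytes
        uplink_load[rack_of[reducer]] += nbytes

for rack, load in uplink_load.items():
    if load > uplink_capacity_bytes:
        # A hot spot: the kind of link Hadoop-OFE would ask an OpenFlow
        # controller to give extra bandwidth for the shuffle's duration.
        print(f"{rack} uplink hot: {load / 1e9:.0f} GB > capacity")
```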
Project Matsu is a collaboration between the Open Commons Consortium and NASA focused on developing open source technology for the cloud-based processing of Earth satellite imagery. A particular focus is the development of applications for detecting fires and floods to help support natural disaster detection and relief. Project Matsu has developed an infrastructure to process, analyze, and reanalyze large collections of hyperspectral satellite image data using OpenStack, Hadoop, MapReduce, Storm, and related technologies. We describe a framework for efficient...
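As a hedged illustration of the MapReduce pattern the abstract mentions (not Project Matsu's actual detectors), the sketch below maps a per-tile test over hyperspectral tiles and reduces the hits into a per-scene summary; the band index and threshold are invented placeholders.

```python
# Hedged sketch of the MapReduce pattern described above, in plain
# Python rather than Hadoop. The band index and threshold used for
# "hot" pixels are invented placeholders, not Matsu's detectors.
from itertools import groupby

def map_tile(tile):
    """Map: emit (scene_id, hot_pixel_count) for one hyperspectral tile."""
    scene_id, pixels = tile  # pixels: list of per-pixel band vectors
    hot = sum(1 for bands in pixels if bands[10] > 0.8)  # toy band test
    return (scene_id, hot)

def reduce_scene(scene_id, counts):
    """Reduce: total hot pixels per scene -> candidate fire/flood alert."""
    return (scene_id, sum(counts))

tiles = [("EO1-0001", [[0.1] * 16, [0.9] * 16]),
         ("EO1-0001", [[0.95] * 16]),
         ("EO1-0002", [[0.2] * 16])]

mapped = sorted(map(map_tile, tiles))  # stand-in for shuffle/sort
for scene_id, group in groupby(mapped, key=lambda kv: kv[0]):
    print(reduce_scene(scene_id, [count for _, count in group]))
```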
Project Matsu is a collaboration between the Open Commons Consortium and NASA focused on developing open source technology for the cloud-based processing of Earth satellite imagery, in particular for detecting fires and floods to help support natural disaster detection and relief. We describe a framework for the efficient analysis and reanalysis of large amounts of data, called the "Wheel," and the analytics used to process the hyperspectral data produced daily by NASA's Earth Observing-1 (EO-1) satellite. The wheel is designed to be able to support scanning queries using cloud computing...
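The "Wheel" idea, as described here, is that many analytics share one pass over the archive instead of each re-scanning the data. Below is a minimal hedged sketch of that scheduling pattern; the analytic functions and scene records are toy stand-ins, not Matsu's actual analytics.

```python
# Minimal sketch of the "wheel" scanning-query pattern: one pass over
# the archive applies every registered analytic to each scene, instead
# of each analytic re-scanning the data. Analytics are toy stand-ins.

def fire_detector(scene):
    return {"analytic": "fire", "scene": scene["id"],
            "hit": scene["max_swir"] > 0.8}

def flood_detector(scene):
    return {"analytic": "flood", "scene": scene["id"],
            "hit": scene["water_fraction"] > 0.5}

ANALYTICS = [fire_detector, flood_detector]  # queries riding the wheel

def wheel_scan(archive):
    """Single scan: every analytic sees every scene exactly once."""
    for scene in archive:           # one sequential pass over the data
        for analytic in ANALYTICS:  # all registered scanning queries
            yield analytic(scene)

archive = [{"id": "EO1-0001", "max_swir": 0.9, "water_fraction": 0.1},
           {"id": "EO1-0002", "max_swir": 0.2, "water_fraction": 0.7}]

for report in wheel_scan(archive):
    print(report)
```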