- Statistical Distribution Estimation and Applications
- Cloud Computing and Resource Management
- Probabilistic and Robust Engineering Design
- Statistical Methods and Bayesian Inference
- Bayesian Methods and Mixture Models
- Reliability and Maintenance Optimization
- Data Management and Algorithms
- Statistical Methods and Inference
- Advanced Database Systems and Queries
- Anomaly Detection Techniques and Applications
- 3D Shape Modeling and Analysis
- Probability and Risk Models
- Semiconductor Materials and Devices
- Data Stream Mining Techniques
- Smart Grid Energy Management
- Energy Load and Power Forecasting
- Advanced Statistical Methods and Models
- Distributed and Parallel Computing Systems
- Software System Performance and Reliability
- Integrated Circuits and Semiconductor Failure Analysis
- Electric Power System Optimization
- Topic Modeling
- Spam and Phishing Detection
- Demographic Modeling and Climate Adaptation
- Stochastic Gradient Optimization Techniques
Amazon (United States)
2018-2024
McMaster University
2010-2024
University of California, San Diego
2014-2015
IBM Research - India
2012-2014
IBM (United States)
2013
Carnegie Mellon University
2006
Personalized real-time recommendation has had a profound impact on retail, media, entertainment and other industries. However, developing recommender systems for every use case is costly, time-consuming and resource-intensive. To fill this gap, we present a black-box system that can adapt to a diverse set of scenarios without the need for manual tuning. We build on techniques that go beyond simple matrix factorization to incorporate important new sources of information: the temporal order of events [Hidasi et al., 2015],...
There has been a lot of excitement around using machine learning to improve the performance and usability of database systems. However, few of these techniques have actually been used in the critical path of customer-facing database services. In this paper, we describe Auto-WLM, a machine learning based automatic workload manager currently used in production in Amazon Redshift. Auto-WLM is an example of how machine learning can improve the performance of large data-warehouses in practice and at scale. Auto-WLM intelligently schedules workloads to maximize throughput and horizontally scales clusters in response to workload spikes....
Query scheduling is a critical task that directly impacts query performance in database management systems (DBMS). Deeply integrated schedulers, which require changes to DBMS internals, are usually customized for a specific engine and can take months to implement. In contrast, non-intrusive schedulers make coarse-grained decisions, such as controlling query admission and re-ordering query execution, without requiring modifications to DBMS internals. They require much less engineering effort and can be applied across a wide range of...
Growing environmental awareness and new government directives have set the stage for an increase in the fraction of energy supplied using renewable resources. The fast variation in renewable power, coupled with uncertainty in its availability, emphasizes the need for algorithms for intelligent online generation scheduling. These algorithms should allow us to compensate for the renewable resource when it is not available and should also account for physical generator constraints. We apply and extend recent work in the field of online optimization to the scheduling of generators in smart (micro) grids...
Large scale deployment of sensors is essential to practical applications in cyber physical systems. For instance, instrumenting a commercial building for 'smart energy' management requires deployment and operation of thousands of measurement and metering sensors and actuators that direct the operation of the HVAC system. Each of these sensors needs to be named consistently and constantly calibrated. Doing this process manually is not only time consuming but also error prone given the scale, heterogeneity and complexity of buildings as well as the lack of uniform naming schemas. To...
Query performance (e.g., execution time) prediction is a critical component of modern DBMSes. As a pioneering cloud data warehouse, Amazon Redshift relies on an accurate execution time prediction for many downstream tasks, ranging from high-level optimizations, such as automatically creating materialized views, to low-level tasks on the critical path of query execution, such as admission, scheduling, and execution resource control. Unfortunately, many existing execution time prediction techniques, including those used in Redshift, suffer from cold start issues, inaccurate...
Cloud-based data warehouses are built to be easy to use, requiring minimal intervention from customers as their workloads scale. However, there are still many dimensions of a workload that they do not scale with automatically. For example, in cloud-managed clusters, large ad-hoc queries and ETL queries must use the same cluster size provisioned for the rest of the workload, and the warehouse does not automatically grow as the underlying data grows in size, causing workloads to slow down. In this paper, we describe RAIS, the latest collection of AI-powered scaling...
Estimating the wake losses in a wind farm is critical in the short term forecast of wind power, following the Numerical Weather Prediction (NWP) approach. Understanding the intensity of wakes and the nature of their propagation within the wind farm still remains a challenge to scientists, engineers and utility operators. In this paper, five different machine learning methods are used to estimate the power deficit experienced by the wind turbines due to wake losses. Production data from the Horns Rev offshore wind farm, Denmark, have been used for this study. The linear regression,...
Circuit designers typically combat variations in hardware and workload by increasing conservative guardbanding that leads to operational inefficiency. Reducing this excessive guardband is highly desirable, but causes timing errors in synchronous circuits. We propose a methodology for supervised learning based models to predict timing errors at bit-level. We show that a logistic regression based model can effectively predict timing errors, given the amount of guardband reduction. The proposed methodology enables a model-based rule method to reduce guardband subject to a required...
The objective of this paper is to provide a new estimation method for parametric models under progressive Type-I censoring. First, we propose a Kaplan-Meier nonparametric estimator of the reliability function taken at the censoring times. It is based on the observable number of failures, and the number of censored units occurring from the progressive censoring scheme at the censoring times. This estimator is then shown to asymptotically follow a normal distribution. Next, minimum-distance...
As the output from solar PV systems varies significantly with technologies, designs and prevailing weather parameters, their evaluation under actual field conditions is important in identifying their real performance characteristics. In this paper, the comparative performances of six different PV technologies connected to a 1.2 MWp grid integrated solar farm are presented. The technologies considered are single crystalline silicon (sc-Si), poly crystalline silicon (mc-Si), micro crystalline silicon (nc-Si/a-Si), amorphous silicon (a-Si), Copper Indium Selenium (CIS)...
Rashmi Gangadharaiah, Balakrishnan Narayanaswamy, Charles Elkan. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers). 2018.
Growing fuel costs, environmental awareness, government directives, and an aggressive push to deploy Electric Vehicles (EVs) (a single EV consumes the equivalent of 3 to 10 homes) have led to a severe strain on a grid already on the brink. Maintaining the stability of the grid requires automatic agent based control of these loads and rapid coordination between them. In the literature, a number of iterative pricing, signaling and tâtonnement (or bargaining) approaches have been proposed to allow smart homes, storage devices and autonomous agents that...
The process of locating the end points of each speaker's voice in an audio file and then clustering the segments based on speaker identity is called speaker segmentation. In this paper we present a method for two speaker segmentation, though it can be extended to more than two speakers. Most methods for speaker segmentation start with an initial computationally inexpensive segmentation method, followed by a more accurate segment clustering. We describe a simple algorithm that improves the accuracy of the initial segmentation while not increasing the computational complexity. Since done...
In this article, we consider tests for the two null hypotheses and , widely useful tests in reliability, based on ranked set sampling (RSS). We derive the likelihood ratio tests as well as the associated exact and asymptotic results. Considering a fixed significance level and power of the test, we show that the proposed test statistic outperforms the existing test. In small sample cases, the proposed test leads to a much narrower confidence interval for the reliability function . Then, the test statistics obtained from simple random sampling and RSS schemes are compared through...