- Parallel Computing and Optimization Techniques
- Advanced Neural Network Applications
- Advanced Data Storage Technologies
- Software Testing and Debugging Techniques
- IoT and Edge/Fog Computing
- Recommender Systems and Techniques
- Freezing and Crystallization Processes
- Advanced machining processes and optimization
- Food Industry and Aquatic Biology
- Topic Modeling
- Green IT and Sustainability
- Mobile Crowdsensing and Crowdsourcing
- Advanced Surface Polishing Techniques
- Explainable Artificial Intelligence (XAI)
- Advanced Machining and Optimization Techniques
- Software Engineering Research
Delhi Technological University
2021
Meta (United States)
2020
Menlo School
2020
This paper explores the environmental impact of the super-linear growth trends for AI from a holistic perspective, spanning Data, Algorithms, and System Hardware. We characterize the carbon footprint of AI computing by examining the model development cycle across industry-scale machine learning use cases and, at the same time, considering the life cycle of system hardware. Taking a step further, we capture the operational and manufacturing carbon footprint of AI computing and present an end-to-end analysis for what and how hardware-software design and at-scale optimization can help...
This paper introduces two extensions to the popular PyTorch machine learning framework, TorchDynamo and TorchInductor, which implement the torch.compile feature released in PyTorch 2. TorchDynamo is a Python-level just-in-time (JIT) compiler that enables graph compilation in PyTorch programs without sacrificing the flexibility of Python. It achieves this by dynamically modifying Python bytecode before execution and extracting sequences of PyTorch operations into an FX graph, which is then JIT compiled using one of many extensible backends. TorchInductor...
It is widely acknowledged that large models have the potential to deliver superior performance across a broad range of domains. Despite the remarkable progress made in the field of machine learning systems research, which has enabled the development and exploration of large models, such abilities remain confined to a small group of advanced users and industry leaders, resulting in an implicit technical barrier for the wider community to access and leverage these technologies. In this paper, we introduce PyTorch Fully Sharded Data Parallel...
In this tutorial we show how to build deep learning recommendation systems and resolve the associated interpretability, integrity and privacy challenges. We start with an overview of the PyTorch framework, the features that it offers and a brief review of the evolution of recommendation models. We delineate their typical components and build a proxy deep learning recommendation model (DLRM) in PyTorch. Then, we discuss how to interpret recommendation system results as well as address the corresponding quality challenges.