Anshuman Tripathi

ORCID: 0000-0002-4902-3719
Research Areas
  • Speech Recognition and Synthesis
  • Multilevel Inverters and Converters
  • Advanced DC-DC Converters
  • Speech and Audio Processing
  • Music and Audio Processing
  • Microgrid Control and Optimization
  • Silicon Carbide Semiconductor Technologies
  • Electric Motor Design and Analysis
  • Induction Heating and Inverter Technology
  • Sensorless Control of Electric Motors
  • Advanced Battery Technologies Research
  • Topic Modeling
  • Autonomous Vehicle Technology and Safety
  • Magnetic Properties and Applications
  • Natural Language Processing Techniques
  • Electric Vehicles and Infrastructure
  • Advanced Manufacturing and Logistics Optimization
  • Islanding Detection in Power Systems
  • Wind Turbine Control Systems
  • Electric and Hybrid Vehicle Technologies
  • Aerospace and Aviation Technology
  • Real-Time Simulation and Control Systems
  • Indoor and Outdoor Localization Technologies
  • Power Systems and Renewable Energy
  • Advanced Neural Network Applications

Nanyang Technological University
2016-2025

Google (United States)
2018-2024

Technological Institute of the Philippines
2021

National Institute of Technology Raipur
2021

Indian Institute of Technology Bombay
2017

Indian Institute of Technology Kharagpur
2017

National University of Singapore
2004-2006

General Electric (United States)
2006

In this paper we present an end-to-end speech recognition model with Transformer encoders that can be used in a streaming speech recognition system. Transformer computation blocks based on self-attention are used to encode both audio and label sequences independently. The activations from both audio and label encoders are combined with a feed-forward layer to compute a probability distribution over the label space for every combination of acoustic frame position and label history. This is similar to the Recurrent Neural Network Transducer (RNN-T) model, which uses RNNs for information encoding...

10.1109/icassp40776.2020.9053896 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

Current state-of-the-art automatic speech recognition systems are trained to work in specific 'domains', defined based on factors like application, sampling rate and codec. When such recognizers are used in conditions that do not match the training domain, performance significantly drops. This work explores the idea of building a single domain-invariant model for varied use-cases by combining large scale training data from multiple application domains. Our final system is trained using 162,000 hours of speech. Additionally,...

10.1109/slt.2018.8639610 article EN 2018 IEEE Spoken Language Technology Workshop (SLT) 2018-12-01

In this paper, we present a novel speaker diarization system for streaming on-device applications. In this system, we use a transformer transducer to detect the speaker turns, represent each speaker turn by a speaker embedding, then cluster these embeddings with constraints from the detected speaker turns. Compared with conventional clustering-based diarization systems, our system largely reduces the computational cost of clustering due to the sparsity of speaker turns. Unlike other supervised speaker diarization systems which require annotations of time-stamped speaker labels for training, our system only requires including speaker turn tokens during...

10.1109/icassp43922.2022.9746531 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Voice activity detection (VAD) is the task of predicting which parts of an utterance contain speech versus background noise. It is an important first step to determine which samples to send to the decoder and when to close the microphone. The long short-term memory neural network (LSTM) is a popular architecture for sequential modeling of acoustic signals, and has been successfully used in several VAD applications. However, it has been observed that LSTMs suffer from state saturation problems (i.e., in voice dictation tasks), and thus requires...

10.1109/icassp.2018.8461921 article EN 2018-04-01

In this paper we document our experiences with developing speech recognition for medical transcription, a system that automatically transcribes doctor-patient conversations. Towards this goal, we built a system along two different methodological lines: a Connectionist Temporal Classification (CTC) phoneme based model and a Listen Attend and Spell (LAS) grapheme based model. To train these models we used a corpus of anonymized conversations representing approximately 14,000 hours of speech. Because of noisy transcripts and alignments in...

10.21437/interspeech.2018-40 article EN Interspeech 2018 2018-08-28

At high angular velocity, the induction motor is operated in the field weakening range due to the voltage limit of the inverter. Field oriented vector control (FOC) is unsuitable for this operation due to coupling, non-linearities, and saturation of the linear current controllers. A proposed direct torque control space vector modulation (DTC–SVM) scheme using SVM does not use coordinate transforms or current controllers to achieve DTC. Control of the stator flux allows dynamic change of torque in all regions, including field weakening with six-step operation. This paper...

10.1109/tpel.2006.876823 article EN IEEE Transactions on Power Electronics 2006-07-01

Recurrent Neural Network Transducer (RNNT) is an end-to-end model which transduces discrete input sequences to output sequences by learning alignments between the sequences. In speech recognition tasks we generally have a strictly monotonic alignment between time frames and the label sequence. However, the standard RNNT loss does not enforce this constraint. This can cause some anomalies in alignments, such as outputting a sequence of labels at a single time frame. There is also no bound on the decoding time steps. To address these problems,...

10.1109/asru46091.2019.9003822 article EN 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019-12-01

To achieve maximum profit by dispatching a battery storage system in an arbitrage operation, multiple factors must be considered. While revenue from the application is determined by the time variability of the electricity cost, the profit will be lowered by costs resulting from energy efficiency losses, as well as by battery degradation. In this paper, an optimal dispatch strategy is proposed for storage systems trading on markets. The strategy is based on a computationally-efficient implementation of a mixed-integer linear programming method, with a cost function that...

10.3390/en12060999 article EN cc-by Energies 2019-03-14

The rapid increase of renewable energy sources has made coordinated control of the distributed and intermittent generation units a more demanded task. Matching demand and supply is particularly challenging in islanded microgrids. In this study, we have demonstrated a mixed-integer quadratic programming (MIQP) method to achieve efficient use of energy sources within an islanded microgrid. A unique objective function involving fuel consumption of the diesel generator, degradation of the lithium-ion battery storage system, carbon emissions, load...

10.1002/er.4512 article EN International Journal of Energy Research 2019-06-03

In this paper we present an end-to-end speech recognition system that can recognize single-channel speech where multiple talkers speak at the same time (overlapping speech) by using a neural network model based on the Recurrent Neural Network Transducer (RNN-T) architecture. We augment the conventional RNN-T architecture by including a masking model for separation of encoded audio features, and multiple label encoders to encode transcripts from different speakers. use L2 loss prevent align wrong speakers' audio, speaker...

10.1109/icassp40776.2020.9054328 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

In this paper we present a Transformer-Transducer model architecture and a training technique to unify streaming and non-streaming speech recognition models into one model. The model is composed of a stack of transformer layers for audio encoding with no lookahead or right context and an additional stack of transformer layers on top trained with variable right context. At inference time, the context length can be changed to trade off latency and accuracy. We also show that the model can run in a Y-model architecture, running in parallel in low and high latency modes. This allows us to have streaming results with limited latency and delayed results with large...

10.48550/arxiv.2010.03192 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Dispatch of battery storage systems for stationary grid applications is a topic of increasing interest: due to the volatility of the power system's energy supply relying on variable renewable sources, one foresees a rising demand and market potential for both short- and long-term fluctuation smoothing via energy storage. While the revenue attainable via arbitrage trading may yet surpass the steadily declining cost of lithium-ion battery storage systems, profitability will be constrained directly by the limited lifetime of the battery system and lowered by dissipation losses...

10.1109/access.2020.3035504 article EN cc-by IEEE Access 2020-01-01

A medium/high-frequency transformer is an integral part of many power conversion systems. Switching at higher frequency results in lesser volume of magnetics but induces higher winding loss density, on account of increased eddy current effects in conductors. Thus winding resistance is a key parameter to characterize the performance of a medium-frequency (MF) high-power (HP) transformer. In this paper, 10 kW, 0.5/2.5 kV, 1 kHz transformer designs are presented employing different winding dispositions (normal and interleaved) and conductor geometries...

10.1109/acept.2017.8168612 article EN 2017-10-01

Medium and high frequency, high power transformers play an important role in footprint reduction along with their functions of galvanic isolation and voltage transformation in all power converters typically used in traction systems, offshore wind plant converters, and solid state transformer based distribution system grids. This state of the art report analyses the various materials and design tradeoffs that govern the electromagnetic behavior and loss mechanisms in medium and high frequency applications. Typical winding and core geometries have been...

10.1109/acept.2016.7811550 article EN 2016-10-01

In this paper we present an end-to-end speech recognition model with Transformer encoders that can be used in a streaming speech recognition system. Transformer computation blocks based on self-attention are used to encode both audio and label sequences independently. The activations from both audio and label encoders are combined with a feed-forward layer to compute a probability distribution over the label space for every combination of acoustic frame position and label history. This is similar to the Recurrent Neural Network Transducer (RNN-T) model, which uses RNNs for information encoding...

10.48550/arxiv.2002.02562 preprint EN other-oa arXiv (Cornell University) 2020-01-01

This paper introduces the contrastive siamese (c-siam) network, an architecture for leveraging unlabeled acoustic data in speech recognition. c-siam is the first network that extracts high-level linguistic information from speech by matching outputs of two identical transformer encoders. It contains augmented and target branches which are trained by: (1) masking inputs and matching outputs with a contrastive loss, (2) incorporating a stop gradient operation on the target branch, (3) using an extra learnable transformation, (4) introducing new temporal...

10.1109/icassp43922.2022.9747355 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Offshore wind power has inspired the fields of high voltage direct current (HVdc) for the advantages of power transmission over long distance. Hefty wind generators are making advanced multilevel rectifiers and parallel operation of rectifiers a popular choice of research with the aim to accommodate higher power. Issues of reliability and complexity of control are associated with active power electronic devices at such high power. This paper focuses on the novelty of three-phase multilevel diode rectifiers, each with auxiliary bidirectional switching blocks (BSB) to improve their performance. For...

10.1109/tia.2018.2870820 article EN IEEE Transactions on Industry Applications 2018-09-17

In this paper, the effect of capacitor voltage ripple on current quality in a cascaded H-bridge (CHB) low-capacitance static compensator (LC-StatCom) with symmetrical I-V characteristics is investigated. Total harmonic distortion of the synthesized ac current in an LC-StatCom converter operating with different voltage ripples is evaluated for both inductive and capacitive modes. Simulation-based analyses of a 350-VA three-cell CHB LC-StatCom system are provided to demonstrate the LC-StatCom's effectiveness to provide high current quality compared to a conventional StatCom.

10.1109/acept.2017.8168597 article EN 2017-10-01
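
The total harmonic distortion mentioned in the entry above is conventionally defined as the RMS of the harmonic components divided by the RMS of the fundamental. A minimal sketch of that textbook formula follows; the function name and the example spectrum are illustrative assumptions, not values taken from the paper.

```python
import math


def total_harmonic_distortion(harmonic_rms):
    """Return THD = sqrt(sum of I_h^2 for h >= 2) / I_1.

    harmonic_rms[0] is the RMS amplitude of the fundamental;
    the remaining entries are the RMS amplitudes of the harmonics.
    """
    fundamental = harmonic_rms[0]
    if fundamental <= 0:
        raise ValueError("fundamental RMS amplitude must be positive")
    distortion = math.sqrt(sum(a * a for a in harmonic_rms[1:]))
    return distortion / fundamental


# Hypothetical current spectrum (assumed values, in amperes):
# 10 A fundamental with 0.3 A and 0.4 A harmonic components.
thd = total_harmonic_distortion([10.0, 0.3, 0.4])
print(f"THD = {thd:.1%}")
```

For a real converter waveform the harmonic amplitudes would come from an FFT of the simulated or measured current, which this sketch does not cover.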

A medium/high-power conversion system, using a power electronic (PE) converter in conjunction with a medium/high-frequency transformer, has many desirable effects suitably oriented for modern power system architecture. Switching at high frequency results in lesser volume of magnetics but induces higher loss density. Thus the design and characterization of a medium-frequency (MF) high-power (HP) transformer has significant ramification on its performance and application. Thermal management of the MF HP transformer is one of the key aspects...

10.1109/iecon.2017.8216372 article EN IECON 2017 - 43rd Annual Conference of the IEEE Industrial Electronics Society 2017-10-01

A medium/high-power conversion system using a power electronic (PE) converter along with a medium/high-frequency transformer offers many desirable features that are beneficial for present-day power system topologies. Leakage inductance is identified to be one of the key parameters to characterize the performance of such a medium-frequency (MF) high-power (HP) transformer. In this paper, an existing analytical method to calculate the leakage inductance of a concentric winding is further refined by employing the mean turn length of each individual layer and...

10.1109/spec.2017.8333556 article EN 2017-12-01

Increasing power consumption requires engineers to find better control techniques to increase energy efficiency. Advancements in technology allow us to use more complex algorithms to pursue this goal. Load frequency control (LFC) is one of the vital points of the power system, and a state of the art method must be used to ensure the power quality of the grid. In this work, a decentralized model predictive controller (MPC) with generation rate constraints is used to handle the LFC problem of a four area interconnected power system. It is seen that the MPC successfully achieved the given...

10.1109/acept.2016.7811530 article EN 2016-10-01

Finding a balance among multiple design objectives of a medium/high-frequency (MF/HF) high-power (HP) transformer is best addressed by employing an optimization technique. In this paper, MF HP transformer design is formulated as a multi-variable optimization problem, where efficiency, power density and temperature rise are chosen as the design objectives. Total loss, core volume and maximum temperature rise are modeled as the respective cost functions and amalgamated using a weighted-sum approach to derive the objective function. It is minimized using the steepest descent method. Being gradient-based...

10.1109/apec.2018.8341259 article EN 2018 IEEE Applied Power Electronics Conference and Exposition (APEC) 2018-03-01
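
The weighted-sum and steepest descent combination named in the entry above can be sketched generically. The three cost functions, the weights, the step size, and all function names below are hypothetical placeholders chosen so the example runs; they are not the transformer loss, volume, or temperature models from the paper.

```python
def weighted_sum(costs, weights):
    """Combine several cost functions into one scalar objective."""
    def objective(x):
        return sum(w * f(x) for w, f in zip(weights, costs))
    return objective


def numerical_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def steepest_descent(f, x0, step=0.1, tol=1e-8, max_iter=10_000):
    """Fixed-step steepest descent: move against the gradient until it is small."""
    x = list(x0)
    for _ in range(max_iter):
        g = numerical_gradient(f, x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical stand-ins for total loss, core volume and temperature rise
# over two abstract design variables (assumed quadratic bowls).
def loss(x):
    return (x[0] - 1.0) ** 2 + x[1] ** 2


def volume(x):
    return x[0] ** 2 + (x[1] - 2.0) ** 2


def temp_rise(x):
    return (x[0] - 3.0) ** 2 + (x[1] - 1.0) ** 2


objective = weighted_sum([loss, volume, temp_rise], [0.5, 0.3, 0.2])
x_opt = steepest_descent(objective, [0.0, 0.0])
print([round(v, 4) for v in x_opt])
```

With these placeholder quadratics the minimizer is the weight-averaged center of the three bowls, so changing the weights shifts the compromise between objectives. A fixed step is used here for brevity; a line search would be the more usual choice for less well-behaved cost functions.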