Qi Liu

ORCID: 0000-0003-3067-8333
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Sensor and Control Systems
  • Speech Recognition and Synthesis
  • Advanced Algorithms and Applications
  • Advanced Computational Techniques and Applications
  • Music and Audio Processing
  • Speech and Audio Processing
  • Industrial Technology and Control Systems
  • Embedded Systems and FPGA Design
  • Industrial Automation and Control Systems
  • Rough Sets and Fuzzy Logic
  • Advanced Measurement and Detection Methods
  • Wireless Sensor Networks and IoT
  • Machine Learning in Bioinformatics
  • Simulation and Modeling Applications
  • Machine Learning in Materials Science
  • Multimodal Machine Learning Applications
  • Text and Document Classification Technologies
  • Quality and Safety in Healthcare
  • Human Pose and Action Recognition
  • Distributed Control Multi-Agent Systems
  • Advanced Decision-Making Techniques
  • Speech and Dialogue Systems
  • Protein Structure and Dynamics

Shanghai Jiao Tong University
2003-2024

Jiaozuo University
2024

Northeast Normal University
2024

Sun Yat-sen University
2016-2023

National University of Defense Technology
2010-2021

Hubei University of Technology
2021

Hiroshima University
2021

CRRC (China)
2021

Tianjin Chengjian University
2006-2019

Zhejiang Sci-Tech University
2018

With the progress of 3D human pose and shape estimation, state-of-the-art methods can either be robust to occlusions or obtain pixel-aligned accuracy in non-occlusion cases. However, they cannot achieve robustness and mesh-image alignment at the same time. In this work, we present NIKI (Neural Inverse Kinematics with Invertible Neural Network), which models bidirectional errors to improve accuracy. NIKI can learn from both the forward and inverse processes with invertible networks. In the inverse process, the model separates the error from the plausible manifold...

10.1109/cvpr52729.2023.01243 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

QUALIFLEX (QUALItative FLEXible multiple criteria method) is one of the useful outranking methods for analyzing decision problems because of its flexibility with respect to cardinal and ordinal information. This paper develops a probabilistic linguistic method with possibility degree comparison for dealing with group decision making problems, in which the evaluations of alternatives are expressed by hesitant fuzzy sets and the standard weights are partially known. Note that it is more reasonable to use probabilistic linguistic term sets (PLTSs) to represent the whole...

10.3233/jifs-172112 article EN Journal of Intelligent & Fuzzy Systems 2018-07-10

End-to-end (E2E) automatic speech recognition (ASR) systems directly map acoustics to words using a unified model. Previous works mostly focus on E2E training of a single model which integrates acoustic and language models into a whole. Although this benefits from sequence modeling and simplified decoding pipelines, a large amount of transcribed data is usually required, and traditional modelling techniques cannot be utilized. In this paper, a novel modular framework for E2E ASR is proposed to separately train neural models during...

10.1109/icassp.2018.8461361 preprint EN 2018-04-01

Cognitive diagnosis has been developed for decades as an effective measurement tool to evaluate human cognitive status such as ability level and knowledge mastery. It has been applied to a wide range of fields including education, sport, psychological diagnosis, etc. By providing better awareness of cognitive status, it can serve as the basis for personalized services such as well-designed medical treatment, teaching strategy and vocational training. This paper aims to provide a survey of current cognitive diagnosis models with more attention on new developments...

10.48550/arxiv.2407.05458 preprint EN arXiv (Cornell University) 2024-07-07

Computer Aided Design (CAD) is indispensable across various industries. Text-based CAD editing, which automates the modification of CAD models based on textual instructions, holds great potential but remains underexplored. Existing methods primarily focus on design variation generation or text-based CAD generation, either lacking support for text-based control or neglecting existing CAD models as constraints. We introduce CAD-Editor, the first framework for text-based CAD editing. To address the challenge of demanding triplet data with...

10.48550/arxiv.2502.03997 preprint EN arXiv (Cornell University) 2025-02-06

Deep Bidirectional Long Short-Term Memory (DBLSTM) with a Connectionist Temporal Classification (CTC) output layer has been established as one of the state-of-the-art solutions for handwriting recognition. It is well known that a DBLSTM trained by using a CTC objective function will learn both local character image dependency for character modeling and long-range contextual dependency for implicit language modeling. In this paper, we study the effects of explicit language model information on DBLSTM-CTC based handwriting recognition by comparing the performance...

10.1109/icdar.2015.7333804 article EN 2015-08-01

Large language models (LLMs), like ChatGPT, have shown some human-like cognitive abilities. For comparing these abilities of different models, several benchmarks (i.e., sets of standard test questions) from different fields (e.g., Literature, Biology and Psychology) are often adopted, and the test results under traditional metrics such as accuracy, recall and F1 are reported. However, such a way of evaluating LLMs can be inefficient and inaccurate from the cognitive science perspective. Inspired by Computerized Adaptive Testing (CAT) used in...

10.48550/arxiv.2306.10512 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Large memory consumption of neural network language models (NN LMs) prohibits their use in many resource-constrained scenarios. Hence, effective NN LM compression approaches that are independent of NN structures are of great interest. However, previous approaches usually achieve a high compression ratio at the cost of obvious performance loss. In this paper, two recently proposed quantization approaches, product quantization (PQ) and soft binarization, are effectively combined to address the issue. PQ decomposes word embedding matrices into a Cartesian...

10.1109/taslp.2020.3015659 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2020-01-01

Although great progress has been made in automatic speech recognition (ASR), significant performance degradation still exists in noisy environments. Based on our previously introduced very deep CNNs, this paper further integrates residual learning to evaluate the very deep convolutional residual network (VDCRN) under noisy conditions, which shows more powerful robustness. Then, cluster adaptive training (CAT) is developed on the VDCRN to reduce the mismatch between training and testing scenarios. Moreover, an advanced future-vector assisted LSTM-RNN LM...

10.1109/icassp.2018.8462629 article EN 2018-04-01

End-to-end (E2E) systems have played a more and more important role in automatic speech recognition (ASR) and achieved great performance. However, E2E systems recognize output word sequences directly from the input acoustic feature, and can only be trained on limited acoustic data. Extra text data is widely used to improve the results of traditional artificial neural network-hidden Markov model (ANN-HMM) hybrid systems. Involving extra text data in standard E2E ASR systems may break the E2E property during decoding. In this paper, a novel modular E2E ASR system...

10.1109/taslp.2020.3009477 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2020-01-01

Melanoma is a highly metastatic and lethal skin tumor originating from melanocyte malignancy. Circulating tumor cells (CTCs) are key endogenous biomarkers in melanoma metastasis. Melanin and blood vessels exhibit substantial disparities in their absorbance profiles at select wavelengths, a characteristic that can be harnessed to differentiate the photoacoustic signals they generate. Photoacoustic flow cytometry (PAFC), which harnesses this principle, enables monitoring of CTCs flowing in vivo. However,...

10.1063/5.0226328 article EN cc-by APL Photonics 2024-10-01

The long short-term memory language model (LSTM LM) has been widely investigated in the large vocabulary continuous speech recognition (LVCSR) task. Despite the excellent performance of the LSTM LM, its usage in resource-constrained environments, such as portable devices, is limited due to its high memory consumption. Binarized language models have been proposed to achieve significant memory reduction at the cost of performance degradation at high compression ratio. In this paper, we propose a soft binarization approach to recover the performance of the binarized LSTM LM. Experiments show that...

10.1109/asru46091.2019.9003744 article EN 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019-12-01

To improve the accuracy of automatic speech recognition, a two-pass decoding strategy is widely adopted. The first-pass model generates compact word lattices, which are utilized by the second-pass model to perform rescoring. Currently, the most popular rescoring methods are N-best rescoring and lattice rescoring with long short-term memory language models (LSTMLMs). However, these methods encounter the problem of limited search space or inconsistency between training and evaluation. In this paper, we address these problems with an end-to-end model for accurately...

10.1109/icassp40776.2020.9054109 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

Neural network language models have gained considerable popularity due to their promising performance. Distributed word embeddings are utilized to represent semantic information. However, each word is associated with a single vector in the embedding layer, preventing the model from capturing the meanings of polysemous words. In this work, we address this problem by assigning multiple fine-grained sense embeddings to each word in the embedding layers. The proposed model discriminates among the different senses of a word with an attention mechanism in an unsupervised manner....

10.1109/icassp40776.2020.9053503 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09