- Natural Language Processing Techniques
- Topic Modeling
- Speech Recognition and Synthesis
- Higher Education and Teaching Methods
- Information and Cyber Security
- Numerical methods for differential equations
- Differential Equations and Numerical Methods
- Web and Library Services
- Data Mining Algorithms and Applications
- Multimodal Machine Learning Applications
- Vibration and Dynamic Analysis
- Bayesian Methods and Mixture Models
- Ultrasonics and Acoustic Wave Propagation
- Statistical Methods and Bayesian Inference
- Speech and dialogue systems
- Library Science and Information Literacy
- Thermography and Photoacoustic Techniques
- Advanced Manufacturing and Logistics Optimization
- Robotics and Sensor-Based Localization
- AI and Big Data Applications
- Information Retrieval and Search Behavior
- Zebrafish Biomedical Research Applications
- Evaluation and Optimization Models
- Globalization, Economics, and Policies
- Human Pose and Action Recognition
Renmin University of China
2024-2025
Harbin Engineering University
2024
Huazhong University of Science and Technology
2020-2023
Harbin Institute of Technology
2022
Nankai University
2022
China Academy of Space Technology
2022
Shenzhen Institutes of Advanced Technology
2021
Chinese Academy of Sciences
2021
Beijing Normal University
2008-2020
University of Electronic Science and Technology of China
2016-2020
Information Retrieval (IR) systems are crucial tools for users to access information, and have long been dominated by traditional methods relying on similarity matching. With the advancement of pre-trained language models, generative information retrieval (GenIR) has emerged as a novel paradigm, attracting increasing attention. Based on the form of information provided to users, current research in GenIR can be categorized into two aspects: (1) Generative Document Retrieval (GR) leverages the generative model’s parameters for memorizing...
Large reasoning models (LRMs) like OpenAI-o1 have demonstrated impressive long stepwise reasoning capabilities through large-scale reinforcement learning. However, their extended reasoning processes often suffer from knowledge insufficiency, leading to frequent uncertainties and potential errors. To address this limitation, we introduce Search-o1, a framework that enhances LRMs with an agentic retrieval-augmented generation (RAG) mechanism and a Reason-in-Documents module for refining retrieved documents....
Generative information retrieval, encompassing the two major tasks of Generative Document Retrieval (GDR) and Grounded Answer Generation (GAR), has gained significant attention in natural language processing. Existing methods for GDR and GAR rely on separate retrieval and reader modules, which hinder simultaneous optimization. To overcome this, we present UniGen, a Unified Generative framework for retrieval and question answering that integrates both tasks into a single generative model leveraging the capabilities of large language models. UniGen employs a shared...
This paper proposes a novel architecture, Cross Attention Augmented Transducer (CAAT), for simultaneous translation. The framework aims to jointly optimize the policy and translation models. To effectively consider all possible READ-WRITE action paths, we adapt the online automatic speech recognition (ASR) model, RNN-T, but remove the strong monotonic constraint, which is critical for the translation task to consider reordering. To make CAAT work, we introduce a latency loss whose expectation can be optimized by a forward-backward...
Generative document retrieval is a novel retrieval framework, which represents documents as identifiers (DocID) and retrieves documents by generating DocIDs. It has the advantage of end-to-end optimization over traditional retrieval methods and has attracted much research interest. Nonetheless, the development of efficient and precise DocIDs for document representation remains a pertinent issue within the field. Existing methods for designing DocIDs tend to consider only the relevance of DocIDs to the corresponding documents, while neglecting the ability of DocIDs to distinguish documents from similar ones, which is crucial...
Weitai Zhang, Zhongyi Ye, Haitao Tang, Xiaoxi Li, Xinyuan Zhou, Jing Yang, Jianwei Cui, Pan Deng, Mohan Shi, Yifan Song, Dan Liu, Junhua Liu, Lirong Dai. Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022). 2022.
Systematic sampling is frequently used in surveys, because of its ease of implementation and its design efficiency. An important drawback of systematic sampling, however, is that no direct estimator of the design variance is available. We describe a new estimator of the model‐based expectation of the design variance, under a non‐parametric model for the population. The model is sufficiently flexible that it can be expected to hold at least approximately in many situations with continuous auxiliary variables observed at the population level. We prove the consistency of both...
This paper describes USTC-NELSLIP’s submissions to the IWSLT2021 Simultaneous Speech Translation task. We proposed a novel simultaneous translation model, Cross-Attention Augmented Transducer (CAAT), which extends the conventional RNN-T to sequence-to-sequence tasks without monotonic constraints, e.g., simultaneous translation. Experiments on speech-to-text (S2T) and text-to-text (T2T) simultaneous translation tasks show that CAAT achieves better quality-latency trade-offs compared to wait-k, one of the previous state-of-the-art approaches. Based...
Multilayer materials with metal-metal bonded structures have been widely applied in the aviation, aerospace, and nuclear industries. Disbonds are prone to exist in the lead-steel structure, which degrades the load capacity and mechanical behavior. Thermography nondestructive testing is a potential candidate for sub-layer defect detection. However, the structure cannot withstand instantaneous over-heating, which will lead to subsequent damage or the generation of more unpredictable disbonds. In addition, detection...
Information Retrieval (IR) systems are crucial tools for users to access information, widely applied in scenarios like search engines, question answering, and recommendation systems. Traditional IR methods, based on similarity matching to return ranked lists of documents, have been reliable means of information acquisition, dominating the IR field for years. With the advancement of pre-trained language models, generative information retrieval (GenIR) has emerged as a novel paradigm, gaining increasing attention in recent...
Large language models (LLMs) exhibit remarkable generative capabilities but often suffer from hallucinations. Retrieval-augmented generation (RAG) offers an effective solution by incorporating external knowledge, but existing methods still face several limitations: additional deployment costs of separate retrievers, redundant input tokens from retrieved text chunks, and the lack of joint optimization of retrieval and generation. To address these issues, we propose RetroLLM, a unified framework that...
Developing a BIM-Based Integrated Model for CAD to CAM Production Automation. Xiaoxi Li, Ahmed Qureshi and Mohamed Al-Hussein. Pages 51-58 (2017 Proceedings of the 34th ISARC, Taipei, Taiwan, ISBN 978-80-263-1371-7, ISSN 2413-5844). Abstract: Modular construction has gained momentum in North America as an emerging construction paradigm in recent years. Modular buildings are assembled from components that are prefabricated in manufacturing plants and transported to the site for assembly. The current manual-based approach to modular...
Multi-layer metal-metal bonding structures are widely applied in the aviation, aerospace, and nuclear industrial fields. Debonding defects attract considerable attention in the nondestructive testing and evaluation community. This paper presents a feasibility study for inner debonding defect detection of the lead-steel structure by using eddy current pulsed thermography. A numerical study has been conducted, and validation studies on the detectability and the sensitivity curve versus the effect of excitation heating time are reported. According to numerical...
The advent of large language models (LLMs) has showcased their efficacy across various domains, yet they often hallucinate, especially in knowledge-intensive tasks that require external knowledge sources. To improve the factual accuracy of language models, retrieval-augmented generation (RAG) has emerged as a popular solution. However, traditional retrieval modules rely on large-scale document indexes, which can be disconnected from generative tasks. Through the generative retrieval (GR) approach, language models can achieve superior performance by...
Liquid metal reactors (LMR), with significant advantages in high safety and good economic benefits, have obvious and broad application prospects in fourth-generation nuclear power systems. The spiral tube steam generator is one of the most important pieces of equipment in the LMR, and is composed of spiral heat transfer tubes, inner and outer cylinders, feedwater headers, and other structures. Due to their compact structure and high heat transfer efficiency, they are conducive to miniaturization and have been widely used in different reactors in various countries....
In obtaining a Digital Elevation Model (DEM), most methods of acquiring the tie points generate them automatically by software and then screen them manually, which is time-consuming and labor-intensive, and the accuracy cannot be guaranteed. Therefore, this paper proposes an automatic stereo matching method combining Speeded Up Robust Features (SURF) and the Rational Function Model (RFM) to reconstruct the 3D model of remote sensing generalized image pairs. There are two main tasks: first, apply the SURF algorithm to the images, screen the tie points at the same...
In order to meet the requirements of data link cooperative operation, a terminal based on Tianlian relay satellite and Beidou short message dual-mode communications is proposed. In the Tianlian relay satellite mode, the methods of random insertion of empty frames and fast antenna switching are adopted to ensure the integrity of image data, and the antenna switching time is no more than 10 µs at a high data rate. The Beidou short message mode adds the function of broadcast transmission on the basis of traditional point-to-point communication, and can automatically identify the type of user machine, which is not only compatible...