Li Dong

ORCID: 0000-0003-3083-7170
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Multimodal Machine Learning Applications
  • Advanced Text Analysis Techniques
  • Advanced Wireless Network Optimization
  • Advanced MIMO Systems Optimization
  • Wireless Networks and Protocols
  • Complex Network Analysis Techniques
  • Text and Document Classification Technologies
  • Sensor Technology and Measurement Systems
  • Advanced Neural Network Applications
  • Advanced Electrical Measurement Techniques
  • Domain Adaptation and Few-Shot Learning
  • Neural Networks and Applications
  • Ferroelectric and Negative Capacitance Devices
  • Video Analysis and Summarization
  • Seismic Waves and Analysis
  • Advanced Graph Neural Networks
  • E-commerce and Technology Innovations
  • Higher Education and Teaching Methods
  • Geophysical Methods and Applications
  • Traffic Prediction and Management Techniques
  • Web Data Mining and Analysis
  • Caching and Content Delivery
  • Time Series Analysis and Forecasting

Tianjin University
2020-2025

William & Mary
2025

Microsoft Research Asia (China)
2024

Peking University
2022

Microsoft Research (India)
2022

University of Edinburgh
2016-2019

Peng Cheng Laboratory
2019

Beijing Information Science & Technology University
2010-2013

Dali University
2012

China Electronic Product Reliability and Environmental Test Institute
2010

Recent advances in data-to-text generation have led to the use of large-scale datasets and neural network models which are trained end-to-end, without explicitly modeling what to say and in what order. In this work, we present a neural network architecture which incorporates content selection and planning without sacrificing end-to-end training. We decompose the generation task into two stages. Given a corpus of data records (paired with descriptive documents), we first generate a content plan highlighting which information should be mentioned and in which order, and then generate the document while taking...

10.1609/aaai.v33i01.33016908 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17
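
A minimal sketch of the two-stage decomposition described above: a content-selection-and-planning step followed by surface realization. The `Record` type, the salience heuristic, and the template-based generator are illustrative stand-ins, not the paper's learned pointer-network planner and neural decoder.

```python
from dataclasses import dataclass

@dataclass
class Record:
    entity: str
    field: str
    value: str

def content_plan(records, salience, k=3):
    """Stage 1: select the k most salient records, in the order to mention them."""
    ranked = sorted(records, key=lambda r: salience.get(r.field, 0.0), reverse=True)
    return ranked[:k]

def generate(plan):
    """Stage 2: realize the plan as text (a learned attention decoder in the paper)."""
    return " ".join(f"{r.entity}'s {r.field} is {r.value}." for r in plan)

records = [
    Record("Team A", "points", "102"),
    Record("Team A", "rebounds", "40"),
    Record("Team B", "points", "98"),
]
salience = {"points": 1.0, "rebounds": 0.3}   # invented salience scores
print(generate(content_plan(records, salience)))
```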

In this paper we address the question of how to render sequence-level networks better at handling structured input. We propose a machine reading simulator which processes text incrementally from left to right and performs shallow reasoning with memory and attention. The reader extends the Long Short-Term Memory architecture with a memory network in place of a single memory cell. This enables adaptive memory usage during recurrence with neural attention, offering a way to weakly induce relations among tokens. The system is initially designed to process...

10.48550/arxiv.1601.06733 preprint EN other-oa arXiv (Cornell University) 2016-01-01
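
A toy sketch of the intra-attention idea above: keep every previous hidden state in memory and attend over them at each step, instead of carrying a single cell state. The scoring function and dimensions are simplified assumptions, not the paper's exact parameterization.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def lstmn_step(x_t, H, W):
    """Attend over past hidden states H (a list of (d,) vectors) to build an
    adaptive memory summary used in the recurrence."""
    if len(H) == 0:
        return np.tanh(W @ x_t)
    M = np.stack(H)                  # (t-1, d) memory of previous states
    scores = M @ (W @ x_t)           # one score per past token
    alpha = softmax(scores)          # attention weights: weakly induced relations
    h_tilde = alpha @ M              # adaptive summary of the memory
    return np.tanh(W @ x_t + h_tilde)

d = 8
rng = np.random.default_rng(0)
W = rng.normal(size=(d, d)) * 0.1
H = []
for x_t in rng.normal(size=(5, d)):  # a 5-token "sentence"
    H.append(lstmn_step(x_t, H, W))
```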

Performing cellular long term evolution (LTE) communications in unlicensed spectrum using licensed assisted access LTE (LTE-LAA) is a promising approach to overcome wireless spectrum scarcity. However, to reap the benefits of LTE-LAA, a fair coexistence mechanism with other incumbent WiFi deployments is required. In this paper, a novel deep learning approach is proposed for modeling the resource allocation problem of LTE-LAA small base stations (SBSs). The proposed approach enables multiple SBSs to proactively perform dynamic channel selection,...

10.1109/twc.2018.2829773 article EN publisher-specific-oa IEEE Transactions on Wireless Communications 2018-05-15
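
As a loose illustration of learned channel selection, here is a tabular bandit-style sketch for a single SBS; the paper itself trains a deep (LSTM-based) network over traffic histories, and the WiFi occupancy probabilities and reward below are invented assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_channels = 4
wifi_busy_prob = np.array([0.8, 0.5, 0.3, 0.6])   # assumed WiFi occupancy per channel
Q = np.zeros(n_channels)                           # value estimate per channel
eps, lr = 0.1, 0.05

for step in range(5000):
    # Epsilon-greedy channel selection.
    a = rng.integers(n_channels) if rng.random() < eps else int(Q.argmax())
    # Reward 1 if the channel is free of WiFi traffic (fair coexistence), else 0.
    r = float(rng.random() > wifi_busy_prob[a])
    Q[a] += lr * (r - Q[a])

print("Estimated idle rates:", Q.round(2))   # should track 1 - wifi_busy_prob
```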

In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language models, simultaneously achieving training parallelism, low-cost inference, and good performance. We theoretically derive the connection between recurrence and attention. We then propose the retention mechanism for sequence modeling, which supports three computation paradigms, i.e., parallel, recurrent, and chunkwise recurrent. Specifically, the parallel representation allows for training parallelism. The recurrent representation enables $O(1)$...

10.48550/arxiv.2307.08621 preprint EN other-oa arXiv (Cornell University) 2023-01-01
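
The parallel/recurrent duality mentioned above can be checked numerically. Below is a simplified single-head retention, omitting the paper's rotation (xpos) and scaling terms: the parallel form O = (QKᵀ ⊙ D)V with decay matrix D[n, m] = γ^(n−m) for n ≥ m, and the equivalent O(1)-per-token recurrent form Sₙ = γ·Sₙ₋₁ + KₙᵀVₙ, Oₙ = QₙSₙ.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, gamma = 6, 4, 0.9
Q, K, V = rng.normal(size=(3, T, d))

# Parallel representation (training-time parallelism).
n, m = np.arange(T)[:, None], np.arange(T)[None, :]
D = np.where(n >= m, gamma ** (n - m), 0.0)       # causal decay mask
O_parallel = (Q @ K.T * D) @ V

# Recurrent representation (low-cost inference).
S = np.zeros((d, d))
O_recurrent = np.zeros((T, d))
for t in range(T):
    S = gamma * S + np.outer(K[t], V[t])          # constant-size state update
    O_recurrent[t] = Q[t] @ S

assert np.allclose(O_parallel, O_recurrent)       # the two forms agree
```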

In this paper, we propose a simple yet effective method to stabilize extremely deep Transformers. Specifically, we introduce a new normalization function (DeepNorm) to modify the residual connection in the Transformer, accompanied by a theoretically derived initialization. In-depth theoretical analysis shows that model updates can be bounded in a stable way. The proposed method combines the best of two worlds, i.e.,...

10.1109/tpami.2024.3386927 article EN cc-by IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-04-10
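
A sketch of the DeepNorm residual, x_{l+1} = LN(α·x + G(x)), with certain sublayer weights down-scaled by β at initialization. The constants below follow the paper's decoder-only setting (α = (2N)^{1/4}, β = (8N)^{−1/4} for N layers); the feed-forward sublayer stands in for any attention or FFN sublayer.

```python
import torch
from torch import nn

class DeepNormBlock(nn.Module):
    def __init__(self, d_model: int, n_layers: int):
        super().__init__()
        self.alpha = (2 * n_layers) ** 0.25        # residual up-weighting
        beta = (8 * n_layers) ** -0.25             # init down-scaling
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm = nn.LayerNorm(d_model)
        for lin in (self.ffn[0], self.ffn[2]):     # scale init by beta
            nn.init.xavier_normal_(lin.weight, gain=beta)

    def forward(self, x):
        # DeepNorm: x_{l+1} = LN(alpha * x + G(x))
        return self.norm(self.alpha * x + self.ffn(x))

block = DeepNormBlock(d_model=64, n_layers=1000)   # "extremely deep" regime
y = block(torch.randn(2, 16, 64))
```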

Semantic parsing aims at mapping natural language to machine-interpretable meaning representations. Traditional approaches rely on high-quality lexicons, manually-built templates, and linguistic features which are either domain- or representation-specific. In this paper we present a general method based on an attention-enhanced encoder-decoder model. We encode input utterances into vector representations, and generate their logical forms by conditioning the output sequences or trees on the encoding vectors...

10.48550/arxiv.1601.01280 preprint EN other-oa arXiv (Cornell University) 2016-01-01
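
A single attention-conditioned decoding step, sketching how a logical-form token distribution can be produced from the encoded utterance vectors. The dot-product scoring and single output projection are simplifications, not the paper's exact Seq2Seq/Seq2Tree decoders.

```python
import torch
import torch.nn.functional as F

def attend_and_predict(dec_h, enc_H, out_proj):
    """dec_h: (d,) decoder state; enc_H: (T, d) encoded utterance tokens."""
    alpha = F.softmax(enc_H @ dec_h, dim=0)        # attention over the utterance
    context = alpha @ enc_H                        # (d,) weighted summary
    logits = out_proj(torch.cat([dec_h, context])) # scores over logical-form tokens
    return F.log_softmax(logits, dim=-1)

d, vocab = 32, 100
out_proj = torch.nn.Linear(2 * d, vocab)
log_probs = attend_and_predict(torch.randn(d), torch.randn(7, d), out_proj)
```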

Machine reading comprehension with unanswerable questions is a challenging task. In this work, we propose a data augmentation technique that automatically generates relevant unanswerable questions according to an answerable question paired with its corresponding paragraph that contains the answer. We introduce a pair-to-sequence model for question generation, which effectively captures the interactions between the question and the paragraph. We also present a way to construct training data for our question generation models by leveraging an existing dataset. Experimental results show...

10.18653/v1/p19-1415 article EN cc-by Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019-01-01

Semantic parsing aims at mapping natural language utterances into structured meaning representations. In this work, we propose a structure-aware neural architecture which decomposes the semantic parsing process into two stages. Given an input utterance, we first generate a rough sketch of its meaning, where low-level information (such as variable names and arguments) is glossed over. Then, we fill in the missing details by conditioning on the natural language input and the sketch itself. Experimental results on four datasets characteristic of different domains...

10.48550/arxiv.1805.04793 preprint EN other-oa arXiv (Cornell University) 2018-01-01
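
A worked toy example of the two stages described above; the λ-calculus form and the placeholder notation (@1, @2) are invented here for illustration, not taken from the paper's grammars.

```python
utterance = "flights from Dallas to Boston"

# Stage 1: a coarse meaning sketch with low-level details glossed over.
sketch = "(lambda $0 (and (flight $0) (from $0 @1) (to $0 @2)))"

# Stage 2: fill in the details, conditioned on the utterance and the sketch.
fillers = {"@1": "dallas:ci", "@2": "boston:ci"}
logical_form = sketch
for slot, value in fillers.items():
    logical_form = logical_form.replace(slot, value)

print(logical_form)
# (lambda $0 (and (flight $0) (from $0 dallas:ci) (to $0 boston:ci)))
```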

The Mixture-of-Experts (MoE) technique can scale up the model size of Transformers with an affordable computational overhead. We point out that existing learning-to-route MoE methods suffer from the routing fluctuation issue, i.e., the target expert of the same input may change along with training, but only one expert will be activated for the input during inference. The routing fluctuation tends to harm sample efficiency because the same input updates different experts but only one is finally used. In this paper, we propose StableMoE with two training stages to address the routing fluctuation problem. In the first...

10.18653/v1/2022.acl-long.489 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01
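
A sketch of the second stage's key property: once the lightweight router learned and distilled in stage one is frozen, each token's expert assignment can no longer fluctuate while the experts continue training. The router and expert shapes below are illustrative assumptions.

```python
import torch
from torch import nn

d_model, n_experts = 32, 4
router = nn.Linear(d_model, n_experts, bias=False)
router.requires_grad_(False)          # stage 2: routing strategy is frozen
experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_experts))

def moe_forward(x):                   # x: (tokens, d_model)
    expert_ids = router(x).argmax(dim=-1)   # stable top-1 assignment per token
    y = torch.empty_like(x)
    for e, expert in enumerate(experts):
        mask = expert_ids == e
        if mask.any():
            y[mask] = expert(x[mask])       # only the assigned expert is updated
    return y

y = moe_forward(torch.randn(10, d_model))
```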

10.1109/icassp49660.2025.10890879 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Diversion tunnels play a critical role in water conservancy and hydropower projects. However, due to complex geological conditions, especially the influence of buried fault structures that are difficult to observe directly below the surface, construction processes often face significant challenges such as rock mass instability, seepage, and abrupt geological changes. Audio magnetotelluric (AMT) technology, a high-resolution electromagnetic exploration method, demonstrates remarkable advantages in detecting...

10.1088/1742-6596/2990/1/012003 article EN Journal of Physics Conference Series 2025-04-01

Question answering (QA) systems are sensitive to the many different ways natural language expresses the same information need. In this paper we turn to paraphrases as a means of capturing this knowledge and present a general framework which learns felicitous paraphrases for various QA tasks. Our method is trained end-to-end using question-answer pairs as a supervision signal. A question and its paraphrases serve as input to a neural scoring model which assigns higher weights to linguistic expressions most likely to yield correct answers. We evaluate our...

10.48550/arxiv.1708.06022 preprint EN other-oa arXiv (Cornell University) 2017-01-01
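
One concrete way to read the framework is as a mixture, p(a|q) = Σ_{q′} p(q′|q) · p(a|q′): a softmax over paraphrase scores weights each paraphrase's answer distribution, so only question-answer pairs are needed as supervision. The tensors below are random stand-ins for the neural scoring and QA models.

```python
import torch
import torch.nn.functional as F

def answer_distribution(para_scores, answer_logits):
    """para_scores: (P,) one score per paraphrase of the question;
    answer_logits: (P, A) per-paraphrase scores over candidate answers."""
    weights = F.softmax(para_scores, dim=0)            # felicity of each paraphrase
    return weights @ F.softmax(answer_logits, dim=-1)  # marginal p(a|q)

P, A = 3, 5
p_answer = answer_distribution(torch.randn(P), torch.randn(P, A))
# Training maximizes log p_answer[gold], so paraphrase weights are learned
# end-to-end without any paraphrase-level labels.
```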

Language model pre-training, such as BERT, has achieved remarkable results in many NLP tasks. However, it is unclear why the pre-training-then-fine-tuning paradigm can improve performance and generalization capability across different tasks. In this paper, we propose to visualize the loss landscapes and optimization trajectories of fine-tuning BERT on specific datasets. First, we find that pre-training reaches a good initial point across downstream tasks, which leads to wider optima and easier optimization compared with training from...

10.48550/arxiv.1908.05620 preprint EN other-oa arXiv (Cornell University) 2019-01-01
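
A common way to realize such a visualization is to evaluate the loss along the line segment between the pre-trained initialization and the fine-tuned solution; the quadratic toy objective below stands in for a model's downstream training loss.

```python
import numpy as np

def loss(theta):
    return float(np.sum((theta - 3.0) ** 2))   # toy stand-in for a training loss

theta_0 = np.zeros(10)          # pre-trained initialization
theta_1 = np.full(10, 2.9)      # fine-tuned parameters
for t in np.linspace(-0.25, 1.25, 7):          # extend slightly past both ends
    theta = (1 - t) * theta_0 + t * theta_1    # 1D slice of the landscape
    print(f"t={t:+.2f}  loss={loss(theta):.2f}")
```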

A Chinese character riddle is a game in which the solution is a single character. It is closely connected with the shape, pronunciation, or meaning of Chinese characters. The riddle description (a sentence) is usually composed of phrases with rich linguistic phenomena (such as pun, simile, and metaphor), which are associated with different parts (namely radicals) of the solution character. In this paper, we propose a statistical framework to solve and generate character riddles. Specifically, we learn the alignment rules and identify metaphors between the phrases in riddles and the radicals of characters. Then,...

10.18653/v1/d16-1081 article EN cc-by Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing 2016-01-01

Service descriptions via the Web Services Description Language (WSDL) are necessary but not sufficient for service selection based on trust: we also need a means to collect nonfunctional information about services and use it to assign dynamic trust levels to service providers and implementations. In this paper, we briefly discuss the problem of service selection, give some correlation definitions, and propose an evaluation model for resolving the above question.

10.1109/iitsi.2010.175 article EN 2010-04-01

To transfer the representation capacity of large pre-trained models to lightweight models, knowledge distillation has been widely explored. However, conventional single-stage distillation methods are prone to getting stuck in task-specific knowledge, making it difficult to retain the task-agnostic knowledge which is crucial for model generalization. In this study, we propose generic-to-specific distillation (G2SD) to boost lightweight models under the assistance of large models pre-trained by masked image modeling. In generic distillation, the decoder of a small model is encouraged to align its feature predictions...

10.1109/tcsvt.2024.3393474 article EN IEEE Transactions on Circuits and Systems for Video Technology 2024-04-25
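
In spirit, the two stages translate into two losses: a generic one aligning student features with a masked-image-modeling teacher's predictions on masked patches, followed by a conventional task-logit distillation. The shapes and loss forms below are simplified assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def generic_distill_loss(student_feats, teacher_feats, mask):
    """Align features on masked patches. feats: (B, N, D); mask: (B, N) bool."""
    return F.mse_loss(student_feats[mask], teacher_feats[mask])

def specific_distill_loss(student_logits, teacher_logits, T=2.0):
    """Conventional logit distillation for the downstream task."""
    return F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * T * T

B, N, D, C = 2, 16, 32, 10
mask = torch.rand(B, N) < 0.75          # MAE-style high mask ratio (assumed)
l_generic = generic_distill_loss(torch.randn(B, N, D), torch.randn(B, N, D), mask)
l_specific = specific_distill_loss(torch.randn(B, C), torch.randn(B, C))
```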