NFDI4DS | UHH-SEMS - Publication Details

Peng Li

ORCID: 0000-0003-1374-5979

About

Contact & Profiles

A5100726757

Research Areas

Topic Modeling
Advanced Computational Techniques and Applications
Advanced Text Analysis Techniques
Natural Language Processing Techniques
Speech Recognition and Synthesis
Web Data Mining and Analysis
Educational Technology and Assessment
Complex Network Analysis Techniques
Speech and Audio Processing
Multimodal Machine Learning Applications
Advanced Database Systems and Queries
Domain Adaptation and Few-Shot Learning
Music and Audio Processing
Data Management and Algorithms
Text and Document Classification Technologies
Machine Learning and Algorithms
3D Modeling in Geospatial Applications
Intelligent Tutoring Systems and Adaptive Learning
Sentiment Analysis and Opinion Mining
Recommender Systems and Techniques
Advanced Clustering Algorithms Research
Misinformation and Its Impacts
Caching and Content Delivery
Experimental Learning in Engineering
Wireless Signal Modulation Classification

Tsinghua University
2009-2024

The Synergetic Innovation Center for Advanced Materials
2023-2024

Beijing Academy of Artificial Intelligence
2023-2024

Shanghai Artificial Intelligence Laboratory
2023-2024

Nanjing University of Information Science and Technology
2024

East Carolina University
2024

Beijing Institute of Technology
2022

Harbin Institute of Technology
2022

Shenyang Jianzhu University
2011-2020

Ostwestfalen-Lippe University of Applied Sciences and Arts
2018

Clustering to find exemplar terms for keyphrase extraction

OPENALEX - Publications

Zhiyuan Liu Peng Li Yabin Zheng Maosong Sun

Keyphrases are widely used as a brief summary of documents. Since manual assignment is time-consuming, various unsupervised ranking methods based on importance scores are proposed for keyphrase extraction. In practice, the keyphrases of a document should not only be statistically important in the document, but also have a good coverage of the document. Based on this observation, we propose an unsupervised method for keyphrase extraction. Firstly, the method finds exemplar terms by leveraging clustering techniques, which guarantees the document to be semantically covered by these...

10.3115/1699510.1699544 article EN 2009-01-01

Redesigning Cyber Security Labs with Immediate Feedback

OPENALEX - Publications

Peng Li

We revamped hands-on labs in a cyber security course to be aligned with our college's new initiatives to increase accessibility utilizing "ed-tech" (cloud services, etc.) and the use of learning management systems for real-time assessment of student intervention needs, program outcomes, and continuous improvement. The cyber security field is evolving fast. The current lab environment was outdated. In this project, the virtual labs are updated or recreated. The labs used to employ an old environment consisting of three individual virtual machines. composed...

10.18260/1-2--41300 article EN 2024-02-06

Black-Box Prompt Tuning With Subspace Learning

OPENALEX - Publications

Yuanhang Zheng Zhixing Tan Peng Li Yang Liu

Black-box prompt tuning employs derivative-free optimization algorithms to learn prompts within low-dimensional subspaces rather than back-propagating through the network of Large Language Models (LLMs). Recent studies reveal that black-box prompt tuning lacks versatility across tasks and LLMs, which we believe is related to the suboptimal choice of subspaces. In this paper, we introduce Black-box prompt tuning with Subspace Learning (BSL) to enhance the versatility of black-box prompt tuning. Based on the assumption that nearly optimal prompts for similar tasks reside in a common subspace, we propose...

10.1109/taslp.2024.3407519 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2024-01-01

Exploring Universal Intrinsic Task Subspace for Few-Shot Learning via Prompt Tuning

OPENALEX - Publications

Yujia Qin Xiaozhi Wang Yusheng Su Yankai Lin Ning Ding and 8 more

Why can pre-trained language models (PLMs) learn universal representations and effectively adapt to broad NLP tasks differing a lot superficially? In this work, we empirically find evidence indicating that the adaptations of PLMs to various few-shot tasks can be reparameterized as optimizing only a few free parameters in a unified low-dimensional intrinsic task subspace, which may help us understand why...

10.1109/taslp.2024.3430545 article EN cc-by-nc-nd IEEE/ACM Transactions on Audio Speech and Language Processing 2024-01-01

Learning to Relate to Previous Turns in Conversational Search

OPENALEX - Publications

Fengran Mo Jian‐Yun Nie Kaiyu Huang Kelong Mao Yutao Zhu and 2 more

Conversational search allows a user to interact with a search system in multiple turns. A query is strongly dependent on the conversation context. An effective way to improve retrieval effectiveness is to expand the current query with historical queries. However, not all previous queries are related to, and useful for expanding, the current query. In this paper, we propose a new method to select relevant historical queries that are useful for the current query. To cope with the lack of labeled training data, we use a pseudo-labeling approach to annotate useful historical queries based on their impact on the retrieval results. The pseudo-labeled data are used...

10.1145/3580305.3599411 article EN Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2023-08-04

DocRED: A Large-Scale Document-Level Relation Extraction Dataset

OPENALEX - Publications

Yuan Yao Deming Ye Peng Li Xu Han Yankai Lin and 5 more

Multiple entities in a document generally exhibit complex inter-sentence relations, and cannot be well handled by existing relation extraction (RE) methods that typically focus on extracting intra-sentence relations for single entity pairs. In order to accelerate the research on document-level RE, we introduce DocRED, a new dataset constructed from Wikipedia and Wikidata with three features: (1) DocRED annotates both named entities and relations, and is the largest human-annotated dataset for document-level RE from plain text; (2) DocRED requires reading multiple...

10.48550/arxiv.1906.06127 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Adaptive design and implementation of automatic modulation recognition accelerator

OPENALEX - Publications

Bin Wang Xianglin Wei Chao Wang Junnan Li Xiang Jiao and 2 more

10.1007/s12652-023-04736-0 article EN Journal of Ambient Intelligence and Humanized Computing 2024-01-01

A Geometric Approach to Clustering Based Anomaly Detection for Industrial Applications

OPENALEX - Publications

Peng Li Oliver Niggemann Barbara Hammer

Recent clustering based anomaly detection technologies classify new observations in different ways, e.g. using probability distributions, cluster centers or whole data points. Some of these suffer from a high false classification rate, while others require high computational resources. In this paper, we propose a geometric approach to clustering based anomaly detection, in which the boundaries of clusters are utilized instead. To identify the cluster boundaries, an algorithm for generating n-dimensional non-convex hulls has been developed. The...

10.1109/iecon.2018.8592906 article EN IECON 2018 - 44th Annual Conference of the IEEE Industrial Electronics Society 2018-10-01

Monaural voiced speech segregation based on elaborate harmonic grouping strategy

OPENALEX - Publications

Xueliang Zhang Wenju Liu Peng Li Bo Xu

Monaural speech segregation is a very challenging problem which has been studied by many researchers. In this paper, we focus on voiced speech segregation. Different strategies are used to segregate resolved and unresolved harmonics respectively. For resolved harmonics, the "harmonicity" principle is used; for unresolved harmonics, a novel mechanism based on the "minimum amplitude" principle is employed. Amplitude modulation rate extracted from the "enhanced" autocorrelation function of the envelope is more robust than the previous method. An elaborate rule is also introduced to determine the...

10.1109/icassp.2009.4960670 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2009-04-01

Dynamical correlation between quantum entanglement and intramolecular energy in molecular vibrations: An algebraic approach

OPENALEX - Publications

Hairan Feng Xiangjia Meng Peng Li Yujun Zheng

The dynamical correlation between quantum entanglement and intramolecular energy in realistic molecular vibrations is explored using the Lie algebraic approach. The explicit expression of the entanglement measurement can be achieved using algebraic operations. The common and different characteristics are also provided. The study of small molecular vibrations is helpful for controlling entanglement and further understanding the intramolecular dynamics.

10.1088/1674-1056/23/7/073301 article EN Chinese Physics B 2014-07-01

An active learning method based on mistake sampling for large scale imbalanced classification

OPENALEX - Publications

Jia Guo Xin Wan Hao Lin Peng Li Guannan Liu and 1 more

Nowadays, the challenge of learning from large scale and imbalanced data sets has attracted a great deal of attention from both industry and academia, which is also deemed to be an important task for fraud detection in telecommunication, finance, and online commerce. In general, it is almost impossible to train a classification model on the complete data set, especially in the era of big data, due to the space-time complexity. Thus, how to sample a training set from the original large-scale data set that can provide a more accurate prediction result has become a focal...

10.1109/icsssm.2017.7996301 article EN International Conference on Service Systems and Service Management 2017-06-01

New methods for the construction of Voronoi Diagram and the nearest neighbor query

OPENALEX - Publications

Song Li Liping Zhang Peng Li Deyun Chen

The existing methods for the construction of the Voronoi Diagram and the nearest neighbor query have several disadvantages. In view of these disadvantages, new methods were studied in detail. To construct the Voronoi diagram effectively, the datasets were grouped in advance, then the divide-and-conquer method combined with the increment method was used to generate the Delaunay triangulation. By generating the Voronoi diagram from the Delaunay triangulation, the Creat_Vor algorithm was proposed. In order to effectively update the nearest neighbors of the given points, methods based on spatial grids were studied. VGride_NN and VGride_BNN were put...

10.1109/ifost.2014.6991116 article EN 2014-10-01

Rethinking the Promotion Brought by Contrastive Learning to Semi-Supervised Node Classification

OPENALEX - Publications

Deli Chen Yankai Lin Lei Li Xuancheng Ren Peng Li and 2 more

Graph Contrastive Learning (GCL) has proven highly effective in promoting the performance of Semi-Supervised Node Classification (SSNC). However, existing GCL methods are generally transferred from other fields like CV or NLP, whose underlying working mechanism remains underexplored. In this work, we first deeply probe the working mechanism of GCL in SSNC, and find that the promotion brought by GCL is severely unevenly distributed: the improvement mainly comes from subgraphs with less annotated information, which is fundamentally different...

10.24963/ijcai.2022/395 article EN Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence 2022-07-01

DataGpt-SQL-7B: An Open-Source Language Model for Text-to-SQL

OPENALEX - Publications

Lixia Wu Peng Li Junhong Lou Lei Fu

In addressing the pivotal role of translating natural language queries into SQL commands, we propose a suite of compact, fine-tuned models and self-refine mechanisms to democratize data access and analysis for non-expert users, mitigating risks associated with closed-source Large Language Models. Specifically, we constructed a dataset of over 20K samples for Text-to-SQL as well as a preference dataset, to improve the efficiency in the domain of SQL generation. To further ensure code validity, a code corrector was integrated into the model. Our...

10.48550/arxiv.2409.15985 preprint EN arXiv (Cornell University) 2024-09-24

Black-box Prompt Tuning with Subspace Learning

OPENALEX - Publications

Yuanhang Zheng Zhixing Tan Peng Li Yang Liu

Black-box prompt tuning employs derivative-free optimization algorithms to learn prompts within low-dimensional subspaces rather than back-propagating through the network of Large Language Models (LLMs). Recent studies reveal that black-box prompt tuning lacks versatility across tasks and LLMs, which we believe is related to the suboptimal choice of subspaces. In this paper, we introduce Black-box prompt tuning with Subspace Learning (BSL) to enhance the versatility of black-box prompt tuning. Based on the assumption that nearly optimal prompts for similar tasks reside in a common subspace, we propose...

10.48550/arxiv.2305.03518 preprint EN other-oa arXiv (Cornell University) 2023-01-01

AdaDS: Adaptive data selection for accelerating pre-trained language model knowledge distillation

OPENALEX - Publications

Qinhong Zhou Peng Li Yang Liu Yuyang Guan Qizhou Xing and 3 more

Knowledge distillation (KD) is a widely used method for transferring knowledge from large teacher models to computationally efficient student models. Unfortunately, the computational cost of KD becomes unaffordable as pre-trained language models (PLMs) grow larger. Computing the KD loss on only part of the training set is a promising way to accelerate KD. However, existing works heuristically leverage one static data selection strategy during the KD process, demonstrating inconsistent improvements across different distillation scenarios....

10.1016/j.aiopen.2023.08.005 article EN cc-by-nc-nd AI Open 2023-01-01

Restricted orthogonal gradient projection for continual learning

OPENALEX - Publications

Zeyuan Yang Zonghan Yang Yichen Liu Peng Li Yang Liu

Continual learning aims to avoid catastrophic forgetting and effectively leverage learned experiences to master new knowledge. Existing gradient projection approaches impose hard constraints on the optimization space for new tasks to minimize interference, which simultaneously hinders forward knowledge transfer. To address this issue, recent methods reuse frozen parameters with a growing network, resulting in high computational costs. Thus, it remains a challenge whether we can improve forward knowledge transfer using...

10.1016/j.aiopen.2023.08.010 article EN cc-by-nc-nd AI Open 2023-01-01

ToolRerank: Adaptive and Hierarchy-Aware Reranking for Tool Retrieval

OPENALEX - Publications

Yuanhang Zheng Peng Li Wei Liu Yang Liu Jian Luan and 1 more

Tool learning aims to extend the capabilities of large language models (LLMs) with external tools. A major challenge in tool learning is how to support a large number of tools, including unseen tools. To address this challenge, previous studies have proposed retrieving suitable tools for the LLM based on the user query. However, previously proposed methods do not consider the differences between seen and unseen tools, nor do they take the hierarchy of the tool library into account, which may lead to suboptimal performance for tool retrieval. Therefore, to address the aforementioned issues, we...

10.48550/arxiv.2403.06551 preprint EN arXiv (Cornell University) 2024-03-11
