NFDI4DS | UHH-SEMS - Publication Details

Shuangyin Li

ORCID: 0000-0001-6404-3438

About

Contact & Profiles

A5079920323

Research Areas

Topic Modeling
Natural Language Processing Techniques
Advanced Text Analysis Techniques
Domain Adaptation and Few-Shot Learning
Advanced Graph Neural Networks
Genomic variations and chromosomal abnormalities
Prenatal Screening and Diagnostics
Multimodal Machine Learning Applications
Text and Document Classification Technologies
Speech and dialogue systems
Data Quality and Management
Speech Recognition and Synthesis
Security and Verification in Computing
Data Mining Algorithms and Applications
Gene expression and cancer classification
Advanced Malware Detection Techniques
Web Data Mining and Analysis
Semantic Web and Ontologies
Recommender Systems and Techniques
Blockchain Technology Applications and Security
Cancer Genomics and Diagnostics
Epigenetics and DNA Methylation
Cloud Computing and Resource Management
Service-Oriented Architecture and Web Services
Computational and Text Analysis Methods

South China Normal University
2019-2025

University of Hong Kong
2016-2023

Hong Kong University of Science and Technology
2016-2023

Key Laboratory of Guangdong Province
2022

Guangdong University of Foreign Studies
2019

National Police Academy
2016

Sun Yat-sen University
2013

Incorporating GAN for Negative Sampling in Knowledge Representation Learning

OPENALEX - Publications

Peifeng Wang Shuangyin Li Rong Pan

Knowledge representation learning aims at modeling knowledge graphs by encoding entities and relations into a low-dimensional space. Most of the traditional works for embedding need negative sampling to minimize a margin-based ranking loss. However, those works construct negative samples through a random mode, which are often too trivial to fit the model efficiently. In this paper, we propose a novel framework based on Generative Adversarial Networks (GAN). In this GAN-based framework, we take advantage of a generator to obtain...

10.1609/aaai.v32i1.11536 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-25

Personalizing a Dialogue System With Transfer Reinforcement Learning

Kaixiang Mo Yu Zhang Shuangyin Li Jiajun Li Qiang Yang

It is difficult to train a personalized task-oriented dialogue system because the data collected from each individual is often insufficient. Personalized dialogue systems trained on a small dataset are likely to overfit and make it difficult to adapt to different user needs. One way to solve this problem is to consider a collection of multiple users as a source domain and an individual user as a target domain, and to perform transfer learning from the source domain to the target domain. By following this idea, we propose the PErsonalized Task-oriented diALogue (PETAL) system, a transfer reinforcement learning framework based on POMDP,...

10.1609/aaai.v32i1.11938 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2018-04-27

A Dataset for Investigations of Amine-Impregnated Solid Adsorbent for Direct Air Capture

Eryu Wang Liping Luo Jiachuan Wang Jian Dai Shuangyin Li and 2 more

10.1038/s41597-025-05037-1 article EN cc-by-nc-nd Scientific Data 2025-05-01

Incorporating Graph Attention Mechanism into Knowledge Graph Reasoning Based on Deep Reinforcement Learning

Heng Wang Shuangyin Li Rong Pan Mingzhi Mao

Heng Wang, Shuangyin Li, Rong Pan, Mingzhi Mao. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1264 article EN cc-by 2019-01-01

Adaptive cross-contextual word embedding for word polysemy with unsupervised topic modeling

Shuangyin Li Rong Pan Haoyu Luo Xiao Liu Gansen Zhao

Because of its efficiency, word embedding has been widely used in many natural language processing and text modeling tasks. It aims to represent each word by a vector such that the geometry between these vectors can capture the semantic correlations between words. An ambiguous word can often have diverse meanings in different contexts, a quality which is called polysemy. The bulk of studies aimed to generate only one single embedding for each word, whereas a few studies have made a small number of embeddings to present the different meanings of each word. However, it is hard to determine the exact...

10.1016/j.knosys.2021.106827 article EN cc-by-nc-nd Knowledge-Based Systems 2021-02-21

EtherGIS: A Vulnerability Detection Framework for Ethereum Smart Contracts Based on Graph Learning Features

Qingren Zeng Jiahao He Gansen Zhao Shuangyin Li Jingji Yang and 2 more

The financial property of Ethereum makes smart contract attacks frequently bring about tremendous economic loss. Methods for effective detection of vulnerabilities in contracts are imperative. Existing efforts for security analysis heavily rely on rigid rules defined by experts, which are labor-intensive and non-scalable. There is still a lack of effort that considers combining expert-defined patterns with deep learning. This paper proposes EtherGIS, a vulnerability detection framework that utilizes graph neural networks...

10.1109/compsac54236.2022.00277 article EN 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC) 2022-06-01

Incorporating GAN for Negative Sampling in Knowledge Representation Learning

Peifeng Wang Shuangyin Li Rong Pan

10.48550/arxiv.1809.11017 preprint EN other-oa arXiv (Cornell University) 2018-01-01

KeepEdge: A Knowledge Distillation Empowered Edge Intelligence Framework for Visual Assisted Positioning in UAV Delivery

Haoyu Luo Tianxiang Chen Xuejun Li Shuangyin Li Chong Zhang and 2 more

The Unmanned Aerial Vehicle (UAV) delivery service is being increasingly used in logistics. However, it is challenging for a UAV to precisely identify the position for parcel delivery if only aided by GPS, especially in some complex environments with weak signals and high interference. For this issue, we present a knowledge distillation empowered edge intelligence architecture, KeepEdge, to achieve visual information-assisted positioning services. Specifically, we integrate deep neural networks (DNN) into...

10.1109/tmc.2022.3157957 article EN cc-by IEEE Transactions on Mobile Computing 2022-03-09

On Leveraging Large Language Models for Enhancing Entity Resolution

H. Li Longyu Feng Shuangyin Li Fei Hao Chen Zhang and 2 more

Entity resolution, the task of identifying and consolidating records that pertain to the same real-world entity, plays a pivotal role in various sectors such as e-commerce, healthcare, and law enforcement. The emergence of Large Language Models (LLMs) like GPT-4 has introduced a new dimension to this task, leveraging their advanced linguistic capabilities. This paper explores the potential of LLMs in the entity resolution process, shedding light on both their advantages and the computational complexities associated with large-scale...

10.48550/arxiv.2401.03426 preprint EN cc-by arXiv (Cornell University) 2024-01-01

Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference

Qingfa Xiao Jiachuan Wang Haoyang Li Cheng Deng Jiaqi Tang and 4 more

Recent advances in large language models (LLMs) have showcased exceptional performance in long-context tasks, while facing significant inference efficiency challenges with limited GPU memory. Existing solutions first proposed the sliding-window approach to accumulate a set of historical key-value (KV) pairs for reuse, then further improvements selectively retain its subsets at each step. However, due to the sparse attention distribution across a long context, it is hard to identify and recall...

10.48550/arxiv.2502.13542 preprint EN arXiv (Cornell University) 2025-02-19

A novel chromosome cluster types identification method using ResNeXt WSL model

Chengchuang Lin Gansen Zhao Aihua Yin Zhirong Yang Li Guo and 5 more

10.1016/j.media.2020.101943 article EN Medical Image Analysis 2020-12-25

bi-directional Bayesian probabilistic model based hybrid grained semantic matchmaking for Web service discovery

Shuangyin Li Haoyu Luo Gansen Zhao Mingdong Tang Liu Xiao

Web service discovery is a fundamental task in service-oriented architectures which searches for suitable web services based on users’ goals and preferences. In this paper, we present a novel approach that can support user queries with various-size-grained text elements. Compared with existing approaches that only support semantics matchmaking at a single texture granularity (either word level or paragraph level), our approach enables the requester to search with any type of query content with high performance, including...

10.1007/s11280-022-01004-7 article EN cc-by World Wide Web 2022-02-17

MemoryPath: A deep reinforcement learning framework for incorporating memory component into knowledge graph reasoning

Shuangyin Li Heng Wang Rong Pan Mingzhi Mao

10.1016/j.neucom.2020.08.032 article EN Neurocomputing 2020-08-29

Bi-Directional Recurrent Attentional Topic Model

Shuangyin Li Yu Zhang Rong Pan

In a document, the topic distribution of a sentence depends on both the topics of its neighbored sentences and its own content, and it is usually affected by the topics of the neighbored sentences with different weights. The neighbored sentences include the preceding and subsequent sentences. Meanwhile, it is natural that a document can be treated as a sequence of sentences. Most existing works for Bayesian document modeling do not take these points into consideration. To fill this gap, we propose a bi-Directional Recurrent Attentional Topic Model (bi-RATM) for document embedding. The bi-RATM not only takes advantage of the sequential...

10.1145/3412371 article EN ACM Transactions on Knowledge Discovery from Data 2020-09-28

Length Adaptive Recurrent Model for Text Classification

Zhengjie Huang Zi Ye Shuangyin Li Rong Pan

In recent years, recurrent neural networks have been widely used for various text classification tasks. However, most of the recurrent architectures will not assign a class label to a text until they read the last word, while human beings are able to determine the class before reading the whole text. In this paper, we propose a Length Adaptive Recurrent Model (LARM) which can automatically determine the minimum text length that is necessary to perform the classification. With three parts including Reader, Predictor and Agent, our model is designed to read a text word by word and...

10.1145/3132847.3132947 article EN 2017-11-06

Prediction of COVID-19 Using a WOA-BILSTM Model

Xinyue Yang Shuangyin Li

The COVID-19 pandemic has had a significant impact on the world, highlighting the importance of accurate prediction of infection numbers. Given that the transmission of SARS-CoV-2 is influenced by temporal and spatial factors, numerous researchers have employed neural networks to address this issue. Accordingly, we propose a whale optimization algorithm-bidirectional long short-term memory (WOA-BILSTM) model for predicting cumulative confirmed cases. In the model, we initially input regional epidemic data,...

10.3390/bioengineering10080883 article EN cc-by Bioengineering 2023-07-25

Mashup-oriented API recommendation via pre-trained heterogeneous information networks

Mingdong Tang Fenfang Xie Sixian Lian J. Mai Shuangyin Li

10.1016/j.infsof.2024.107428 article EN Information and Software Technology 2024-02-28

Tag-Weighted Dirichlet Allocation

Shuangyin Li Guan Huang Ruiyang Tan Rong Pan

In the past two decades, there has been a huge amount of document data with rich tag information during the evolution of the Internet, which can be called semi-structured data. These semi-structured data contain both unstructured features (e.g., plain text) and metadata, such as tags in html files or author and venue information in research articles. It's of great interest to model this kind of data. Most previous works focused on modeling the unstructured data. Some other methods have been proposed to model the unstructured data with specific tags. To build a general model for semi-structured documents remains an important problem in terms of...

10.1109/icdm.2013.11 article EN 2013-12-01

Personalizing a Dialogue System with Transfer Reinforcement Learning

Kaixiang Mo Shuangyin Li Yu Zhang Jiajun Li Qiang Yang

It is difficult to train a personalized task-oriented dialogue system because the data collected from each individual is often insufficient. Personalized dialogue systems trained on a small dataset can overfit and make it difficult to adapt to different user needs. One way to solve this problem is to consider a collection of multiple users' data as a source domain and an individual user's data as a target domain, and to perform transfer learning from the source domain to the target domain. By following this idea, we propose "PETAL" (PErsonalized Task-oriented diALogue), a transfer-learning framework based on POMDP...

10.48550/arxiv.1610.02891 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Recurrent Attentional Topic Model

Shuangyin Li Yu Zhang Rong Pan Mingzhi Mao Yang Yang

In a document, the topic distribution of a sentence depends on both the topics of the preceding sentences and its own content, and it is usually affected by the topics of the preceding sentences with different weights. It is natural that a document can be treated as a sequence of sentences. Most existing works for Bayesian document modeling do not take these points into consideration. To fill this gap, we propose a Recurrent Attentional Topic Model (RATM) for document embedding. The RATM not only takes advantage of the sequential orders among sentences but also uses the attention mechanism to model...

10.1609/aaai.v31i1.10972 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2017-02-12

ChromosomeNet: A massive dataset enabling benchmarking and building basedlines of clinical chromosome classification

Chengchuang Lin Hanbiao Chen Jie‐Sheng Huang Jing Peng Li Guo and 5 more

10.1016/j.compbiolchem.2022.107731 article EN Computational Biology and Chemistry 2022-07-16

Spring-back analysis in the cold-forming process of ship hull plates

Wei Shen Renjun Yan Shuangyin Li Lin Xu

10.1007/s00170-018-1741-3 article EN The International Journal of Advanced Manufacturing Technology 2018-02-21

Usr-mtl: an unsupervised sentence representation learning framework with multi-task learning

Wenshen Xu Shuangyin Li Yonghe Lu

10.1007/s10489-020-02042-2 article EN Applied Intelligence 2020-11-14
