Yihua Huang

ORCID: 0000-0003-1806-0936
Research Areas
  • Graph Theory and Algorithms
  • Cloud Computing and Resource Management
  • Advanced Graph Neural Networks
  • Topic Modeling
  • Caching and Content Delivery
  • Web Data Mining and Analysis
  • Advanced Data Storage Technologies
  • Data Management and Algorithms
  • Advanced Neural Network Applications
  • Natural Language Processing Techniques
  • Recommender Systems and Techniques
  • IoT and Edge/Fog Computing
  • Parallel Computing and Optimization Techniques
  • Advanced Database Systems and Queries
  • Advanced Text Analysis Techniques
  • Advanced Image and Video Retrieval Techniques
  • Data Mining Algorithms and Applications
  • Scientific Computing and Data Management
  • Domain Adaptation and Few-Shot Learning
  • Text and Document Classification Technologies
  • Adversarial Robustness in Machine Learning
  • Machine Learning and Data Classification
  • Semantic Web and Ontologies
  • Brain Tumor Detection and Classification
  • Algorithms and Data Compression

Nanjing University
2016-2025

Nanjing University of Science and Technology
2009-2024

Sun Yat-sen University
2019-2023

Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai)
2023

Nanchang University
2021-2023

Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou)
2023

Lishui City People's Hospital
2022

Xian Yang Central Hospital
2022

China Telecom (China)
2022

China Telecom
2022

Frequent itemset mining (FIM) is one of the most important techniques to extract knowledge from data in many real-world applications. The Apriori algorithm is widely used for mining frequent itemsets from a transactional dataset. However, the FIM process is both data-intensive and computing-intensive. On one side, large-scale data sets are usually adopted nowadays; on the other side, in order to generate valid information, the algorithm needs to scan the datasets iteratively many times. These make the FIM algorithm very time-consuming over big data. Parallel and distributed computing is effective...

10.1109/ipdpsw.2014.185 article EN 2014-05-01

Nowadays, it is prevalent to train deep learning (DL) models in cloud-native platforms that actively leverage containerization and orchestration technologies for high elasticity, low and flexible operation cost, and many other benefits. However, it also faces new challenges, and our work focuses on those related to I/O throughput for training, including complex data access with complicated performance tuning, lack of cache capacity with specialized hardware to match its dynamic requirement, and inefficient resource sharing...

10.1109/icde53745.2022.00209 article EN 2022 IEEE 38th International Conference on Data Engineering (ICDE) 2022-05-01

Deep learning (DL) is becoming increasingly popular in many domains, including computer vision, speech recognition, self-driving automobiles, etc. GPUs can train DL models efficiently but are expensive, which motivates users to share GPU resources to reduce monetary costs in practice. To ensure efficient sharing among multiple users, it is necessary to develop efficient GPU resource management and scheduling solutions. However, existing ones have several shortcomings. First, they require the users to specify the job resource requirement, which is usually quite...

10.1109/tpds.2021.3138825 article EN publisher-specific-oa IEEE Transactions on Parallel and Distributed Systems 2021-01-01

Recent studies have shown that deep neural network-based recommender systems are vulnerable to adversarial attacks, where attackers can inject carefully crafted fake user profiles (i.e., a set of items that fake users have interacted with) into a target recommender system to achieve malicious purposes, such as promoting or demoting a set of target items. Due to security and privacy concerns, it is more practical to perform adversarial attacks under the black-box setting, where the architecture/parameters and training data of the target system cannot be easily accessed by attackers. However,...

10.1145/3534678.3539359 article EN Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022-08-12

Graph collaborative filtering has achieved great success in capturing users' preferences over items. Despite their effectiveness, graph neural network (GNN)-based methods suffer from data sparsity in real scenarios. Recently, contrastive learning (CL) has been used to address the problem of data sparsity. However, most CL-based methods only leverage the original user-item interaction graph to construct the CL task, lacking explicit exploitation of higher-order information (i.e., user-user and item-item relationships). Even for the method...

10.1145/3539618.3591632 article EN Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023-07-18

Learning on temporal graphs has attracted tremendous research interest due to its wide range of applications. Some works intuitively merge graph neural networks (GNNs) and recurrent neural networks (RNNs) to capture structural and temporal information, and recent works propose to aggregate information from neighbor nodes in local subgraphs based on message passing or random walks. These methods produce node embeddings from a global or local perspective and ignore the complementarity between them, thus facing limitations in capturing complex and entangled dynamic...

10.1109/tnnls.2025.3526944 article EN IEEE Transactions on Neural Networks and Learning Systems 2025-01-01

Session-based recommendation aims to predict the next click action (e.g., item) of anonymous users based on a fixed number of previous actions. Recently, Graph Neural Networks (GNNs) have shown superior performance in various applications. Inspired by the success of GNNs, tremendous endeavors have been devoted to introducing GNNs into session-based recommendation and have achieved significant results. Nevertheless, due to the highly diverse types of potential information in sessions, existing GNN-based methods perform differently on different...

10.1145/3477495.3531940 article EN Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval 2022-07-06

Pleurotus citrinopileatus, a golden oyster mushroom, is popular in Asia and has pharmacological functions. However, the effects of polysaccharide-peptides extracted from Pleurotus citrinopileatus and the underlying mechanism on the digestive system have not yet been clarified. Here, we determined the composition of two polysaccharide-peptides (PSI and PSII) from P. citrinopileatus and investigated their hepatoprotective effects and their protective effects on the gut microbiota. The results showed that PSI and PSII were made up of similar monosaccharide moieties, except for varying ratios. Furthermore, they significantly increased...

10.3389/fcimb.2022.892049 article EN cc-by Frontiers in Cellular and Infection Microbiology 2022-05-20

As the study of graph neural networks becomes more intensive and comprehensive, their robustness and security have received great research interest. The existing global attack methods treat all nodes in the graph as their attack targets. Although existing methods have achieved excellent results, there is still considerable space for improvement. The key problem is that the current approaches rigidly follow the definition of global attacks. They ignore an important issue, i.e., different nodes are not equally resilient to attacks. From an attacker's view, we should arrange the budget...

10.1109/tkde.2024.3364972 article EN IEEE Transactions on Knowledge and Data Engineering 2024-02-14

Artificial neural networks (ANNs) have been proved to be successfully used in a variety of pattern recognition and data mining applications. However, training ANNs on large-scale datasets is both data-intensive and computation-intensive. Therefore, large-scale ANNs are used with reservation for their time-consuming training to get high precision. In this paper, we present cNeural, a customized parallel computing platform to accelerate the backpropagation algorithm. Unlike many existing neural network systems working on thousands of samples, cNeural...

10.1109/bigdata.2013.6691598 article EN 2013-10-01

In the era of big data, the volume of semantic data grows rapidly. The large-scale semantic data contains a lot of significant but often implicit information that needs to be derived by reasoning. The reasoning is a challenging process. On the one hand, traditional single-node reasoning systems can hardly cope with such an amount of data due to resource limitations. On the other hand, existing distributed reasoning systems are not very efficient and scalable due to the complexity of reasoning. In this paper, we propose Cichlid, an efficient distributed reasoning engine for the widely-used RDFS and OWL Horst rule sets. Cichlid is built on top of Spark....

10.1109/ipdps.2015.14 article EN 2015-05-01

Given a small pattern graph and a large data graph, the task of subgraph enumeration is to find all subgraphs of the data graph that are isomorphic to the pattern graph. The state-of-the-art distributed algorithms like SEED and CBF turn subgraph enumeration into a distributed multi-way join problem. They are inefficient in communication as they have to shuffle partial matching results, which are much larger than the data graph itself, during the join. They also spend non-trivial costs on constructing indexes for data graphs. Different from those join-based algorithms, we develop a new backtracking-based framework...

10.1109/icde.2019.00021 article EN 2019 IEEE 35th International Conference on Data Engineering (ICDE) 2019-04-01

Hadoop MapReduce is a widely used parallel computing framework for solving data-intensive problems. To be able to process large-scale datasets, the fundamental design of standard Hadoop places more emphasis on high data throughput than on job execution performance. This causes performance limitations when we use Hadoop MapReduce to execute short jobs that require quick responses. In order to speed up the execution of short jobs, this paper proposes optimization methods to improve the execution performance of MapReduce jobs. We made three major optimizations: first, we reduce the time cost...

10.1109/cgc.2012.40 article EN 2012-11-01

10.1007/s13042-014-0292-7 article EN International Journal of Machine Learning and Cybernetics 2014-08-13

As a new area of machine learning research, the deep learning algorithm has attracted a lot of attention from the research community. It may bring human beings to a higher cognitive level of data. Its unsupervised pre-training step allows us to find high-dimensional representations or abstract features which work much better than the principal component analysis (PCA) method. However, it will face problems when being applied to deal with large-scale data due to its intensive computation from many levels of the training process against the large-scale data. The...

10.1109/ipdpsw.2014.194 article EN 2014-05-01

This paper proposes a stock market prediction method exploiting sentiment analysis using financial microblogs (Sina Weibo). We analyze the microblog texts to find the financial sentiments, then combine the sentiments and the historical data of the Shanghai Composite Index (SH000001) to predict the stock market movements. Our framework includes three modules: Microblog Filter (MF), Sentiment Analysis (SA), and Stock Prediction (SP). The MF module is based on LDA to get the financial microblogs. The SA module first sets up a financial lexicon, then gets the sentiments of the microblogs obtained from the MF module. The SP...

10.1109/ijcnn.2016.7727786 article EN 2016 International Joint Conference on Neural Networks (IJCNN) 2016-07-01

Matrix multiplication is a dominant but very time-consuming operation in many big data analytic applications. Thus its performance optimization is an important and fundamental research issue. The performance of large-scale matrix multiplication on distributed data-parallel platforms is determined by both computation and IO costs. For existing matrix multiplication execution strategies, when the execution concurrency scales up above a threshold, their execution performance deteriorates quickly because the increase of the IO cost outweighs the decrease of the computation cost. This paper presents a novel parallel execution strategy...

10.1109/tpds.2017.2686384 article EN IEEE Transactions on Parallel and Distributed Systems 2017-03-23

Matrix computation is the core of many massive data-intensive analytical applications such as mining social networks, recommendation systems and natural language processing. Due to the importance of matrix computation, it has been widely studied for years. In the Big Data era, as the scale of the matrix grows, traditional single-node matrix computation systems can hardly cope with such large data and computation. Existing distributed matrix computation solutions are still not efficient enough, or have poor fault tolerance and usability. In this paper, we propose Marlin, an efficient distributed matrix computation library...

10.1109/bigdata.2015.7364023 article EN 2015 IEEE International Conference on Big Data (Big Data) 2015-10-01