Tongxuan Liu

ORCID: 0009-0007-2634-2788
Research Areas
  • Recommender Systems and Techniques
  • Advanced Image and Video Retrieval Techniques
  • Advanced Neural Network Applications
  • Topic Modeling
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Computational Techniques and Applications
  • Natural Language Processing Techniques
  • Stochastic Gradient Optimization Techniques
  • Machine Learning in Healthcare
  • Rough Sets and Fuzzy Logic
  • Image Retrieval and Classification Techniques
  • Data Mining Algorithms and Applications

University of Science and Technology of China
2024

Alibaba Group (China)
2021

We present FleetRec, a high-performance and scalable recommendation inference system within tight latency constraints. FleetRec takes advantage of heterogeneous hardware including GPUs and the latest FPGAs equipped with high-bandwidth memory. By disaggregating computation and memory to different types of hardware and bridging their connections by high-speed network, FleetRec gains the best of both worlds, and can naturally scale out by adding nodes to the cluster. Experiments on three production models of up to 114 GB show that FleetRec outperforms optimized...

10.1145/3447548.3467139 article EN 2021-08-12

The popularity of recommendation models and the enhanced AI processing capability of CPUs have provided massive performance opportunities to deliver satisfactory experiences to a large number of users. Unfortunately, existing recommendation model training methods fail to achieve high efficiency due to unique challenges such as dynamic shape and high parallelism. To address the above limitations, we comprehensively study the distinctive characteristics of recommendation models and discover several unexploited optimization opportunities. To exploit such opportunities, we propose...

10.1109/tpds.2024.3381186 article EN IEEE Transactions on Parallel and Distributed Systems 2024-03-25

Deep neural networks are widely used in personalized recommendation systems. Unlike regular DNN inference workloads, recommendation inference is memory-bound due to the many random memory accesses needed to lookup the embedding tables. The inference is also heavily constrained in terms of latency because producing a recommendation for a user must be done in about tens of milliseconds. In this paper, we propose MicroRec, a high-performance inference engine for recommendation systems. MicroRec accelerates recommendation inference by (1) redesigning the data structures involved in the embeddings to reduce the number of lookups needed and (2) taking...

10.48550/arxiv.2010.05894 preprint EN other-oa arXiv (Cornell University) 2020-01-01

10.3929/ethz-b-000470540 article EN 2021-03-15

Large language models (LLM) have recently attracted surging interest due to their outstanding capabilities across various domains. However, enabling efficient LLM inference is challenging due to its autoregressive decoding that generates tokens only one at a time. Although research works apply pruning or quantization to speed up LLM inference, they typically require fine-tuning the LLM, incurring significant time and economic costs. Meanwhile, speculative decoding has been proposed to use small speculative models (SSMs) to accelerate the inference of...

10.48550/arxiv.2402.15678 preprint EN arXiv (Cornell University) 2024-02-23
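
The speculative decoding idea mentioned in the abstract above can be illustrated with a minimal sketch of the generic greedy variant. This is not the method proposed in the paper, whose details are truncated here. The two toy "models" (`target_model`, `draft_model`) and the parameter `k` are hypothetical stand-ins invented for illustration: a small draft model proposes `k` tokens, and the large target model keeps the longest prefix it agrees with.

```python
from typing import Callable, List

# A "model" here is any function mapping a token prefix to its greedy next token.
Model = Callable[[List[int]], int]

def target_model(prefix: List[int]) -> int:
    """Hypothetical large model: next token is the prefix sum modulo 7."""
    return sum(prefix) % 7

def draft_model(prefix: List[int]) -> int:
    """Hypothetical small model: agrees with the target except when the
    prefix length is a multiple of 4, where it is off by one."""
    guess = sum(prefix) % 7
    return guess if len(prefix) % 4 else (guess + 1) % 7

def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    """Plain autoregressive decoding: one model call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(model(seq))
    return seq

def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       n_new: int, k: int = 3) -> List[int]:
    """Greedy speculative decoding: the output matches greedy_decode(target, ...)."""
    seq = list(prompt)
    goal = len(prompt) + n_new
    while len(seq) < goal:
        # 1. The draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(seq + proposal))
        # 2. The target checks each proposed position. A real system would
        #    score all k positions in one batched forward pass.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(seq + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First mismatch: keep the target's token and stop verifying.
                accepted.append(expected)
                break
        else:
            # Every draft token was accepted: the target adds one bonus token.
            accepted.append(target(seq + accepted))
        seq.extend(accepted)
    return seq[:goal]

baseline = greedy_decode(target_model, [1, 2, 3], 10)
speculative = speculative_decode(target_model, draft_model, [1, 2, 3], 10)
print(baseline == speculative)
```

Because every emitted token is the target model's own greedy choice for its prefix, the speculative output is identical to plain greedy decoding. Any speedup comes only from verifying several draft tokens per target pass, which this toy version does not measure.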

In the process of decision tree construction, property division standards directly affect the classification results. Aimed at the weakness of ID3 in nicety grading, we provide the concept of degree of rough as the select criteria of separation property. The method took into account the nicety grading and the dependency between condition attributes and decision attributes. Compared with the traditional method based on entropy, the experiment proved that the decision tree constructed by our method effectively improves...

10.1109/aici.2010.246 article EN 2010-10-01
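
For context on the entropy-based baseline that the abstract above compares against, the following is a minimal sketch of the standard ID3 split criterion (information gain). It does not implement the paper's rough-set criterion, which is not specified in the truncated text. The function names and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence

def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows: List[Dict[str, Hashable]],
                     labels: Sequence[Hashable], attr: str) -> float:
    """Entropy reduction obtained by splitting the rows on attribute `attr`."""
    total = len(labels)
    groups: Dict[Hashable, List[Hashable]] = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted average entropy of the partitions induced by `attr`.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy data (hypothetical): attribute "a" determines the label, "b" does not.
rows = [{"a": "x", "b": "p"}, {"a": "x", "b": "q"},
        {"a": "y", "b": "p"}, {"a": "y", "b": "q"}]
labels = [1, 1, 0, 0]

# ID3 splits on the attribute with the highest information gain.
best = max(["a", "b"], key=lambda attr: information_gain(rows, labels, attr))
print(best)
```

In ID3 this gain is recomputed at every node over the remaining attributes. The abstract describes replacing this entropy-based selection step with a criterion based on the degree of rough.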