About
Contact & Profiles
Research Areas
- Advanced Image and Video Retrieval Techniques
- Ionic liquids properties and applications
- Advanced Vision and Imaging
- Topic Modeling
- Phase Equilibria and Thermodynamics
- Natural Language Processing Techniques
- Ferroelectric and Negative Capacitance Devices
- Chemical Thermodynamics and Molecular Structure
- Machine Learning and Algorithms
- Robotics and Sensor-Based Localization
- Sparse and Compressive Sensing Techniques
- Speech Recognition and Synthesis
- Thermodynamic properties of mixtures
Megvii (China)
2022
Hebei University of Science and Technology
2009-2011
10.18653/v1/2024.emnlp-main.197
article
EN
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
2024-01-01
10.1016/j.theochem.2009.08.028
article
EN
Journal of Molecular Structure THEOCHEM
2009-09-01
10.1016/j.molliq.2011.10.009
article
EN
Journal of Molecular Liquids
2011-10-28
Large Language Models (LLMs) from the GPT family have become extremely popular, leading to a race towards reducing their inference costs to allow for efficient local computation. Yet, the vast majority of existing work focuses on weight-only quantization, which can reduce runtime costs in the memory-bound one-token-at-a-time generative setting, but does not address them in compute-bound scenarios, such as batched or prompt processing. In this paper, we address the general quantization problem, where both weights and...
10.48550/arxiv.2310.09259
preprint
EN
other-oa
arXiv (Cornell University)
2023-01-01