- Digital Marketing and Social Media
- Natural Language Processing Techniques
- Algorithms and Data Compression
- Web Data Mining and Analysis
- Power Systems and Technologies
- Recommender Systems and Techniques
Microsoft Research Asia (China)
2022-2023
Deploying pre-trained transformer models like BERT on downstream tasks in resource-constrained scenarios is challenging due to their high inference cost, which grows rapidly with input sequence length. In this work, we propose ToP, a constraint-aware and ranking-distilled token pruning method that selectively removes unnecessary tokens as the input sequence passes through the layers, allowing the model to improve online inference speed while preserving accuracy. ToP overcomes the limitation of the inaccurate importance ranking in conventional...
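A minimal sketch of the general idea, assuming a PyTorch encoder (this is not the authors' ToP implementation): after each layer, the tokens ranked least important are dropped, so later layers process shorter sequences. The hidden-state-norm heuristic used for ranking here is a placeholder; ToP instead distills more accurate importance rankings, as the abstract notes.

```python
import torch
import torch.nn as nn

class PruningEncoder(nn.Module):
    """Toy encoder that prunes tokens between layers (illustrative only)."""

    def __init__(self, d_model=64, n_layers=4, keep_ratio=0.7):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(n_layers)
        )
        self.keep_ratio = keep_ratio  # fraction of tokens kept after each layer

    def forward(self, x):
        # x: (batch, seq_len, d_model); token 0 plays the role of [CLS]
        for layer in self.layers:
            x = layer(x)
            k = max(1, int(x.size(1) * self.keep_ratio))
            # Placeholder importance score: L2 norm of each token's hidden state.
            scores = x.norm(dim=-1)          # (batch, seq_len)
            scores[:, 0] = float("inf")      # never prune the [CLS] token
            # Keep the top-k tokens, preserving their original order.
            idx = scores.topk(k, dim=1).indices.sort(dim=1).values
            x = x.gather(1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))
        return x[:, 0]                       # [CLS] representation

enc = PruningEncoder()
out = enc(torch.randn(2, 128, 64))  # 128 tokens in; each layer sees fewer
print(out.shape)                    # torch.Size([2, 64])
```

Because each pruning step shrinks the sequence, the quadratic self-attention cost of every subsequent layer drops, which is where the online speedup comes from.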
In sponsored search engines, pre-trained language models have shown promising performance improvements on Click-Through-Rate (CTR) prediction. A widely used approach for utilizing them in CTR prediction consists of fine-tuning the models with click labels and early stopping at the peak value of the obtained Area Under the ROC Curve (AUC). Thereafter, the output of these fine-tuned models, i.e., the final score or an intermediate embedding generated by the model, is fed as a new Natural Language Processing (NLP) feature into the baseline. This...
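A hedged sketch of the workflow described above, not the paper's code: `model`, `train_step`, and `predict_proba` are hypothetical placeholders for a PyTorch-style language model. Training stops once validation AUC stops improving, and the best checkpoint's score is then exported as the extra NLP feature.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def finetune_with_auc_early_stopping(model, train_batches, val_batches,
                                     patience=3, max_epochs=100):
    """Fine-tune on click labels, stopping at the peak validation AUC.

    `model` is assumed to expose train_step()/predict_proba() (hypothetical
    placeholders) plus PyTorch's state_dict()/load_state_dict().
    """
    best_auc, best_state, bad_epochs = 0.0, None, 0
    for epoch in range(max_epochs):
        for batch in train_batches:
            model.train_step(batch)  # one gradient update on click labels
        # Evaluate AUC on held-out traffic after each epoch.
        scores = np.concatenate([model.predict_proba(b) for b in val_batches])
        labels = np.concatenate([b["click"] for b in val_batches])
        auc = roc_auc_score(labels, scores)
        if auc > best_auc:
            best_auc, best_state, bad_epochs = auc, model.state_dict(), 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # AUC has peaked; stop fine-tuning
                break
    model.load_state_dict(best_state)  # restore the peak-AUC checkpoint
    return model

# The fine-tuned model's score then joins the baseline's feature set, e.g.:
# ctr_features = np.column_stack([baseline_features, model.predict_proba(ads)])
```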