Modeling low- and high-order feature interactions with FM and self-attention network

Feature (linguistics) Feature Engineering Feature vector
DOI: 10.1007/s10489-020-01951-6 Publication Date: 2020-11-09T01:04:45Z
ABSTRACT
Click-Through Rate (CTR) prediction has always been a very popular topic. In many online applications, such as online advertising and product recommendation, a small increase in CTR will bring great returns. However, CTR prediction has always faced several challenges. A large number of users and items and the different sizes of the feature space of different data types lead to high-dimensional and sparse input, and high-order feature interactions rely too much on expert knowledge and are very time-consuming. In this paper, we build a novel model called multi-order interactive features aware factorization machine (MoFM) for CTR prediction. To effectively capturing both low-order and high-order interactive features, three different types of prediction models are integrated, of which logistic regression (LR) and factorization machine (FM) model the original features and 2-order interactive features respectively, and a multi-head self-attention network with residual connections is used to automatically identify high-value high-order feature combinations. There is also an embedding layer in the model to realize a unified embedding processing of different data types, avoiding diversification, sparsity, and high dimensionality of features. Since, feature engineering is not required, we can carry out end-to-end model learning. Experiments on three public datasets show the superiority of the proposed model over the state-of-the-art models, and the flexibility and scalability of the model structure have also been verified.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES (26)
CITATIONS (21)
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....