Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
FOS: Computer and information sciences
Computer Science - Machine Learning
Computer Science - Computation and Language
0202 electrical engineering, electronic engineering, information engineering
02 engineering and technology
Computation and Language (cs.CL)
Machine Learning (cs.LG)
DOI:
10.48550/arxiv.1912.11637
Publication Date:
2019-01-01
AUTHORS (6)
ABSTRACT
The self-attention based Transformer has demonstrated state-of-the-art performance on a number of natural language processing tasks. Self-attention is able to model long-term dependencies, but it may suffer from the extraction of irrelevant information in the context. To tackle this problem, we propose a novel model called \textbf{Explicit Sparse Transformer}. Explicit Sparse Transformer improves the concentration of attention on the global context through an explicit selection of the most relevant segments. Extensive experimental results on a series of natural language processing and computer vision tasks, including neural machine translation, image captioning, and language modeling, all demonstrate the advantages of Explicit Sparse Transformer in model performance. We also show that our proposed sparse attention method achieves comparable or better results than the previous sparse attention method, but significantly reduces training and testing time. For example, the inference speed is twice that of the sparsemax model. Code will be available at \url{https://github.com/lancopku/Explicit-Sparse-Transformer}
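The "explicit selection" described in the abstract amounts to keeping only the top-k attention scores per query and masking out the rest before the softmax. Below is a minimal sketch of such top-k scaled dot-product attention in PyTorch; the function name sparse_attention, the tensor shapes, and the default k are illustrative assumptions and do not reproduce the authors' released implementation.

    # Sketch: top-k ("explicit sparse") scaled dot-product attention.
    # Assumed shapes: q, k, v are (batch, heads, seq_len, head_dim).
    import math
    import torch
    import torch.nn.functional as F

    def sparse_attention(q, k, v, topk=8):
        # Standard scaled dot-product scores.
        scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
        # Keep only the top-k scores per query; mask the rest to -inf so the
        # softmax assigns them zero weight, concentrating attention.
        topk = min(topk, scores.size(-1))
        kth_score = scores.topk(topk, dim=-1).values[..., -1:]  # k-th largest per query
        masked = scores.masked_fill(scores < kth_score, float("-inf"))
        weights = F.softmax(masked, dim=-1)
        return torch.matmul(weights, v)

    # Usage on toy data (self-attention, so q = k = v here).
    q = torch.randn(2, 4, 10, 16)
    out = sparse_attention(q, q, q, topk=5)
    print(out.shape)  # torch.Size([2, 4, 10, 16])

Unlike sparsemax-style methods, this selection requires only a top-k operation and a mask on top of standard attention, which is consistent with the reduced training and inference time claimed in the abstract.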