NFDI4DS | UHH-SEMS - Publication Details

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); FOS: Computer and information sciences

DOI: 10.48550/arxiv.2406.11931 Publication Date: 2024-01-01


AUTHORS (40)

DeepSeek-AI

Zhu, Qihao

Guo, Daya

Shao, Zhihong

Yang, Dejian

Wang, Peiyi

Xu, Runxin

Wu, Y.

Li, Yukun

Gao, Huazuo

Ma, Shirong

Zeng, Wangding

Bi, Xiao

Gu, Zihui

Xu, Hanwei

Dai, Damai

Dong, Kai

Zhang, Liyue

Piao, Yishi

Gou, Zhibin

Xie, Zhenda

Hao, Zhewen

Wang, Bingxuan

Song, Junxiao

Chen, Deli

Xie, Xin

Guan, Kang

You, Yuxiang

Liu, Aixin

Du, Qiushi

Gao, Wenjun

Lu, Xuan

Chen, Qinyu

Wang, Yaohui

Deng, Chengqi

Li, Jiashi

Zhao, Chenggang

Ruan, Chong

Luo, Fuli

Liang, Wenfeng

ABSTRACT

We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with an additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2, while maintaining comparable performance in general language tasks. Compared to DeepSeek-Coder-33B, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as in reasoning and general capabilities. Additionally, DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, while extending the context length from 16K to 128K. In standard benchmark evaluations, DeepSeek-Coder-V2 achieves superior performance compared to closed-source models such as GPT4-Turbo, Claude 3 Opus, and Gemini 1.5 Pro in coding and math benchmarks.

