Universal Representation for Code

Keywords: Python; Code (set theory); Universal code; Discriminative model; Control flow graph; Representation

DOI: 10.48550/arxiv.2103.03116
Publication Date: 2021-01-01

AUTHORS (4)

Linfeng Liu

Hoan Anh Nguyen

George Karypis

Srinivasan H. Sen...

ABSTRACT

Learning from source code usually requires a large amount of labeled data. Despite the possible scarcity of labeled data, the trained model is highly task-specific and lacks transferability to different tasks. In this work, we present effective pre-training strategies on top of a novel graph-based code representation, to produce universal representations for code. Specifically, our graph-based representation captures important semantics between code elements (e.g., control flow and data flow). We pre-train graph neural networks on the representation to extract universal code properties. The pre-trained model then enables the possibility of fine-tuning to support various downstream applications. We evaluate our model on two real-world datasets -- spanning over 30M Java methods and 770K Python methods. Through visualization, we reveal discriminative properties in our universal code representation. By comparing multiple benchmarks, we demonstrate that the proposed framework achieves state-of-the-art results on method name prediction and code graph link prediction.
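To make the idea of a graph that mixes control flow and data flow concrete, the following is a minimal sketch in Python. It is not the representation used in the paper: the function name `build_statement_graph`, the edge labels, and the restriction to straight-line code are all assumptions made for illustration. Each top-level statement becomes a node, consecutive statements are joined by a control-flow edge, and a data-flow edge runs from the statement that last assigned a variable to each later statement that reads it.

```python
import ast


def build_statement_graph(source):
    """Build a toy graph over the top-level statements of `source`.

    Assumption (illustrative only): the code is straight-line, with no
    branches or loops, so statement i always runs before statement i + 1.
    """
    statements = ast.parse(source).body
    nodes = [ast.unparse(stmt) for stmt in statements]  # needs Python 3.9+
    edges = []  # (source_index, target_index, edge_type)

    # Control-flow edges between consecutive statements.
    for i in range(len(statements) - 1):
        edges.append((i, i + 1, "control_flow"))

    # Data-flow edges: last write of a variable -> later read of it.
    last_write = {}  # variable name -> index of the statement that last assigned it
    for i, stmt in enumerate(statements):
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        # Handle reads before writes so `x = x + 1` links to the earlier write.
        for name in names:
            if isinstance(name.ctx, ast.Load) and name.id in last_write:
                edge = (last_write[name.id], i, "data_flow")
                if edge not in edges:  # avoid duplicates for repeated reads
                    edges.append(edge)
        for name in names:
            if isinstance(name.ctx, ast.Store):
                last_write[name.id] = i
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_statement_graph("a = 1\nb = a + 2\nc = a * b\n")
    for src, dst, kind in edges:
        print(f"{nodes[src]!r} -[{kind}]-> {nodes[dst]!r}")
```

A typed edge list of this kind is the sort of input a graph neural network library could consume after nodes are mapped to feature vectors. Real control flow (branches, loops, calls) would need a proper control flow graph rather than the sequential assumption used here.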