Towards Automated Classification of Code Review Feedback to Support Analytics
Code review
DOI:
10.48550/arxiv.2307.03852
Publication Date:
2023-01-01
AUTHORS (6)
ABSTRACT
Background: As improving code review (CR) effectiveness is a priority for many software development organizations, projects have deployed CR analytics platforms to identify potential improvement areas. The number of issues identified, which is a crucial metric to measure CR effectiveness, can be misleading if all issues are placed in the same bin. Therefore, a finer-grained classification of issues identified during CRs can provide actionable insights to improve CR effectiveness. Although recent work by Fregnan et al. proposed automated models to classify CR-induced changes, we noticed two possible improvement areas -- i) classifying comments that do not induce changes and ii) using deep neural networks (DNN) in conjunction with code context to improve performance. Aims: This study aims to develop an automated CR comment classifier that leverages DNN models to achieve more reliable performance than Fregnan et al. Method: Using a manually labeled dataset of 1,828 CR comments, we trained and evaluated supervised learning-based DNN models leveraging code context, comment text, and a set of code metrics to classify CR comments into one of the five high-level categories proposed by Turzo and Bosu. Results: Based on our 10-fold cross-validation-based evaluations of multiple combinations of tokenization approaches, we found a model leveraging CodeBERT achieving the best accuracy of 59.3%. Our approach outperforms Fregnan et al.'s approach by achieving 18.7% higher accuracy. Conclusion: Besides facilitating improved CR analytics, our proposed model can be useful for developers in prioritizing code review feedback and selecting reviewers.
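The pipeline the abstract describes can be sketched as a supervised text classifier mapping review comments to five high-level categories. The sketch below uses TF-IDF features with a linear model as an illustrative baseline; the paper's best-performing model instead fine-tunes CodeBERT on comment text plus surrounding code context. The category names and sample comments here are illustrative placeholders, not the paper's actual labels or data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy review comments with hypothetical high-level category labels
# (the paper uses the five categories of Turzo and Bosu on 1,828
# manually labeled comments; these examples are made up).
comments = [
    "Please rename this variable to something clearer.",
    "This loop will overflow when n is large.",
    "Add a unit test covering the null case.",
    "Why not reuse the existing helper here?",
    "Nice catch, thanks for fixing this!",
    "Typo in the comment above the function.",
]
labels = [
    "documentation", "functional", "validation",
    "refactoring", "discussion", "documentation",
]

# TF-IDF text features feeding a linear classifier; a DNN approach
# would replace this with CodeBERT embeddings of comment + context.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(comments, labels)

# Predict the category of a new, unseen review comment.
pred = clf.predict(["Fix the typo in this docstring."])[0]
print(pred)
```

In the paper's setting, accuracy would be estimated with 10-fold cross-validation over the labeled dataset rather than a single train/predict pass.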