An open dataset for the evolution of oracle bone characters: EVOBC
DOI: 10.48550/arxiv.2401.12467
Publication Date: 2024-01-01
AUTHORS (9)
ABSTRACT
The earliest extant Chinese characters originate from oracle bone inscriptions, which are closely related to other East Asian languages. These inscriptions hold immense value for anthropology and archaeology. However, deciphering oracle bone script remains a formidable challenge, with only approximately 1,600 of the over 4,500 extant characters elucidated to date. Further scholarly investigation is required to comprehensively understand this ancient writing system. Artificial Intelligence technology is a promising avenue for deciphering oracle bone characters, particularly concerning their evolution. However, one of the challenges is the lack of datasets mapping the evolution of these characters over time. In this study, we systematically collected ancient characters from authoritative texts and websites spanning six historical stages: Oracle Bone Characters - OBC (15th century B.C.), Bronze Inscriptions - BI (13th century B.C. to 221 B.C.), Seal Script - SS (11th to 8th centuries B.C.), Spring and Autumn period Characters - SAC (770 to 476 B.C.), Warring States period Characters - WSC (475 B.C. to 221 B.C.), and Clerical Script - CS (221 B.C. to 220 A.D.). Subsequently, we constructed an extensive dataset, namely EVolution of Oracle Bone Characters (EVOBC), consisting of 229,170 images representing 13,714 distinct character categories. We conducted validation and simulated deciphering on the constructed dataset, and the results demonstrate its high efficacy in aiding the study of oracle bone script. This openly accessible dataset aims to digitalize ancient Chinese scripts across multiple eras, facilitating the decipherment of oracle bone script by examining the evolution of glyph forms.