Star 历史趋势
数据来源: GitHub API · 生成自 Stargazers.cn
README.md

Chronicles-OCR

A Cross-Temporal Perception Benchmark for the Evolutionary Trajectory of Chinese Characters

中文版PaperGitHubHuggingFaceModelScope

Overview

Chronicles-OCR is the first comprehensive benchmark specifically designed to evaluate the cross-temporal visual perception capabilities of VLLMs across the complete evolutionary trajectory of Chinese characters — the "Seven Chinese Scripts".

Curated in collaboration with top-tier institutional domain experts (the Key Laboratory of Oracle Bone Inscription Information Processing at Anyang Normal University and the Palace Museum), the dataset comprises 2,800 strictly balanced images encompassing highly diverse physical media, ranging from tortoise shells to paper-based calligraphy.

Chronicles-OCR Overview

The Seven Chinese Scripts

The "Seven Chinese Scripts" (汉字七体) refer to the seven canonical script forms that emerged throughout the evolution of Chinese characters over more than 5,000 years:

  1. Oracle Bone Script (甲骨文) — The earliest known mature Chinese writing system, carved on tortoise shells and animal bones during the Shang Dynasty. Characters feature strong pictographic qualities with thin, angular strokes and unstandardized layouts.
  2. Bronze Script (金文) — Cast on ceremonial bronze vessels during the Shang and Zhou Dynasties. Strokes are thicker and rounder, with progressively more regularized and aesthetic structures.
  3. Seal Script (篆书) — Standardized after the Qin unification of China. Features pronounced curvilinear symmetry and fixed structural patterns, marking the transition from regional variants to a unified writing system.
  4. Clerical Script (隶书) — Emerged during the Qin–Han transition, flattening characters and replacing curves with angular strokes. Represents a critical turning point — the watershed between ancient and modern Chinese characters.
  5. Regular Script (楷书) — Established in the late Han and Wei-Jin periods with strict square structures and standardized strokes. Remains the dominant formal script to this day.
  6. Cursive Script (草书) — Developed for rapid, informal writing. Uses continuous, connected strokes that often eliminate independent character boundaries, ranging from the restrained Zhang Cao to the unbounded Kuang Cao.
  7. Running Script (行书) — A fluid yet legible intermediate style between Regular and Cursive scripts, widely used from the Eastern Han Dynasty onward. Wang Xizhi's Preface to the Orchid Pavilion is its most celebrated exemplar.

Among these, the first five (Oracle Bone → Regular) successively served as formal writing systems in their respective eras, while Cursive and Running scripts developed primarily as auxiliary styles for informal and rapid writing.

Benchmark Statistics

ItemDetails
Total Images2,800 (400 per script × 7 scripts)
Script CoverageAll Seven Chinese Scripts
AnnotationStage-Adaptive: character-level for archaic, paragraph-level for mature scripts
Expert PartnersAnyang Normal University (Oracle Bone), Palace Museum (Clerical–Cursive)
Tasks4 evaluation tasks

Evaluation Tasks

TaskShort NameScopeMetric
Cross-period Character SpottingSpottingOracle Bone, Bronze, SealF1 @ IoU > 0.75
Fine-grained Archaic Character RecognitionRecognitionOracle Bone, Bronze, SealExact-match Accuracy
Ancient Text ParsingParsingAll Seven Scripts1 − NED (Levenshtein)
Script ClassificationClassificationAll Seven ScriptsAccuracy

🏆 Leaderboard

Archaic Scripts (Oracle Bone, Bronze, Seal)

ModelThinkAvg Spot.Avg Fine.Avg Pars.Avg Class.OB Spot.OB Fine.OB Pars.OB Class.Br Spot.Br Fine.Br Pars.Br Class.Se Spot.Se Fine.Se Pars.Se Class.
Open-Source Models
InternVL3.5-8B0.16.00.0756.70.01.10.0186.20.02.20.037.00.214.50.1777.0
InternVL3.5-A28B0.515.70.1379.00.02.50.0296.30.47.80.0879.21.036.80.2961.5
Qwen2.5-VL-7B0.07.40.0771.80.04.00.0293.80.04.50.0422.50.013.80.1499.2
Qwen2.5-VL-72B0.00.00.0774.20.00.00.0198.00.00.00.0426.00.00.00.1698.5
Qwen3-VL-2B2.110.70.1273.00.01.40.0096.60.86.80.0636.55.724.00.3185.8
Qwen3-VL-8B3.417.30.1873.70.23.40.0198.62.511.00.1024.07.537.50.4298.5
Qwen3-VL-8B1.09.10.0967.30.03.70.0397.70.27.00.0531.82.816.80.2072.5
Qwen3-VL-A22B7.817.50.1991.80.35.40.0199.26.512.20.1280.216.635.00.4396.0
Qwen3-VL-A22B2.113.60.1787.30.14.20.0398.00.910.20.1166.85.326.20.3797.2
Qwen3.5-A3B5.616.20.2076.50.25.10.0299.75.311.50.1230.011.232.00.4599.8
Qwen3.5-A17B9.722.60.2288.30.59.10.0299.79.217.50.1367.219.441.30.5098.0
Gemma 4 31B it2.37.00.0470.00.03.10.0172.61.06.50.0374.86.011.20.1062.7
MiniCPM-V 4.50.04.80.0273.80.02.50.0195.20.05.50.0318.00.19.00.0482.5
Molmo 7B-D 09240.00.10.0024.20.00.00.0140.80.00.20.000.00.00.00.0020.5
Molmo 72B 09240.00.30.0034.70.00.50.0028.00.00.50.000.80.00.00.0082.0
Ovis2.6-30B-A3B1.99.00.0968.30.12.00.0189.80.77.50.0613.56.824.50.2579.0
GLM-4.5V 108B1.46.10.0576.80.14.20.031002.06.50.0515.53.39.20.1091.5
Kimi K2.55.027.10.2296.40.111.50.051007.525.80.1990.012.558.50.6095.5
Kimi K2.51.820.30.2294.70.010.20.0599.81.217.50.2085.86.044.80.5793.5
Proprietary Models
GPT-4o0.11.50.0282.00.00.50.0196.50.01.00.0246.80.34.50.0689.0
GPT-50.43.70.0488.10.04.00.0098.20.04.00.0460.51.64.50.1297.5
Seed 1.89.220.60.1694.70.49.20.0399.59.415.80.1780.526.745.00.4299.0
Seed 1.87.417.10.1796.70.48.80.0499.55.814.80.1890.023.336.20.4397.5
Seed 2.0 Pro16.524.50.1895.93.011.00.0399.519.930.80.2292.240.741.50.4393.8
Seed 2.0 Pro15.323.30.2196.62.411.20.0499.817.826.00.2692.239.137.50.4994.5
MiMo-V2-Omni0.48.60.0887.70.06.50.0499.50.28.00.0758.51.59.80.1593.0
Gemini 2.5 Pro0.87.50.0787.50.05.80.0499.50.27.00.0680.52.810.80.1470.2
Gemini 3.1 Pro2.619.50.1593.80.014.00.0599.52.522.50.1884.57.832.20.3293.2
Claude Opus 4.70.410.00.0890.40.04.80.0393.80.19.50.0580.51.421.50.2193.8

OB = Oracle Bone, Br = Bronze, Se = Seal. Bold = best, scores are H-mean (Spot.), Accuracy (Fine./Class.), NED (Pars.).

Mature Scripts (Clerical, Regular, Running, Cursive)

ModelThinkAvg Pars.Avg Class.Cl Pars.Cl Class.Re Pars.Re Class.Ru Pars.Ru Class.Cu Pars.Cu Class.
Open-Source Models
InternVL3.5-8B0.4035.60.411.80.5169.40.3852.90.3035.0
InternVL3.5-A28B0.5658.10.5428.50.6985.50.5663.30.4675.2
Qwen2.5-VL-7B0.4434.80.548.00.6217.00.4236.40.2190.5
Qwen2.5-VL-72B0.4957.20.5918.00.6691.50.4656.60.2686.0
Qwen3-VL-2B0.5735.20.615.50.7111.80.5037.90.4293.0
Qwen3-VL-8B0.6660.90.6932.50.7797.20.6459.10.5681.0
Qwen3-VL-8B0.4945.90.5211.20.6479.70.5153.40.3256.2
Qwen3-VL-A22B0.6664.90.6936.50.7395.50.6668.30.5982.0
Qwen3-VL-A22B0.6560.40.6731.00.7593.50.6562.30.5478.0
Qwen3.5-A3B0.7168.10.7936.80.8184.20.6875.60.5784.2
Qwen3.5-A17B0.7372.20.8152.00.8181.30.6775.30.6689.4
Gemma 4 31B it0.3457.10.379.60.5681.90.3365.00.0984.5
MiniCPM-V 4.50.4044.90.452.80.6187.50.3856.90.1548.8
Molmo 7B-D 09240.0116.90.0170.80.013.00.010.70.010.5
Molmo 72B 09240.009.10.006.80.0116.50.013.20.0012.8
Ovis2.6-30B-A3B0.5339.70.548.50.6377.90.5771.60.4212.2
GLM-4.5V 108B0.4456.60.4511.50.6184.50.4463.30.2381.5
Kimi K2.50.7177.00.7370.20.7878.20.7277.80.6686.0
Kimi K2.50.7072.30.7568.50.7881.70.6065.30.6684.8
Proprietary Models
GPT-4o0.3055.90.3520.50.4783.00.2455.60.1280.5
GPT-50.3862.10.5036.20.5759.60.2178.10.1871.0
Seed 1.80.6969.60.6845.50.7992.70.6971.80.6182.5
Seed 1.80.6771.10.6948.00.7889.20.5773.30.6080.8
Seed 2.0 Pro0.7276.10.7560.80.8182.00.7377.60.6292.2
Seed 2.0 Pro0.7175.30.7661.80.8082.00.6574.30.6689.0
MiMo-V2-Omni0.5662.30.6240.00.7180.70.5873.30.3664.2
Gemini 2.5 Pro0.5356.30.6733.20.7239.60.4959.40.2395.0
Gemini 3.1 Pro0.7073.10.8061.00.8362.70.6671.10.5295.8
Claude Opus 4.70.5066.80.5350.20.6374.40.4456.60.3886.0

Cl = Clerical, Re = Regular, Ru = Running, Cu = Cursive. Bold = best.

Getting Started

1. Setup

git clone https://github.com/VirtualLUOUCAS/Chronicles-OCR.git cd Chronicles-OCR pip install -r requirements.txt

2. Download Data

Download and place the benchmark data under data/:

data/
├── Chronicles_OCR.jsonl
└── images/
    ├── 甲骨文/    # Oracle Bone
    ├── 金文/      # Bronze Script
    ├── 篆书/      # Seal Script
    ├── 隶书/      # Clerical Script
    ├── 楷书/      # Regular Script
    ├── 行书/      # Running Script
    └── 草书/      # Cursive Script

3. Inference

# OpenAI-compatible API python infer.py --api_type openai_compat \ --model_name Qwen2.5-VL-7B-Instruct \ --base_url http://127.0.0.1:8000/v1 \ --api_key EMPTY --max_workers 64 # Local vLLM python infer.py --api_type local_vllm \ --model_path /path/to/model \ --tensor_parallel_size 1 --max_model_len 32768

4. Judging (Rule-based)

python judge.py # all models python judge.py --models model_a # specific model

5. Summary Report

python summarize.py # → judge_results/results_analysis.xlsx

Citation

@misc{li2026chronicles, title={Chronicles-OCR: A Cross-Temporal Perception Benchmark for the Evolutionary Trajectory of Chinese Characters}, author={Gengluo Li and Shangping Peng and Xingyu Wan and Chengquan Zhang and Hao Feng and Xin Xu and Pian Wu and Bang Li and Zengmao Ding and Yongge Liu and Yipei Ye and Yang Yang and Zhan Shu and Guojun Yan and Zhe Li and Can Ma and Weiping Wang and Yu Zhou and Han Hu}, year={2026}, journal={arXiv preprint arXiv:2605.11960}, url={https://arxiv.org/abs/2605.11960}, }

Acknowledgements

We sincerely acknowledge the Key Laboratory of Oracle Bone Inscription Information Processing at Anyang Normal University and the Palace Museum for their invaluable contributions to data sourcing and expert annotation.

License

This benchmark is released for research purposes only.

关于 About

Chronicles-OCR: A Cross-Temporal Perception Benchmark for the Evolutionary Trajectory of Chinese Characters (Seven Chinese Scripts, 2800 images)

语言 Languages

Python100.0%

提交活跃度 Commit Activity

代码提交热力图
过去 52 周的开发活跃度
4
Total Commits
峰值: 4次/周
Less
More