arXiv Open Access 2025

Quantifying patterns of punctuation in modern Chinese prose

Michał Dolina, Jakub Dec, Stanisław Drożdż, Jarosław Kwapień, Jin Liu, Tomasz Stanisz

Abstract

Recent research shows that punctuation patterns in texts exhibit universal features across languages. Analysis of Western classical literature reveals that the distribution of spaces between punctuation marks aligns with a discrete Weibull distribution, typically used in survival analysis. By extending this analysis to Chinese literature represented here by three notable contemporary works, it is shown that Zipf's law applies to Chinese texts similarly to Western texts, where punctuation patterns also improve adherence to the law. Additionally, the distance distribution between punctuation marks in Chinese texts follows the Weibull model, though larger spacing is less frequent than in English translations. Sentence-ending punctuation, representing sentence length, diverges more from this pattern, reflecting greater flexibility in sentence length. This variability supports the formation of complex, multifractal sentence structures, particularly evident in Gao Xingjian's "Soul Mountain". These findings demonstrate that both Chinese and Western texts share universal punctuation and word distribution patterns, underscoring their broad applicability across languages.
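The distance measurement and the discrete Weibull model mentioned in the abstract can be illustrated with a minimal sketch. It assumes the text has already been segmented into word and punctuation tokens, that distance is counted in words, and that the distribution takes one common parameterization, P(k) = (1-p)^((k-1)^β) - (1-p)^(k^β) for k = 1, 2, ...; the abstract states none of these details. The names `PUNCTUATION`, `punctuation_distances`, and `discrete_weibull_pmf` are hypothetical and not taken from the paper.

```python
# Hypothetical punctuation set; the abstract does not list the marks the authors used.
PUNCTUATION = set("，。！？；：、,.!?;:")


def punctuation_distances(tokens):
    """Count the words between consecutive punctuation marks.

    `tokens` is a pre-segmented list of words and punctuation marks.
    Words before the first mark count as one interval; words after the
    last mark are dropped because that interval is not closed.
    """
    distances = []
    count = 0
    for tok in tokens:
        if tok in PUNCTUATION:
            if count > 0:  # skip empty intervals between adjacent marks
                distances.append(count)
            count = 0
        else:
            count += 1
    return distances


def discrete_weibull_pmf(k, p, beta):
    """P(X = k) for k = 1, 2, ... under the assumed parameterization.

    With beta = 1 this reduces to a geometric distribution with
    success probability p.
    """
    q = 1.0 - p
    return q ** ((k - 1) ** beta) - q ** (k ** beta)
```

For example, `punctuation_distances(["我", "来", "，", "他", "走", "了", "。"])` yields the intervals 2 and 3. Fitting `p` and `beta` to an empirical histogram of such distances would be a separate step that this sketch does not cover.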

Authors (6)

Michał Dolina
Jakub Dec
Stanisław Drożdż
Jarosław Kwapień
Jin Liu
Tomasz Stanisz

Citation Format

Dolina, M., Dec, J., Drożdż, S., Kwapień, J., Liu, J., Stanisz, T. (2025). Quantifying patterns of punctuation in modern Chinese prose. https://arxiv.org/abs/2503.04449

Journal Information
Publication Year: 2025
Language: en
Source Database: arXiv
Access: Open Access