Re: [clreq] Expand the content of section 4.1 (#636) from Fuqiao Xue via GitHub on 2025-05-22 (public-i18n-archive@w3.org from April to June 2025)

From: Fuqiao Xue via GitHub <sysbot+gh@w3.org>
Date: Thu, 22 May 2025 02:18:19 +0000
To: public-i18n-archive@w3.org
Message-ID: <issue_comment.created-2899710629-1747880296-sysbot+gh@w3.org>

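The Unicode Standard provides a unified encoding scheme for the world's writing systems, but Han characters, as one of the most complex scripts in the world, pose a number of distinctive technical challenges in Unicode.

The CJK (Chinese, Japanese, Korean) Unified Ideographs in the Unicode Standard are at the core of the Unicode issues for Chinese. To conserve code space, Unicode assigns a single code point to Han characters of similar shape used in Chinese, Japanese, and Korean. This design is known as "Han unification".

Han unification saves code space, but a Han character at a single Unicode code point may have subtle glyph differences from one language to another. For example, U+8FD4 (返) is written slightly differently in Simplified Chinese, Traditional Chinese, Japanese, and Korean.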
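As a rough illustration of the point above, the following Python sketch shows that the code point itself carries no language information, so the intended language has to be supplied out of band. The helper names `describe` and `tag_with_language` are hypothetical, and the use of an HTML `lang` attribute is only one possible way to convey the language; which glyph is actually drawn still depends on the fonts available.

```python
import unicodedata

CHAR = "\u8fd4"  # 返, the example character from the text

def describe(ch: str) -> str:
    """Return the code point and Unicode character name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

def tag_with_language(ch: str, lang: str) -> str:
    """Wrap the character in an HTML span carrying a BCP 47 language tag.

    Hypothetical helper: the string is identical for every language,
    only the surrounding markup differs.
    """
    return f'<span lang="{lang}">{ch}</span>'

# The same code point is used whatever the language of the text.
print(describe(CHAR))

# Language tags for the four contexts mentioned in the text.
for lang in ["zh-Hans", "zh-Hant", "ja", "ko"]:
    print(tag_with_language(CHAR, lang))
```

A renderer that honours the language tag can then pick a font with the regional glyph; plain text without such metadata gives it nothing to go on.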

(Add a figure here?)

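## Sorting mechanisms for Chinese characters

Chinese characters are mainly sorted in the following ways:

### Pinyin ordering

Pinyin ordering is the most commonly used method for sorting Chinese, but it is complex to implement:

- **Polyphonic characters**: for example, "行" has two readings, "xíng" and "háng", so sorting has to determine from context which reading to use
- **Dialect differences**: pinyin may differ from region to region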
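The polyphonic-character problem can be sketched in Python as follows. This is a minimal illustration, not a real collation implementation: the tables `CHAR_READINGS` and `WORD_READINGS` are tiny hand-written assumptions, readings are written with tone numbers and compared as plain strings, and "context" is reduced to a whole-word lookup. A real system would need a full pronunciation dictionary and word segmentation.

```python
# Hypothetical default reading for each character (tone written as a digit).
CHAR_READINGS = {
    "行": "xing2",
    "人": "ren2",
    "银": "yin2",
    "业": "ye4",
}

# Hypothetical word-level overrides where the default reading is wrong.
WORD_READINGS = {
    "银行": ("yin2", "hang2"),
    "行业": ("hang2", "ye4"),
}

def pinyin_key(word: str, use_context: bool = True) -> tuple:
    """Build a sort key from the readings of each character in the word."""
    if use_context and word in WORD_READINGS:
        return WORD_READINGS[word]
    return tuple(CHAR_READINGS[ch] for ch in word)

def sort_by_pinyin(words, use_context: bool = True):
    """Sort words by their reading-based keys."""
    return sorted(words, key=lambda w: pinyin_key(w, use_context))

words = ["银行", "行人", "行业"]
print(sort_by_pinyin(words))                     # with word-level readings
print(sort_by_pinyin(words, use_context=False))  # character defaults only
```

With the word-level table, "行业" sorts under "háng" and comes before "行人"; with character defaults only, both words are keyed under "xíng" and "行业" lands after "行人".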

### Stroke-based ordering

(Does this need to be mentioned?)

### Unicode Collation Algorithm (UCA)

(Does this need to be mentioned?)

## Variation selectors

(Haven't worked out how to write this yet)

## Legacy issues in Han character encoding

### Impact of traditional encoding systems

(GB-series encodings)

(Big5 encoding)


-- 
GitHub Notification of comment by xfq
Please view or discuss this issue at https://github.com/w3c/clreq/issues/636#issuecomment-2899710629 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Thursday, 22 May 2025 02:18:19 UTC