[i18n-activity] Request for additional named entities for invisible/ambiguous characters (#1841)

r12a has just created a new issue for https://github.com/w3c/i18n-activity:

== Request for additional named entities for invisible/ambiguous characters ==
This is a request from the W3C i18n WG that the HTML Standard define character entities to cover key invisible/ambiguous Unicode characters.

The following is a list of candidates that we are proposing, including (for convenience) a list of already existing named entities. The latter are marked **in bold**.  Entity names are proposed after the character name, and are based on abbreviations used by Unicode, where they exist.  Lower priority items are italicised.


### Latin 1 Supplement — Latin-1 punctuation and symbols
- **U+00A0 NO-BREAK SPACE**  `&nbsp;`
- **U+00AD SOFT HYPHEN**  `&shy;`

### Combining Diacritical Marks — Grapheme joiner
- U+034F COMBINING GRAPHEME JOINER `&cgj;`

### Arabic — Format character
- U+061C ARABIC LETTER MARK `&alm;`

### Hangul Jamo — Old initial consonants
- _U+115F HANGUL CHOSEONG FILLER_ `&hcf;`

### Hangul Jamo — Medial vowels
- _U+1160 HANGUL JUNGSEONG FILLER_ `&hjf;`

### Ogham — Space
- U+1680 OGHAM SPACE MARK `&osm;`

### Mongolian — Format controls
- U+180B MONGOLIAN FREE VARIATION SELECTOR ONE `&fvs1;`
- U+180C MONGOLIAN FREE VARIATION SELECTOR TWO `&fvs2;`
- U+180D MONGOLIAN FREE VARIATION SELECTOR THREE `&fvs3;`
- U+180E MONGOLIAN VOWEL SEPARATOR `&mvs;`
- U+180F MONGOLIAN FREE VARIATION SELECTOR FOUR `&fvs4;`

### General Punctuation — Spaces
- U+2000 EN QUAD `&nqsp;`
- U+2001 EM QUAD `&mqsp;`
- **U+2002 EN SPACE**  `&ensp;`
- **U+2003 EM SPACE**  `&emsp;`
- **U+2004 THREE-PER-EM SPACE**  `&emsp13;`
- **U+2005 FOUR-PER-EM SPACE**  `&emsp14;`
- U+2006 SIX-PER-EM SPACE `&6msp;`
- **U+2007 FIGURE SPACE**  `&numsp;`
- **U+2008 PUNCTUATION SPACE**  `&puncsp;`
- **U+2009 THIN SPACE**  `&thinsp;` AND `&ThinSpace;`
- **U+200A HAIR SPACE**  `&hairsp;` AND `&VeryThinSpace;` AND part of `&ThickSpace;` (U+205F U+200A)

### General Punctuation — Format character
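- **U+200B ZERO WIDTH SPACE**  `&ZeroWidthSpace;` AND `&NegativeVeryThinSpace;` AND `&NegativeThinSpace;` AND `&NegativeMediumSpace;` AND `&NegativeThickSpace;`
- **U+200C ZERO WIDTH NON-JOINER**  `&zwnj;`
- **U+200D ZERO WIDTH JOINER** `&zwj;`
- **U+200E LEFT-TO-RIGHT MARK** `&lrm;`
- **U+200F RIGHT-TO-LEFT MARK**  `&rlm;`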
- U+2066 LEFT-TO-RIGHT ISOLATE `&lri;`
- U+2067 RIGHT-TO-LEFT ISOLATE `&rli;`
- U+2068 FIRST STRONG ISOLATE `&fsi;`
- U+2069 POP DIRECTIONAL ISOLATE `&pdi;`
- U+202D LEFT-TO-RIGHT OVERRIDE `&lro;`
- U+202E RIGHT-TO-LEFT OVERRIDE `&rlo;`
- **U+2060 WORD JOINER**  `&NoBreak;`
- _U+202A LEFT-TO-RIGHT EMBEDDING_ `&lre;`
- _U+202B RIGHT-TO-LEFT EMBEDDING_ `&rle;`
- _U+202C POP DIRECTIONAL FORMATTING_ `&pdf;`

### General Punctuation — Separators
- U+2028 LINE SEPARATOR `&lsep;`
- U+2029 PARAGRAPH SEPARATOR `&psep;`

### General Punctuation — Space
- U+202F NARROW NO-BREAK SPACE `&nnbsp;`
- **U+205F MEDIUM MATHEMATICAL SPACE**  `&MediumSpace;` AND part of `&ThickSpace;` (U+205F U+200A)

### General Punctuation — Invisible operators
- **U+2061 FUNCTION APPLICATION**  `&af;` AND `&ApplyFunction;`
- **U+2062 INVISIBLE TIMES**  `&it;` AND `&InvisibleTimes;`
- **U+2063 INVISIBLE SEPARATOR**  `&ic;` AND `&InvisibleComma;`
- U+2064 INVISIBLE PLUS `&InvisiblePlus;`
- _U+206D ACTIVATE ARABIC FORM SHAPING_ `&aafs;`

### CJK Symbols And Punctuation — CJK symbols and punctuation
- U+3000 IDEOGRAPHIC SPACE `&idsp;`

### Hangul Compatibility Jamo — Special character
- _U+3164 HANGUL FILLER_  `&hf;`

### Halfwidth And Fullwidth Forms — Halfwidth Hangul variants
- _U+FFA0 HALFWIDTH HANGUL FILLER_  `&hwhf;`

### Shorthand Format Controls — Shorthand format controls
- _U+1BCA0 SHORTHAND FORMAT LETTER OVERLAP_ 
- _U+1BCA1 SHORTHAND FORMAT CONTINUING OVERLAP_
- _U+1BCA2 SHORTHAND FORMAT DOWN STEP_
- _U+1BCA3 SHORTHAND FORMAT UP STEP_

### Musical Symbols — Beams and slurs
- _U+1D173 MUSICAL SYMBOL BEGIN BEAM_
- _U+1D174 MUSICAL SYMBOL END BEAM_
- _U+1D175 MUSICAL SYMBOL BEGIN TIE_
- _U+1D176 MUSICAL SYMBOL END TIE_
- _U+1D177 MUSICAL SYMBOL BEGIN SLUR_
- _U+1D178 MUSICAL SYMBOL END SLUR_
- _U+1D179 MUSICAL SYMBOL BEGIN PHRASE_
- _U+1D17A MUSICAL SYMBOL END PHRASE_

### Emoji Variation Selectors - turns on and off colour
- U+FE0E: VARIATION SELECTOR-15 `&vs15;`
- U+FE0F: VARIATION SELECTOR-16 `&vs16;`

We would also like to have a `&zwsp;` alias in addition to `&ZeroWidthSpace;` for U+200B.
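
For anyone wanting to check which of the code points above already have a named entity, the following is a minimal sketch using the `html.entities.html5` table from Python's standard library, which mirrors the HTML named character reference list. The helper name `names_for` and the sample code points are illustrative choices, not part of the proposal.

```python
from html.entities import html5


def names_for(codepoint: int) -> list[str]:
    """Return the named character references that expand to exactly this code point.

    Only names ending in ';' are kept, so legacy semicolon-less duplicates
    (such as 'nbsp' alongside 'nbsp;') are not listed twice.
    """
    ch = chr(codepoint)
    return sorted(
        name for name, value in html5.items()
        if value == ch and name.endswith(";")
    )


if __name__ == "__main__":
    # Two characters that already have names, two from the proposed list.
    for cp in (0x00A0, 0x200B, 0x2066, 0x202F):
        found = names_for(cp)
        label = ", ".join("&" + n for n in found) or "no named entity"
        print(f"U+{cp:04X}: {label}")
```

Multi-character references such as `&ThickSpace;` (U+205F U+200A) are deliberately not matched by this helper, because it compares against a single code point only.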




## Instructions: 

This follows the process at https://w3c.github.io/i18n-activity/guidelines/review-instructions.html

1. Create the review comment you want to propose by replacing the prompts above these instructions, but **LEAVE ALL THE INSTRUCTIONS INTACT** 

2. **Add one or more t:... labels. These should use ids from specdev to establish a link to that doc.**

2. Set a label to identify the spec: this starts with s: followed by the spec's short name. If you are unable to do that, ask a W3C staff contact to help.

3. Ask the i18n WG to review your comment.

4. After discussion with the i18n WG, raise an issue in the repository of the WG that owns the spec. Use the text above these instructions as the starting point for that comment, but add any suggestions that arose from the i18n WG. In the other WG's repo, add an 'i18n-needs-resolution' label to the new issue. If you think any of the participants in layout requirements task force groups would be interested in following the discussion, add also the appropriate i18n-\*lreq label(s).

5. Delete the text below that says 'url_for_the_issue_raised', then add in its place the URL for the issue you raised in the other WG's repository. Do NOT remove the initial '§ '. Do NOT use \[...](...) notation – you need to delete the placeholder, then paste the URL.

6. Remove the 'pending' label, and add a 'needs-resolution' tag to this tracker issue. 

7. If you added an \*lreq label, add the label 'spec-type-issue', add the corresponding language label, and a label to indicate the relevant typographic feature(s), eg. 'i:line_breaking'. The latter represent categories related to the Language Enablement Index, and all start with i:.

8. Edit this issue to **REMOVE ALL THE INSTRUCTIONS & THE PROPOSED COMMENT**, ie. the line below that is '---' and all the text before it to the very start of the issue.

---


**This is a tracker issue.** Only discuss things here if they are i18n WG internal meta-discussions about the issue. **Contribute to the actual discussion at the following link:**


§ url_for_the_issue_raised


Please view or discuss this issue at https://github.com/w3c/i18n-activity/issues/1841 using your GitHub account


-- 
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config

Received on Thursday, 28 March 2024 12:05:16 UTC