- From: r12a <ishida@w3.org>
- Date: Thu, 27 Apr 2023 10:38:31 +0100
- To: Internationalization Working Group <public-i18n-core@w3.org>
- Message-ID: <91bbf8dc-6c9f-070b-0596-4cd77cb2fd15@w3.org>
This lists the terms that are defined both in our glossary and in
INFRA. (For action 1251.)
ASCII case-insensitive matching
Links to INFRA but has no embedded definition. No clash, but ours is
more explanatory.
Code point
no link
i18n:
Code point. A code point value represents the position of a character in
a coded character set. For example, the code point for the letter á in
the Unicode coded character set is 225 in decimal, or 0xE1 in
hexadecimal notation. Hexadecimal notation is commonly used for
referring to code points. See also Unicode code point
<https://w3c.github.io/i18n-glossary/#dfn-unicode-code-point>.
INFRA:
A code point is a Unicode code point and is represented as "U+" followed
by four-to-six ASCII upper hex digits
<https://infra.spec.whatwg.org/#ascii-upper-hex-digit>, in the range
U+0000 to U+10FFFF, inclusive. A code point
<https://infra.spec.whatwg.org/#code-point>’s value is its underlying
number.
A code point <https://infra.spec.whatwg.org/#code-point> may be followed
by its name, by its rendered form between parentheses when it is not
U+0028 or U+0029, or by both. Documents using the Infra Standard are
encouraged to follow code points
<https://infra.spec.whatwg.org/#code-point> by their name when they
cannot be rendered or are U+0028 or U+0029; otherwise, follow them by
their rendered form between parentheses, for legibility.
A code point <https://infra.spec.whatwg.org/#code-point>’s name is
defined in Unicode and represented in ASCII uppercase
<https://infra.spec.whatwg.org/#ascii-uppercase>. [UNICODE]
<https://infra.spec.whatwg.org/#biblio-unicode>
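To make the two definitions concrete, here is a rough Python sketch (not
taken from either glossary) that formats a code point the way INFRA
describes: "U+" plus four to six uppercase hex digits, then the Unicode
name, then the rendered form in parentheses unless the code point is
U+0028 or U+0029. The helper name describe_code_point is made up for
illustration, and using isprintable() to decide whether something "can be
rendered" is my own assumption.

```python
import unicodedata


def describe_code_point(cp: int) -> str:
    """Format a code point roughly in the style INFRA recommends."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point must be in the range U+0000 to U+10FFFF")
    parts = [f"U+{cp:04X}"]  # at least four uppercase hex digits, up to six
    name = unicodedata.name(chr(cp), "")  # "" when Unicode assigns no name
    if name:
        parts.append(name)
    # Assumption: treat "can be rendered" as Python's isprintable().
    # U+0028 and U+0029 are never shown in rendered form.
    if cp not in (0x28, 0x29) and chr(cp).isprintable():
        parts.append(f"({chr(cp)})")
    return " ".join(parts)


# The i18n glossary example: the letter á is 225 decimal, 0xE1 hexadecimal.
print(ord("\u00e1"), hex(ord("\u00e1")))   # 225 0xe1
print(describe_code_point(0xE1))   # U+00E1 LATIN SMALL LETTER A WITH ACUTE (á)
print(describe_code_point(0x28))   # U+0028 LEFT PARENTHESIS
```

The i18n wording talks about the number (225 / 0xE1), while INFRA mostly
specifies how to write it down; the sketch shows both views of the same
value.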
Code unit
no link
i18n:
Code unit. The units of data used by a character encoding
<https://w3c.github.io/i18n-glossary/#dfn-character-encoding> to encode
or serialize characters into a programming language or other serialized
form (such as a file). Common code units are 8-, 16-, and 32-bits in
size. On the Web we are mostly concerned with /bytes/, which are
technically "8-bit code units". However, in Javascript a |char| is a
16-bit code unit (related to the UTF-16 encoding of Unicode).
INFRA:
A string is a sequence of unsigned 16-bit integers, also known as code
units. A string <https://infra.spec.whatwg.org/#string> is also known as
a JavaScript string <https://infra.spec.whatwg.org/#string>. Strings
<https://infra.spec.whatwg.org/#string> are denoted by double quotes and
monospace font.
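As an illustration of the difference in scope (ours covers 8-, 16- and
32-bit code units, INFRA only 16-bit ones), here is a small Python sketch
that counts code units per encoding. The function name code_unit_counts is
hypothetical. The UTF-16 count is what I would expect a JavaScript string's
length to report for the same text.

```python
def code_unit_counts(text: str) -> dict[str, int]:
    """Count the code units needed to encode text in three Unicode encodings."""
    return {
        "utf-8": len(text.encode("utf-8")),            # 8-bit code units (bytes)
        "utf-16": len(text.encode("utf-16-le")) // 2,  # 16-bit code units
        "utf-32": len(text.encode("utf-32-le")) // 4,  # 32-bit code units
    }


print(code_unit_counts("\u00e1"))      # {'utf-8': 2, 'utf-16': 1, 'utf-32': 1}
print(code_unit_counts("\U0001F600"))  # {'utf-8': 4, 'utf-16': 2, 'utf-32': 1}
```

The explicit little-endian codecs are used so that Python does not prepend
a byte order mark, which would inflate the counts.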
Scalar value
no link
i18n:
Scalar value, see Unicode scalar value
<https://w3c.github.io/i18n-glossary/#dfn-scalar-value>.
INFRA:
A scalar value is a code point
<https://infra.spec.whatwg.org/#code-point> that is not a surrogate
<https://infra.spec.whatwg.org/#surrogate>.
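The INFRA definition is short enough to express directly. A minimal Python
sketch, assuming the surrogate range U+D800..U+DFFF from the Unicode
glossary; the function name is_scalar_value is my own:

```python
def is_scalar_value(cp: int) -> bool:
    """Return True if cp is a code point that is not a surrogate."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a code point: outside U+0000 to U+10FFFF")
    return not (0xD800 <= cp <= 0xDFFF)


print(is_scalar_value(0xE1))    # True
print(is_scalar_value(0xD800))  # False
print(is_scalar_value(0xE000))  # True

# Only scalar values can be encoded as well-formed UTF-8.
try:
    chr(0xD800).encode("utf-8")
except UnicodeEncodeError:
    print("U+D800 cannot be encoded as UTF-8")
```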
Surrogate code point
link
i18n:
Surrogate code point. Unicode definition
<https://www.unicode.org/glossary/#surrogate_code_point>: "A Unicode
code point in the range U+D800..U+DFFF. Reserved for use by UTF-16,
where a pair of surrogate code units (a high surrogate followed by a low
surrogate) “stand in” for a supplementary code point
<https://w3c.github.io/i18n-glossary/#dfn-supplementary-code-point>."
This term is also defined by [INFRA
<https://w3c.github.io/i18n-glossary/#bib-infra>].
INFRA:
A surrogate is a leading surrogate
<https://infra.spec.whatwg.org/#leading-surrogate> or a trailing
surrogate <https://infra.spec.whatwg.org/#trailing-surrogate>.
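The Unicode wording ("high" and "low" surrogate) and the INFRA wording
("leading" and "trailing" surrogate) describe the same two halves of the
range. A rough Python sketch of how a pair stands in for a supplementary
code point in UTF-16 follows; the function names are hypothetical, and the
sub-ranges U+D800..U+DBFF and U+DC00..U+DFFF are as I understand INFRA to
define them.

```python
def is_leading_surrogate(cp: int) -> bool:
    return 0xD800 <= cp <= 0xDBFF  # Unicode: "high surrogate"


def is_trailing_surrogate(cp: int) -> bool:
    return 0xDC00 <= cp <= 0xDFFF  # Unicode: "low surrogate"


def to_surrogate_pair(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into (leading, trailing) code units."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000                      # 20-bit value
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def from_surrogate_pair(lead: int, trail: int) -> int:
    """Recombine a (leading, trailing) pair into a supplementary code point."""
    if not (is_leading_surrogate(lead) and is_trailing_surrogate(trail)):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((lead - 0xD800) << 10) + (trail - 0xDC00)


lead, trail = to_surrogate_pair(0x1F600)
print(f"U+{lead:04X} U+{trail:04X}")               # U+D83D U+DE00
print(f"U+{from_surrogate_pair(lead, trail):04X}")  # U+1F600
```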
Received on Thursday, 27 April 2023 09:38:36 UTC