<character> property datatype

A number of properties are typed as having <character> values: "character",
"grouping-separator", and "hyphenation-character".

Section 5.11 describes <character> as a single Unicode character.

However, the property description for fo:character embellishes this rather
terse description, and says that a <character> specifies "the code point of
the Unicode character to be presented". To me this pretty clearly means a
specification of the form U+xxxx.

For the other two properties no such distinction is made; we are left with
the idea that the Unicode character itself, as opposed to its code point (or
code value; in other words, the integer), will be used. That is, if someone
wished to use a character whose UTF-8 encoding is 3 octets, that would
seemingly be OK.
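
To make the distinction concrete, here is a minimal Python sketch (not taken
from the spec) showing one character as an integer code point, as U+xxxx
notation, and as UTF-8 octets. The euro sign is an arbitrary example chosen
because its UTF-8 encoding happens to be 3 octets; the variable names are
purely illustrative.

```python
# Illustration only: one Unicode character viewed three ways.
ch = "\u20ac"  # EURO SIGN, an arbitrary example character

code_point = ord(ch)               # the integer code point (code value)
notation = f"U+{code_point:04X}"   # the U+xxxx form
utf8_octets = ch.encode("utf-8")   # the character as it would sit in a UTF-8 file

print(notation)           # U+20AC
print(len(utf8_octets))   # 3
print(utf8_octets.hex())  # e282ac
```

Both forms identify the same character; the open question is only which
form the property value is expected to take.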

"grouping-separator" is defined with reference to XSLT, where it is a single
instance of the XML 'Char' production, that is, a Unicode character, either
UTF-8 or UTF-16 encoded (at a minimum), matching #x9 | #xA | #xD |
[#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF].
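
As a rough illustration of what "a single instance of the 'Char' production"
amounts to, the ranges quoted above can be checked as follows. This is only a
sketch in Python; the function name is_xml_char is hypothetical and is not
part of XSLT, XSL-FO, or any implementation.

```python
def is_xml_char(s: str) -> bool:
    """Return True if s is exactly one code point allowed by XML 1.0 'Char'."""
    if len(s) != 1:  # must be a single character, not a string of several
        return False
    cp = ord(s)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


print(is_xml_char(","))       # True: an ordinary grouping separator
print(is_xml_char("\x00"))    # False: NUL is outside the production
print(is_xml_char("\ufffe"))  # False: U+FFFE is excluded
```

Note that the check is on the decoded character, so it gives the same answer
whether the source document was UTF-8 or UTF-16 encoded.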

So our question (mine and Eric Bischoff's) is: what have other implementors
elected to use?

Thanks for any input.

Regards,
AHS

______________________________
Arved Sandstrom
Sr Software Developer
Platform Products Group
Halifax R&D Office
Hummingbird Ltd

Received on Monday, 29 July 2002 18:15:34 UTC