Your comments on the Character Model [C116-C120, C122, C123, C125-128, C174-179, C182, C183]

From: Richard Ishida <ishida@w3.org>
Date: Fri, 16 Jan 2004 08:21:05 -0000
To: <chris@w3.org>
Cc: <www-tag@w3.org>, <www-i18n-comments@w3.org>
Message-ID: <000401c3dc09$b082e2d0$6601a8c0@w3cishida>

Dear Chris,

Many thanks for your comments on the 2nd Last Call version of the Character
Model for the World Wide Web v1.0 [1].  We appreciate the interest you have
taken in this specification.

You can see the comments you submitted, grouped together, at 
http://www.w3.org/International/Group/2002/charmod-lc/SortByOriginator.html#C116
(You can jump to a specific comment in the table by adding its ID to the end
of the URI.)

The following comments were accepted and edits were made along the lines you
suggested. We do not need you to comment on the edits made, but if you wish
to, please reply to us within the next two weeks at
mailto:www-i18n-comments@w3.org and copy w3c-i18n-ig@w3.org.
        C119, C122, C123, C179


PLEASE REVIEW the decisions for the following additional comments and reply
to us within the next two weeks at mailto:www-i18n-comments@w3.org (copying
w3c-i18n-ig@w3.org) to say whether you are satisfied with the decision
taken. 
        C116, C117, C118, C120, C125, C126, C127,
        C128, C174, C175, C176, C177, C178, C182, C183

Information relating to these comments is included below. You will receive
notification of decisions on remaining comments at a later date.  Note, in
particular, that we are still working on C184 and C185.

You can find the latest version of the Character Model at
http://www.w3.org/International/Group/charmod-edit/ . 

Best regards,
Richard Ishida, for the I18N WG




DECISIONS REQUIRING A RESPONSE
==============================

****C116 Chris Lilley
   TAG
   Numbered conformance requirements
     * Comment (received 2002-05-27) -- [829]Comments on charmod from
       Chris
       Numbers for each conformance requirement clause would greatly aid
       referencing them.
     * Decision: We have classified this as editorial, and decided to
       reject it.
     * Rationale: Changes to the document would cause major problems. We
       might at a later stage add numbers if we feel that the document is
       stable enough, but we don't want to commit to it.

    [829] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C117 Chris Lilley
   TAG
   The use, within the spec, of images of
   characters
     * Comment (received 2002-05-27) -- [832]Comments on charmod from
       Chris
       Please at least link to an accessible representation of 'foreign'
       characters rather than merely providing raster images of them. The
       text of this specification does not conform to itself, since it
       uses bytes (pixels) to represent Unicode characters. It's also
       less than optimal wrt WAI guidelines. Appendix B is a lot better.
       But if the concern is to ensure correct rendering on legacy
       browsers, at least provide a link to the actual Unicode sample, as
       characters and markup.
     * Decision: Partially accepted.
     * Rationale for 'Partially accepted': We have carefully reexamined
       the use of images, character numbers (U+...), character names, and
       actual characters, and made some corrections.
       We have based the choice of which means to use in each case on
       the amount of general support for the characters in question
       (Latin-1 has been supported from the start of the Web, whereas
       Plane 2 is not yet widely available anywhere), and on the
       importance of visual, logical, or numerical information for the
       point being made, and have tried to make sure that there are two
       or more means of representation where appropriate.
       We would like to point out that to some extent, we have to deal
       with a bootstrap problem. As an example, both the Unicode Standard
       and the SVG spec use bitmap images as a way to 'ground' one
       technology in another.

    [832] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html


****C118 Chris Lilley
   TAG
   XML 1.0 and 1.1 are non conforming
     * Comment (received 2002-05-27) -- [835]Comments on charmod from
       Chris
       Much of this document is a statement of existing good design
       practice. Many existing W3C specifications implement large parts
       of it. This is good. Care should be taken with MUSTs which make
       W3C Recs non-conforming. For example, XML 1.0 and 1.1 are non
       conforming.
     * Decision: Partially accepted.
     * Decision: Attempt to clarify terminology such as 'conforming';
       Improve text about code points in section 3.5.
     * Rationale for 'Partially accepted': We have attempted to clarify
       terminology such as "conforming" (i.e. to indicate that
       preexisting technology only 'SHOULD' conform even where new
       technology 'MUST'). This is now to some extent obsolete, because
       the application of Charmod to other specs will not be defined by
       Charmod itself, but rather (we hope) by a TAG finding.
       We have improved text in various instances where we thought that
       there might be a problem. We never had the intention to make XML
       1.0 or XML 1.1 non-conforming. We would be very glad to reexamine
       and fix any specific instance where you think that we (still) are
       saying that XML is not conforming if you can point out such
       specific instances to us.
       On the other hand, we wrote Charmod so that it not only applies to
       XML, but also to other, potentially new formats. We therefore
       tried to make sure to indicate best practice for such cases even
       if these might not always be exactly the same as what XML (to
       quite some extent for historical reasons) is doing. A typical
       example would be the use of both decimal and hexadecimal escape
       syntaxes in HTML and XML.

    [835] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html
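
To illustrate the decimal and hexadecimal escape syntaxes mentioned in the
C118 rationale, here is a minimal sketch. The choice of Python and of its
standard html module is an assumption made for illustration; neither is
prescribed by the Character Model. It shows that both forms of a numeric
character reference resolve to the same character.

```python
import html

# Decimal and hexadecimal numeric character references for U+00E9
decimal_form = html.unescape("&#233;")
hex_form = html.unescape("&#xE9;")

# Both escape syntaxes resolve to the same single character
both_match = decimal_form == hex_form == "\u00e9"
print(both_match)
```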


****C120 Chris Lilley
   TAG
   [840]3.1.5 Remove parts dealing with collation
   and sorting
     * Comment (received 2002-05-27) -- [842]Comments on charmod from
       Chris
       The portions about collation and sorting (for example 3.1.5 Units
       of collation) are sparse, vague, and anecdotal which contrasts
       strangely with the MUSTs; this section should be removed and
       returned for further work to produce a separate architectural
       specification on collation that has crisp, well thought out
       conformance criteria. The maturity of the collation parts does not
       match that of the 'character 101', normalization and URI reference
       parts.
     * Decision: Partially accepted.
     * Rationale for 'Partially accepted': We have modified the normative
       statements (changing from 'MUST' to 'SHOULD' and some wording
       changes). We disagree that the section on collation/sorting does
       not match the maturity of the other sections.
       In the context of Section 3.1, Perceptions of Characters, the fact
       that units of collation are different from other units, and the
       various issues, are important and well established. The text as
       well as the examples have been carefully chosen to show the range
       of phenomena. We do not see the need for a separate architectural
       document on collation and related issues; there are already an ISO
       standard and a Unicode Technical Standard, as well as many
       implementations, for user-oriented sorting/collation.


    [840] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-CollationUnits
    [842] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html


****C125 [869]Chris Lilley
   TAG
   [870]3.6.3 contradictory
     * Comment (received 2002-05-27) -- [872]Comments on charmod from
       Chris
       '[S] Specifications SHOULD NOT provide mechanisms for agreement on
       private use code points between parties and MUST NOT require the
       use of such mechanisms. '
       An svg glyph with a unicode='&#xFE00;': is that a private agreement
       (and hence in contravention)? If you disallow it, though, you
       break the following
       '[S] [I] Specifications and implementations SHOULD be designed in
       such a way as to not disallow the use of private use code points
       by private arrangement.'
       and in practice, disallowing it would merely encourage mapping
       glyphs to the ASCII code range whereas they should use the correct
       Unicode code point or, if none, the PUA. Related point, avoid
       using character mechanisms for things that are not characters
       ('pi' fonts). Use small inline graphics instead.
     * Decision: Accepted.
       We agree with your concern about e.g. an svg glyph with an
       attribute unicode="&#xFE00;". We have changed the text somewhat;
       please check. However, we would like to point out that this svg
       mechanism is not designed for agreement on private use characters,
       it is designed for rendering of characters in general. It can be
       used for *rendering* of private-use characters, which may be
       appropriate or necessary in some cases.
       It could also be misused to completely change the rendering of
       some text (in the case of Chinese or Japanese easily to an extent
       that would completely change the meaning of the visually appearing
       text). While the use for private use characters could be checked,
       the use for completely changing the rendering could obviously not
       be checked by an SVG implementation.

    [869] mailto:chris@w3.org
    [870] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-PrivateUse
    [872] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html
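
As a sketch of how the check for private use characters mentioned in the
C125 decision might be done, the following Python function tests a code
point's Unicode general category. The function name is_private_use is
hypothetical, and the use of Python's unicodedata module is an assumption
made for illustration, not something an SVG implementation is known to do.

```python
import unicodedata

def is_private_use(code_point: int) -> bool:
    """Return True if the code point has general category Co (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"

# Start and end of the BMP Private Use Area, and a letter for contrast
print(is_private_use(0xE000), is_private_use(0xF8FF), is_private_use(0x0041))
```

By this check U+FE00, the code point used in the example above, is not
reported as private use: it is a variation selector. The Private Use Area
of the BMP runs from U+E000 to U+F8FF.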


****C126 Chris Lilley
   TAG
   [874]3.7 Should XML allow NCRs everywhere?
     * Comment (received 2002-05-27) -- [876]Comments on charmod from
       Chris
       '[S] Escaped characters SHOULD be acceptable wherever unescaped
       characters are; this does not preclude that a syntax-significant
       character, when escaped, loses its significance in the syntax. In
       particular, escaped characters SHOULD be acceptable in identifiers
       and comments.'
       XML should allow NCRs everywhere, for example inside element and
       attribute names?
     * We have classified this as "Not applicable", because it was a
       question.
       Our answer is: Yes, in an ideal world, or if we ever got to redo
       XML, it would be preferable to allow NCRs e.g. in element and
       attribute names, because this leads to a more clearly layered
       encoding model. Indeed the I18N WG at one time was in contact with
       Jon Bosak and others (including members of the respective ISO
       committee) to investigate the possibility of such a change. As
       explained under #C118, this does not mean that XML is
       non-conformant, nor that it should be changed. But it is important
       to note this experience for any new formats. We would also like to
       note that CSS and Java do it this way.

    [874] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-Escaping
    [876] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html


****C127 Chris Lilley
   TAG
   [878]8 Say that the IRI form is used in the
   document instance and the hexified URI form when it goes over the wire
     * Comment (received 2002-05-27) -- [880]Comments on charmod from
       Chris
       '[S] W3C specifications MUST define when the conversion from IRI
       references to URI references (or subsets thereof) takes place, in
       accordance with Internationalized Resource Identifiers (IRI) [I-D
       IRI].'
       Why not go further and say that the IRI form is used in the
       document instance and the hexified URI form when it goes over the
       wire? It would be bad if different XML namespaces defined
       different processing here.
     * Decision: Rejected.
     * Rationale: We do not want to preclude the direct use of IRIs by
       wire protocols. Whether to use URIs or IRIs is defined by the wire
       protocol in question. HTTP currently specifies the use of URIs; a
       new version of HTTP (if ever needed) or some other protocol may
       use IRIs. Similar considerations apply to document formats: some
       document formats may allow IRIs in some 'slots', whereas others
       don't.

    [878] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-URIs
    [880] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html
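
To make the "hexified URI form" discussed under C127 concrete, here is a
minimal sketch of the conversion for the path component only: each
non-ASCII character is encoded as UTF-8 and its bytes are percent-encoded.
The function name iri_path_to_uri_path is hypothetical, and the sketch
leaves out the rest of a full IRI-to-URI conversion (for example, handling
of the host name).

```python
from urllib.parse import quote

def iri_path_to_uri_path(path: str) -> str:
    """Percent-encode the UTF-8 bytes of characters not allowed in a URI path.

    "/" and "%" are kept as-is so that path separators and escapes that
    are already present are not encoded a second time.
    """
    return quote(path, safe="/%", encoding="utf-8")

print(iri_path_to_uri_path("/r\u00e9sum\u00e9"))
```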


****C128 Chris Lilley
   TAG
   [882]9 Referencing the Unicode Standard and
   ISO/IEC 10646
     * Comment (received 2002-05-27) -- [884]Comments on charmod from
       Chris
       'Conformance to Unicode implies conformance to ISO/IEC 10646, see
       [Unicode 3.0] Appendix C.
       [S] Since specifications in general need both a definition for
       their characters and the semantics associated with these
       characters, specifications SHOULD include a reference to the
       Unicode Standard, whether or not they include a reference to
       ISO/IEC 10646. By providing a reference to The Unicode Standard
       implementers can benefit from the wealth of information provided
       in the standard and on the Unicode Consortium Web site.'
       That is a bit weak. Say explicitly that a reference to 10646
       without a reference to Unicode implies no character semantics, no
       bidi processing no character case information etc etc. Also, since
       one is a strict superset of the other, provide a rationale why a
       specification should ever provide a reference to 10646 since a
       reference to Unicode exactly covers the same CCS?
     * Decision: Rejected.
     * Rationale: The current language is the result of careful
       deliberation and compromise. The situation is not as simple as you
       describe it. ISO 10646 and Unicode are each as good as the other
       at giving the "LATIN SMALL LETTER A" the semantics of 'latin small
       letter a'. Also, ISO 10646 actually contains a normative reference
       to Unicode's bidi algorithm, and to some other parts of Unicode.

    [882] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-RefUnicode
    [884] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html
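
As an illustration of the kinds of character information at issue in C128
(names, bidi classes, case), the following sketch looks up a few
properties from the Unicode Character Database as exposed by Python's
unicodedata module. The helper name describe is hypothetical, and the
sketch takes no position on which standard a specification should
reference.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Collect a few Unicode properties for a single character."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "bidi_class": unicodedata.bidirectional(ch),
        "uppercase": ch.upper(),
    }

latin_a = describe("a")
hebrew_alef = describe("\u05d0")
print(latin_a)
print(hebrew_alef)
```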


****C175 Chris Lilley
   TAG
   [1174]3.1.3 Units of visual rendering
     * Comment (received 2002-05-27) -- [1176]Comments on charmod from
       Chris
       'Logical selection looks like this:'
       There should be a requirement after that [...]
       [S][I] Specifications of protocols and APIs that involve selection
       of ranges MUST provide for text selection in logical selection
       mode.
     * Decision: Rejected.
     * Rationale: We think that this is already covered by:
       [S] Protocols, data formats and APIs MUST store, interchange or
       process text data in logical order.

    [1174]
http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-VisualRenderingUnits
    [1176] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html


****C176 Chris Lilley
   TAG
   [1178]3.1.3 Units of visual rendering
     * Comment (received 2002-05-27) -- [1180]Comments on charmod from
       Chris
       Also, should there not be something about copying that selection
       and pasting it somewhere else, that what you get is the logical
       selection?
     * Decision: Rejected.
     * Rationale: We rejected this comment. What if the paste is into a
       bitmap editor? Also, if it's a visual selection, then
       copying/pasting should paste the characters selected in the visual
       selection, rather than those in a corresponding logical selection
       between the same end points.

    [1178]
http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-VisualRenderingUnits
    [1180] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html


****C177 Chris Lilley
   TAG
   [1182]3.1.3 Units of visual rendering
     * Comment (received 2002-05-27) -- [1184]Comments on charmod from
       Chris
       Similarly in the next part, I suggest rewording to remove the
       ambiguous phrase:
       [S] Specifications of protocols and APIs that involve selection of
       ranges SHOULD provide for text selection in logical selection
       mode, at least to the extent necessary to support implementation
       of visual selection on screen on top of those protocols and APIs.
     * Decision: Rejected.
     * Rationale: The original paragraph is about visual selection, not
       logical. Visual selection requires discontiguous logical ranges
       and the requirement is for protocols and APIs to provide the
       latter.

    [1182]
http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-VisualRenderingUnits
    [1184] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html


****C178 Chris Lilley
   TAG
   [1186]3.1.3 Units of visual rendering
     * Comment (received 2002-05-27) -- [1188]Comments on charmod from
       Chris
       It's not clear that this is such a strong requirement and it
       complicates processing, especially on handheld devices. Perhaps
       weaken to MAY? And say what happens when this funky visual
       selection gets copied and pasted - do you get a set of separate
       logical selections (if so how delimited)? A single visually
       ordered selection (yuk)? Something else?
       Otherwise, the weaker requirement for contiguous visual selection
       is likely to merely encourage the use of visual storage or the
       disposal of logical storage once the visual result has been
       generated. Which would lead to text copied from visually contiguous
       (logically discontiguous) selections being stored in visual order.
       Which is to be avoided.
     * Decision: Rejected.
     * Rationale: First, visual storage and visual selection are
       independent of each other. We think it's important that protocols
       and APIs SHOULD support discontiguous logical ranges so that
       implementations MAY implement visual selection if they wish. This
       is in particular relevant for technologies such as XPointer. We do
       not think that this will lead to the use of visual ordering inside
       the selection. In situations such as cut/paste without special
       support, the visual selection is usually copied as a sequence of
       segments, all internally in logical order. The sequence of
       segments and other things may be implementation-dependent, and in
       advanced applications, the overall result may depend on where the
       insertion is made.

    [1186]
http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-VisualRenderingUnits
    [1188] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html
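
The relationship between a contiguous visual selection and discontiguous
logical ranges, discussed under C177 and C178, can be sketched as follows.
This is an illustration under stated assumptions, not a bidi
implementation: the mapping visual_to_logical is supplied by hand, the
function name logical_ranges is hypothetical, and uppercase letters stand
in for right-to-left characters.

```python
def logical_ranges(visual_to_logical, start, end):
    """Map visual positions start..end-1 to a list of logical (start, end) ranges."""
    indices = sorted(visual_to_logical[start:end])
    ranges = []
    for i in indices:
        if ranges and i == ranges[-1][1]:
            # Extend the current range when the index is adjacent to it
            ranges[-1] = (ranges[-1][0], i + 1)
        else:
            ranges.append((i, i + 1))
    return ranges

# Logical order: "abc DEF ghi"; displayed as: "abc FED ghi"
logical_text = "abc DEF ghi"
visual_to_logical = [0, 1, 2, 3, 6, 5, 4, 7, 8, 9, 10]

# Select four adjacent positions on screen: "c", " ", "F", "E"
selected = logical_ranges(visual_to_logical, 2, 6)
segments = [logical_text[a:b] for a, b in selected]
print(selected, segments)
```

Each segment is kept in logical order internally, which matches the
cut/paste behaviour described in the rationale; how the segments are
ordered or delimited on paste is left open here, as it is in the text.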


****C182 Chris Lilley
   TAG
   [1207]3.6.2 Character encoding identification
     * Comment (received 2002-05-27) -- [1209]Comments on charmod from
       Chris
       '[S] If the unique encoding approach is not chosen, specifications
       MUST designate at least one of the UTF-8 and UTF-16 encoding forms
       of Unicode as admissible encodings and SHOULD choose at least one
       of UTF-8 or UTF-16 as mandated encoding forms (encoding forms that
       MUST be supported by implementations of the specification).'
       Does that mean that, for example, saying UTF-8 is allowed and
       UTF-16 is disallowed and an encoding declaration is not required,
       is okay?
     * Answer: Yes.
     * Decision: We have classified this as "Not applicable", because it
       was a question.
       Our answer is "yes". This should be understood in light of our
       comments to [1210]C118. It is not meant to change the rules of
       specific existing formats or protocols, but to give guidance to
       new formats or protocols.

    [1207] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-EncodingIdent
    [1209] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html
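
As a small illustration of why the choice of encoding form matters in
C182, the sketch below encodes one character in both UTF-8 and UTF-16
(big-endian, without a byte order mark) and then decodes the UTF-8 bytes
as if they were UTF-16. The variable names are illustrative only.

```python
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-be")

# Reading UTF-8 bytes as UTF-16 succeeds silently but yields another character
misread = utf8_bytes.decode("utf-16-be")
print(utf8_bytes, utf16_bytes, misread == text)
```

The same two bytes are valid in both encoding forms, so a format that
admits more than one encoding needs some way to identify which one is in
use, while a format that fixes a single encoding does not.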


****C183 Chris Lilley
   TAG
   [1212]3.6.2 Character encoding identification
     * Comment (received 2002-05-27) -- [1214]Comments on charmod from
       Chris
       Needs a little more on encodings that are a group of similar but
       not identical encodings, for example shift-jis.
     * Decision: Rejected.
     * Rationale: We have rejected this comment, because this is already
       mentioned. But as a result of other editing, the relevant note is
       now in a very prominent position just after the opening paragraph.
       If you think this is not enough, please provide concrete
       suggestions on what you think is missing.

    [1212] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-EncodingIdent
    [1214] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html




USEFUL LINKS
==============
[1] The version of CharMod you commented on: 
http://www.w3.org/TR/2002/WD-charmod-20020430/
[2] Latest editor's version (still being edited): 
http://www.w3.org/International/Group/charmod-edit/
[3] Last Call comments table, sorted by ID: 
http://www.w3.org/International/Group/2002/charmod-lc/
Received on Friday, 16 January 2004 03:21:08 GMT