- From: Richard Ishida <ishida@w3.org>
- Date: Fri, 16 Jan 2004 08:21:05 -0000
- To: <chris@w3.org>
- Cc: <www-tag@w3.org>, <www-i18n-comments@w3.org>
Dear Chris,

Many thanks for your comments on the 2nd Last Call version of the Character Model for the World Wide Web v1.0 [1]. We appreciate the interest you have taken in this specification.

You can see the comments you submitted, grouped together, at
http://www.w3.org/International/Group/2002/charmod-lc/SortByOriginator.html#C116
(You can jump to a specific comment in the table by adding its ID to the end of the URI.)

The following comments were accepted and edits were made along the lines you suggested. We do not need you to comment on the edits made, but if you wish to, please reply to us within the next two weeks at mailto:www-i18n-comments@w3.org and copy w3c-i18n-ig@w3.org.

C119, C122, C123, C179

PLEASE REVIEW the decisions for the following additional comments and reply to us within the next two weeks at mailto:www-i18n-comments@w3.org (copying w3c-i18n-ig@w3.org) to say whether you are satisfied with the decision taken.

C116, C117, C118, C120, C125, C126, C127, C128, C174, C175, C176, C177, C178, C182, C183

Information relating to these comments is included below.

You will receive notification of decisions on remaining comments at a later date. Note, in particular, that we are still working on C184 and C185.

You can find the latest version of the Character Model at http://www.w3.org/International/Group/charmod-edit/ .

Best regards,
Richard Ishida, for the I18N WG

DECISIONS REQUIRING A RESPONSE
==============================

****C116 Chris Lilley TAG
Numbered conformance requirements

* Comment (received 2002-05-27) -- [829]Comments on charmod from Chris

Numbers for each conformance requirements clause would greatly aid referencing them.

* Decision: We have classified this as editorial, and decided to reject it.

* Rationale: Changes to the document would cause major problems. We might at a later stage add numbers if we feel that the document is stable enough, but we don't want to commit to it.
[829] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C117 Chris Lilley TAG
The use, within the spec, of images of characters

* Comment (received 2002-05-27) -- [832]Comments on charmod from Chris

Please at least link to an accessible representation of 'foreign' characters rather than merely providing raster images of them. The text of this specification does not conform to itself, since it uses bytes (pixels) to represent Unicode characters. It's also less than optimal wrt WAI guidelines. Appendix B is a lot better. But if the concern is to ensure correct rendering on legacy browsers, at least provide a link to the actual Unicode sample, as characters and markup.

* Decision: Partially accepted.

* Rationale for 'Partially accepted': We have carefully reexamined the use of images, character numbers (U+...), character names, and actual characters, and made some corrections. We have based the choice of which means to use in each case on the amount of general support for the characters in question (Latin-1 being supported from the start of the Web, whereas Plane 2 is not yet widely available anywhere), and on the importance of visual, logical, or numerical information for the point being made, and have tried to make sure that there are two or more means of representation where appropriate. We would like to point out that to some extent, we have to deal with a bootstrap problem. As an example, both the Unicode Standard and the SVG spec use bitmap images as a way to 'ground' one technology in another.

[832] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C118 Chris Lilley TAG
XML 1.0 and 1.1 are non conforming

* Comment (received 2002-05-27) -- [835]Comments on charmod from Chris

Much of this document is a statement of existing good design practice. Many existing W3C specifications implement large parts of it. This is good. Care should be taken with MUSTs which make W3C Recs non-conforming.
For example, XML 1.0 and 1.1 are non conforming.

* Decision: Partially accepted.

* Decision: Attempt to clarify terminology such as 'conforming'; improve text about code points in section 3.5.

* Rationale for 'Partially accepted': We have attempted to clarify terminology such as "conforming" (i.e. to indicate that preexisting technology only 'SHOULD' conform even when new technology 'MUST'; but this is now to some extent obsolete, because the application of Charmod to other specs will not be defined by Charmod itself, but rather by a TAG finding (we hope)). We have improved text in various instances where we thought that there might be a problem. We never had the intention to make XML 1.0 or XML 1.1 non-conforming. We would be very glad to reexamine and fix any specific instance where you think that we (still) are saying that XML is not conforming, if you can point out such specific instances to us. On the other hand, we wrote Charmod so that it applies not only to XML, but also to other, potentially new formats. We therefore tried to make sure to indicate best practice for such cases even if these might not always be exactly the same as what XML (to quite some extent for historical reasons) is doing. A typical example would be the use of both decimal and hexadecimal escape syntaxes in HTML and XML.

[835] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C120 Chris Lilley TAG [840]3.1.5
Remove parts dealing with collation and sorting

* Comment (received 2002-05-27) -- [842]Comments on charmod from Chris

The portions about collation and sorting (for example 3.1.5 Units of collation) are sparse, vague, and anecdotal, which contrasts strangely with the MUSTs; this section should be removed and returned for further work to produce a separate architectural specification on collation that has crisp, well thought out conformance criteria.
The maturity of the collation parts does not match that of the 'character 101', normalization and URI reference parts.

* Decision: Partially accepted.

* Rationale for 'Partially accepted': We have modified the normative statements (changing from 'MUST' to 'SHOULD' and some wording changes). We disagree that the section on collation/sorting does not match the maturity of the other sections. In the context of Section 3.1, Perceptions of Characters, the fact that units of collation are different from other units, and the various issues, are important and well established. The text as well as the examples have been carefully chosen to show the range of phenomena. We do not see the need for a separate architectural document on collation and related issues; there are already an ISO standard and a Unicode Technical Standard, as well as many implementations, for user-oriented sorting/collation.

[840] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-CollationUnits
[842] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C125 [869]Chris Lilley TAG [870]3.6.3
3.6.3 contradictory

* Comment (received 2002-05-27) -- [872]Comments on charmod from Chris

'[S] Specifications SHOULD NOT provide mechanisms for agreement on private use code points between parties and MUST NOT require the use of such mechanisms.'

svg glyph with a unicode='︀' is that a private agreement (and hence in contravention)? If you disallow it, though, you break the following

'[S] [I] Specifications and implementations SHOULD be designed in such a way as to not disallow the use of private use code points by private arrangement.'

and in practice, disallowing it would merely encourage mapping glyphs to the ASCII code range whereas they should use the correct Unicode code point or, if none, the PUA.

Related point, avoid using character mechanisms for things that are not characters ('pi' fonts). Use small inline graphics instead.

* Decision: Accepted.
We agree with your concern about e.g. an svg glyph with an attribute unicode="︀". We have changed the text somewhat; please check. However, we would like to point out that this svg mechanism is not designed for agreement on private use characters; it is designed for rendering of characters in general. It can be used for *rendering* of private-use characters, which may be appropriate or necessary in some cases. It could also be misused to completely change the rendering of some text (in the case of Chinese or Japanese easily to an extent that would completely change the meaning of the visually appearing text). While the use for private use characters could be checked, the use for completely changing the rendering could obviously not be checked by an SVG implementation.

[869] mailto:chris@w3.org
[870] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-PrivateUse
[872] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C126 Chris Lilley TAG [874]3.7
Should XML allow NCRs everywhere?

* Comment (received 2002-05-27) -- [876]Comments on charmod from Chris

'[S] Escaped characters SHOULD be acceptable wherever unescaped characters are; this does not preclude that a syntax-significant character, when escaped, loses its significance in the syntax. In particular, escaped characters SHOULD be acceptable in identifiers and comments.'

Should XML allow NCRs everywhere, for example inside element and attribute names?

* Decision: We have classified this as "Not applicable", because it was a question. Our answer is: Yes, in an ideal world, or if we ever got to redo XML, it would be preferable to allow NCRs e.g. in element and attribute names, because this leads to a more clearly layered encoding model. Indeed the I18N WG at one time was in contact with Jon Bosak and others (including members of the respective ISO committee) to investigate the possibility of such a change. As explained under #C118, this does not mean that XML is non-conformant, nor that it should be changed.
But it is important to note this experience for any new formats. We would also like to note that CSS and Java do it this way.

[874] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-Escaping
[876] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C127 Chris Lilley TAG [878]8
Say that the IRI form is used in the document instance and the hexified URI form when it goes over the wire

* Comment (received 2002-05-27) -- [880]Comments on charmod from Chris

'[S] W3C specifications MUST define when the conversion from IRI references to URI references (or subsets thereof) takes place, in accordance with Internationalized Resource Identifiers (IRI) [I-D IRI].'

Why not go further and say that the IRI form is used in the document instance and the hexified URI form when it goes over the wire? It would be bad if different XML namespaces defined different processing here.

* Decision: Rejected.

* Rationale: We do not want to preclude the direct use of IRIs by wire protocols. Whether to use URIs or IRIs is defined by the wire protocol in question. HTTP currently specifies the use of URIs; a new version of HTTP (if ever needed) or some other protocol may use IRIs. Similar considerations apply to document formats: some document formats may allow IRIs in some 'slots', whereas others don't.

[878] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-URIs
[880] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C128 Chris Lilley TAG [882]9
Referencing the Unicode Standard and ISO/IEC 10646

* Comment (received 2002-05-27) -- [884]Comments on charmod from Chris

'Conformance to Unicode implies conformance to ISO/IEC 10646, see [Unicode 3.0] Appendix C. [S] Since specifications in general need both a definition for their characters and the semantics associated with these characters, specifications SHOULD include a reference to the Unicode Standard, whether or not they include a reference to ISO/IEC 10646.
By providing a reference to The Unicode Standard implementers can benefit from the wealth of information provided in the standard and on the Unicode Consortium Web site.'

That is a bit weak. Say explicitly that a reference to 10646 without a reference to Unicode implies no character semantics, no bidi processing, no character case information, etc. Also, since one is a strict superset of the other, provide a rationale why a specification should ever provide a reference to 10646, since a reference to Unicode covers exactly the same CCS.

* Decision: Rejected.

* Rationale: The current language is the result of careful deliberation and compromise. The situation is not as simple as you describe it. ISO 10646 and Unicode are each as good as the other at giving the "LATIN SMALL LETTER A" the semantics of 'latin small letter a'. Also, ISO 10646 actually contains a normative reference to Unicode's bidi algorithm, and some other stuff in Unicode.

[882] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-RefUnicode
[884] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C175 Chris Lilley TAG [1174]3.1.3
Units of visual rendering

* Comment (received 2002-05-27) -- [1176]Comments on charmod from Chris

'Logical selection looks like this:'

There should be a requirement after that [...]

[S][I] Specifications of protocols and APIs that involve selection of ranges MUST provide for text selection in logical selection mode.

* Decision: Rejected.

* Rationale: We think that this is already covered by:

[S] Protocols, data formats and APIs MUST store, interchange or process text data in logical order.
[1174] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-VisualRenderingUnits
[1176] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C176 Chris Lilley TAG [1178]3.1.3
Units of visual rendering

* Comment (received 2002-05-27) -- [1180]Comments on charmod from Chris

Also, should there not be something about copying that selection and pasting it somewhere else, that what you get is the logical selection?

* Decision: Rejected.

* Rationale: We rejected this comment. What if the paste is into a bitmap editor? Also, if it's a visual selection, then copying/pasting should paste the characters selected in the visual selection, rather than those in a corresponding logical selection between the same end points.

[1178] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-VisualRenderingUnits
[1180] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C177 Chris Lilley TAG [1182]3.1.3
Units of visual rendering

* Comment (received 2002-05-27) -- [1184]Comments on charmod from Chris

Similarly in the next part, I suggest rewording to remove the ambiguous phrase:

[S] Specifications of protocols and APIs that involve selection of ranges SHOULD provide for text selection in logical selection mode, at least to the extent necessary to support implementation of visual selection on screen on top of those protocols and APIs.

* Decision: Rejected.

* Rationale: The original paragraph is about visual selection, not logical. Visual selection requires discontiguous logical ranges and the requirement is for protocols and APIs to provide the latter.

[1182] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-VisualRenderingUnits
[1184] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C178 Chris Lilley TAG [1186]3.1.3
Units of visual rendering

* Comment (received 2002-05-27) -- [1188]Comments on charmod from Chris

It's not clear that this is such a strong requirement and it complicates processing, especially on handheld devices. Perhaps weaken to MAY?
And say what happens when this funky visual selection gets copied and pasted - do you get a set of separate logical selections (if so how delimited)? A single visually ordered selection (yuk)? Something else?

Otherwise, the weaker requirement for contiguous visual selection is likely to merely encourage the use of visual storage or the disposal of logical storage once the visual result has been generated. Which would lead to text copied from visually contiguous (logically discontiguous) selections being stored in visual order. Which is to be avoided.

* Decision: Rejected.

* Rationale: First, visual storage and visual selection are independent of each other. We think it's important that protocols and APIs SHOULD support discontiguous logical ranges so that implementations MAY implement visual selection if they wish. This is in particular relevant for technologies such as XPointer. We do not think that this will lead to the use of visual ordering inside the selection. In situations such as cut/paste without special support, the visual selection is usually copied as a sequence of segments, all internally in logical order. The sequence of segments and other things may be implementation-dependent, and in advanced applications, the overall result may depend on where the insertion is made.

[1186] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-VisualRenderingUnits
[1188] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C182 Chris Lilley TAG [1207]3.6.2
Character encoding identification

* Comment (received 2002-05-27) -- [1209]Comments on charmod from Chris

'[S] If the unique encoding approach is not chosen, specifications MUST designate at least one of the UTF-8 and UTF-16 encoding forms of Unicode as admissible encodings and SHOULD choose at least one of UTF-8 or UTF-16 as mandated encoding forms (encoding forms that MUST be supported by implementations of the specification).'
Does that mean that, for example, saying UTF-8 is allowed and UTF-16 is disallowed and an encoding declaration is not required, is okay?

* Answer: Yes.

* Decision: We have classified this as "Not applicable", because it was a question. Our answer is "yes". This should be understood in light of our comments to [1210]C118. It is not meant to change the rules of specific existing formats or protocols, but to give guidance to new formats or protocols.

[1207] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-EncodingIdent
[1209] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

****C183 Chris Lilley TAG [1212]3.6.2
Character encoding identification

* Comment (received 2002-05-27) -- [1214]Comments on charmod from Chris

Needs a little more on encodings that are a group of similar but not identical encodings, for example shift-jis.

* Decision: Rejected.

* Rationale: We have rejected this comment, because this is already mentioned. As a result of other editing, the relevant note is now in a very prominent position just after the opening paragraph. If you think this is not enough, please provide concrete suggestions on what you think is missing.

[1212] http://www.w3.org/TR/2002/WD-charmod-20020430/#sec-EncodingIdent
[1214] http://lists.w3.org/Archives/Public/www-tag/2002May/0164.html

USEFUL LINKS
==============

[1] The version of CharMod you commented on: http://www.w3.org/TR/2002/WD-charmod-20020430/
[2] Latest editor's version (still being edited): http://www.w3.org/International/Group/charmod-edit/
[3] Last Call comments table, sorted by ID: http://www.w3.org/International/Group/2002/charmod-lc/
Received on Friday, 16 January 2004 03:21:08 UTC