Re: I18N WG/IG last call comments

Forwarded Text ----
 Date: Wed, 19 Apr 2000 11:33:38 +0900
 To: "Joseph M. Reagle Jr." <reagle@w3.org>
 From: "Martin J. Duerst" <duerst@w3.org>
 Subject: Re: I18N WG/IG last call comments
 Cc: w3c-i18n-ig@w3.org
 In-Reply-To: <3.0.5.32.20000418190721.00a08500@localhost>
 
...
 
 At 00/04/18 19:07 -0400, Joseph M. Reagle Jr. wrote:
 >At 12:27 00/03/25 +0900, Martin J. Duerst wrote:
 >  >The W3C I18N WG and IG have reviewed your last call draft.
 >  >Below please find our comments. We look forward to collaborating
 >  >with you to resolve them.
 >
 >Martin,
 >
 >Thank you (and the I18N WG/IG) for the thoughtful questions. I have a couple
 >of quick questions of my own, to help me understand the comments.
 >Deliberations on the other comments are ongoing.
 >
 >  >Character encoding and transcoding
 >  >----------------------------------
 >  >
 >  >[Transcoding is the conversion from one character encoding
 >  >(charset) to another.]
 >  >
 >  >- 'minimal' canonicalization is required, but it should be made
 >  >   very clear that this does not imply that conversion from all
 >  >   'charset's to UTF-8 is required. A set of 'charset's for which
 >  >   support is required should be defined exactly, e.g. as UTF-8
 >  >   and UTF-16. This is the same for other transforms.
 >
 >Why? I'm no expert as to whether this would be a good or bad thing, but I
 >believe the spec does require this:
 >
 >         converts the character encoding to UTF-8, removing the
 >         encoding pseudo-attribute
 >
 >http://www.w3.org/Signature/Drafts/WD-xmldsig-core-200003plc/#minimal
 
 It's the question of which *input* encodings are required.
 It's clear there is only one output encoding.
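
 To illustrate the distinction, here is a minimal sketch in Python, assuming
 the required input set were exactly UTF-8 and UTF-16 (the example given in
 the comment, not something the draft defines). The names
 SUPPORTED_INPUT_CHARSETS and to_utf8 are hypothetical, and the sketch does
 not remove the encoding pseudo-attribute.

```python
import codecs

# Assumption: the set of input charsets an implementation must accept.
# UTF-8 and UTF-16 are only the example from the comment above.
SUPPORTED_INPUT_CHARSETS = {"utf-8", "utf-16"}


def to_utf8(data: bytes, declared_charset: str) -> bytes:
    """Transcode data from a supported input charset to UTF-8.

    The output encoding is always UTF-8; only the set of accepted
    input encodings is open to definition.
    """
    try:
        # Normalise the label, e.g. "UTF-16" becomes "utf-16".
        name = codecs.lookup(declared_charset).name
    except LookupError:
        raise ValueError(f"unknown charset: {declared_charset!r}")
    if name not in SUPPORTED_INPUT_CHARSETS:
        raise ValueError(f"charset not in the required set: {declared_charset!r}")
    return data.decode(name).encode("utf-8")
```

 An implementation written this way accepts to_utf8(data, "UTF-16") but may
 reject, for example, "ISO-8859-1" without violating the requirement.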
 
 
 
 
 >  >   As an example, using the above 'case' analogy, take a document
 >  >   <root>
 >  >    <amount>$10</amount>
 >  >    <amount>$1000</amount>
 >  >   </root>
 >  >   which is modified by an intruder to look like
 >  >   <root>
 >  >    <Amount>$10</Amount>
 >  >    <amount>$1000</amount>
 >  >   </root>
 >  >   and combine this with a DOM program that extracts the first
 >  >   <amount> and pays somebody that much. After the change by
 >  >   the intruder, the amount actually paid is $1000 instead of $10.
 >
 >This is just an example, right? XML is case sensitive, so these would be
 >different InfoItems. The more appropriate case (though harder to show) is
 >that of character composition and decomposition.
 
 Yes, exactly.
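
 The composition case can be sketched roughly as follows, in Python. The
 element name "payé" and the helper functions are hypothetical, the document
 is modelled as a plain list of (name, value) pairs rather than a real DOM,
 and NFC is used only as one possible normalization form for illustration.

```python
import unicodedata

# Two spellings of the same visible name "payé":
composed = "pay\u00e9"     # precomposed e with acute accent (U+00E9)
decomposed = "paye\u0301"  # "e" followed by combining acute accent (U+0301)

# Hypothetical stand-in for a document: a list of (element name, value).
original = [(composed, "$10"), (composed, "$1000")]
# An intruder swaps the first name for its decomposed spelling.
tampered = [(decomposed, "$10"), (composed, "$1000")]


def first_value(items, name):
    """Return the value of the first item whose name matches exactly."""
    for item_name, value in items:
        if item_name == name:
            return value
    return None


def first_value_nfc(items, name):
    """Same lookup, but comparing names after NFC normalization."""
    target = unicodedata.normalize("NFC", name)
    for item_name, value in items:
        if unicodedata.normalize("NFC", item_name) == target:
            return value
    return None
```

 With exact code point comparison, first_value(tampered, composed) skips the
 modified element and returns the second amount, although both documents
 look the same when displayed. The normalizing lookup treats the two
 spellings as the same name.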
 
 
 >  >- Section 6.6.3.3 Function Library Additions, para 2
 >  >
 >  >       "CDATA sections are replaced by their content"
 >  >
 >  >   This requires the processing to behave as if it uses the UCS.
 >
 >John might have already fixed this, but what are you recommending?
 
 My guess is that this should be okay by now. I think the original
 comment was based on the impression that XPath processing was
 assumed by John to happen in whatever encoding the inputs were.
 
 Regards,   Martin.
 
End Forwarded Text ----

_________________________________________________________
Joseph Reagle Jr.   
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/People/Reagle/

Received on Wednesday, 19 April 2000 08:31:12 UTC