RE: No Character Normalization? from Kevin Regan on 2000-06-23 (w3c-ietf-xmldsig@w3.org from April to June 2000)

From: Kevin Regan <kevinr@valicert.com>
Date: Fri, 23 Jun 2000 14:57:50 -0700 (PDT)
To: jboyer@PureEdge.com, w3c-ietf-xmldsig@w3.org
Message-id: <Pine.SOL.4.21.0006231439310.10341-100000@bugs.valicert.com>

John,

Thanks for the information.

My main concern is to avoid having to tell a customer, "No, I
can't sign that.  How did you create that document anyway?"

If documents are usually created in normalized form, then this does
not seem like a big issue.  What would happen in the case of an
editor or application written in Java (which uses Unicode strings)?
That seems to be the most important case, given the close
coupling of Java and XML.

Another concern is whether a document can become "de-normalized" during
transmission.  My previous question was not specific enough.  I understand
that documents can be converted to other character encodings.  However, I'm
wondering whether a document can leave one application in normalized form,
pass through various character encodings, and enter another application
with its characters no longer normalized (e.g., a Java-to-Java exchange
might go from Unicode to UTF-8 for transmission, and then back to Unicode
in the receiving application).
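
To make the round-trip question concrete, here is a minimal sketch in
Python (standing in for the Java case; the helper names utf8_round_trip
and is_nfc and the sample strings are made up for illustration).  It
assumes the only transformation in transit is a UTF-8 encode followed by
a UTF-8 decode, and it uses Normalization Form C as the notion of
"normalized".  Other hops, such as a conversion through a legacy
character set, are not modeled and could behave differently.

```python
import unicodedata


def utf8_round_trip(text: str) -> str:
    """Encode to UTF-8 bytes and decode back, as a single transport hop might."""
    return text.encode("utf-8").decode("utf-8")


def is_nfc(text: str) -> bool:
    """True if text is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


# Hypothetical sample strings that render identically.
composed = "caf\u00e9"      # e-acute as one precomposed code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

for sample in (composed, decomposed):
    received = utf8_round_trip(sample)
    # The round trip hands back exactly the code points that were sent,
    # so the normalization state is the same before and after.
    print(received == sample, is_nfc(sample), is_nfc(received))
```

Under that assumption, a string that was normalized when it left comes
back normalized, and one that was not normalized stays that way: the
UTF-8 hop by itself neither introduces nor repairs the problem.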

Finally, you mention that the detection of a non-normalized document
would aid in the discovery of forgery.  My question is: should similar
documents with different character models be equivalent?
What would most people expect?  I don't really understand the usage
enough to have an opinion on this...
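
As a rough illustration of what is at stake in the equivalence question,
the Python sketch below digests two documents that look the same on
screen but use different code point sequences.  The sample documents and
the helper names digest and digest_nfc are invented for illustration,
and SHA-1 over UTF-8 bytes is used only as an example digest; this is
not meant to describe what any signature implementation actually does.

```python
import hashlib
import unicodedata


def digest(text: str) -> str:
    """SHA-1 hex digest of the UTF-8 bytes of text, with no normalization."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def digest_nfc(text: str) -> str:
    """Same digest, computed after converting text to Normalization Form C."""
    return digest(unicodedata.normalize("NFC", text))


# Hypothetical documents that display identically.
composed = "<name>Andr\u00e9</name>"     # precomposed e-acute
decomposed = "<name>Andre\u0301</name>"  # 'e' plus combining acute accent

# Without normalization the byte sequences differ, so the digests differ.
print(digest(composed) == digest(decomposed))

# Normalizing both sides first makes the two documents digest identically.
print(digest_nfc(composed) == digest_nfc(decomposed))
```

Without a normalization step the two documents are simply different
byte strings as far as the digest is concerned.  Whether they should be
treated as equivalent is exactly the policy question.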

--Kevin

Received on Friday, 23 June 2000 17:57:37 UTC