RE: Algorithm for mapping an application defined name to an XML name

From: Addison Phillips [wM] <aphillips@webmethods.com>
Date: Fri, 23 Aug 2002 16:05:04 -0700
To: "Martin Gudgin" <mgudgin@microsoft.com>, <asirv@webmethods.com>
Cc: "W3C Public Archive" <www-archive@w3.org>, "Jean-Jacques Moreau" <moreau@crf.canon.fr>, "Marc Hadley" <marc.hadley@sun.com>, "Nilo Mitra" <EUSNILM@am1.ericsson.se>, "Noah Mendelson" <noah_mendelsohn@us.ibm.com>, "Henrik Frystyk Nielsen" <henrikn@microsoft.com>
Message-ID: <PNEHIBAMBMLHDMJDDFLHKECLFKAA.aphillips@webmethods.com>

Hi Martin,

Thanks for the note. It's been a while since I thought about this.

My edits were done from the original proposal. Although I modified the text
to be more correct about various Unicode issues, I didn't change the
structure of the original at all. (FWIW, I would have designed and written
it differently. And I hate standards that obfuscate what's going on as much
as this one does. It not being my document, I didn't rewrite it. I just
edited the text to be more correct.)

The TAG item is confusing. Therefore I think the use of the word "TAG" in
production rule 2 should be replaced with "T". I would rewrite #2 slightly
differently than Martin does to be:

"2. Let <i>T</i> be a name in an application, where the name <i>T</i> is a
sequence of characters in the character set, encoding, and namespace of the
application."  // note addition of the encoding here

For Martin Duerst's comments, I note that he appears to be commenting on the
original draft. My comments on his comments are:

1. Yes, it needs some explanatory text.
2. I didn't change "Prefix be computed, etc." as it was beyond the scope of
my review. If it is looked up instead of being computed, you could change it
to say that. Basically, the appendix is saying: "get your prefix somehow,
this stuff deals with the 'local part'".
3. The left-to-right thing was fixed by using most-to-least significant
(i.e., memory order).
4. Already commented on TAG. By all means change that.
5. You could change M to UNI if you want. I kept the original notation here.
Doesn't matter which you choose.
6. Yes, the note should be added. I missed that on my edit. Note that
Martin's point about missing characters is dealt with (but not very clearly)
by 5.2.
7. The various edits, etc., have to do with changing the structure of the
document as Martin suggests. I'm not wild about the structure either, as
indicated. You could follow his edits which do not change the end result.
8. Say explicitly that hex digits always appear as uppercase.
9. Add examples if you so desire.

For Mike Champion's comments, I note that he also appears to be commenting
on the draft I revised???:

1. Referencing UCS-4, an obsolete encoding of Unicode, is one of the things
I changed. I used Unicode Scalar Value (that is, code points) to get away
from the vagaries of the different encodings. Although UTF-32 and UCS-4 code
units are essentially the scalar value, there is no need to get into byte
order (big- versus little-endian) and similar encoding details.

I should note that there are non-characters inside the 0..10FFFF range.
Saying USVs avoids those without a lot of text to explain it, so long as you
look up the precise Unicode definition of all this stuff (for example in

2. The use of U+ notation is slightly confusing in the text. I would change
this sequence (at

"Let U1, U2, ... , U8 be the eight hex digits [PROD: 11] such that Ci is
"U+" U1 U2 ... U8 in the Unicode Scalar Value"

to be:

"Let U1, U2, ..., U8 be the eight hex digits [PROD: 11] in the 32-bit
Unicode Scalar Value of Ci. For example, a character whose scalar value is
U+10FFFA would be represented by the sequence U1=0 U2=0 U3=1 U4=0 U5=F U6=F
U7=F U8=A ('0010FFFA')"

Note that this makes clear that the encoding is wasteful. The first two hex
digits (the high-order byte) will never be anything other than "00".
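
To make the eight-digit form concrete, here is a minimal sketch in Python.
The function name scalar_hex8 is my own invention for illustration and is
not taken from the appendix. It assumes the input is a single character and
that surrogate code points should be rejected, since they are not Unicode
Scalar Values.

```python
def scalar_hex8(ch: str) -> str:
    """Return the eight uppercase hex digits U1..U8 for one character.

    Hypothetical helper: the name and the error handling are assumptions,
    not part of the SOAP appendix text.
    """
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    usv = ord(ch)
    # Surrogate code points (D800..DFFF) are not Unicode Scalar Values.
    if 0xD800 <= usv <= 0xDFFF:
        raise ValueError("surrogate code points are not scalar values")
    # Zero-pad to eight digits, most significant first, always uppercase.
    return format(usv, "08X")


digits = scalar_hex8("\U0010FFFA")
print(digits)        # 0010FFFA
print(list(digits))  # ['0', '0', '1', '0', 'F', 'F', 'F', 'A']

# The scalar value stays the same while the serialized bytes differ
# from one encoding (and byte order) to another.
for enc in ("utf-8", "utf-16-be", "utf-32-be", "utf-32-le"):
    print(enc, "\U0010FFFA".encode(enc).hex().upper())
```

Working from the scalar value rather than from encoded bytes means byte
order never enters into it, and because no scalar value exceeds 10FFFF the
first two digits of the result are always "00".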

Good luck with your editing.

Best Regards,


Addison P. Phillips
Director, Globalization Architecture
webMethods, Inc.
432 Lakeside Drive
Sunnyvale, California, USA
+1 408.962.5487 (phone)
+1 408.210.3569 (mobile)
Internationalization is an architecture.
It is not a feature.

> -----Original Message-----
> From: Martin Gudgin [mailto:mgudgin@microsoft.com]
> Sent: Friday, August 23, 2002 6:23 AM
> To: asirv@webmethods.com; aphillips@webmethods.com
> Cc: W3C Public Archive; Jean-Jacques Moreau; Marc Hadley; Nilo Mitra;
> Noah Mendelson; Henrik Frystyk Nielsen
> Subject: Algorithm for mapping an application defined name to an XML
> name
> Gentlemen,
> You were kind enough to provide a rewritten version[5] of the Mapping
> Application Defined Name to XML Name Appendix[1] of SOAP 1.2 Part 2[2].
> There have been further comments on this section from Martin Duerst[3]
> and Mike Champion[4]. The editors are in the midst of incorporating your
> changes and would like to know if these comments have any bearing on the
> changes to be made. Specifically this editor would like to know if your
> proposed changes address the concerns of Messrs Duerst and Champion.
> Your prompt reply would be appreciated
> Regards
> Martin Gudgin
> [1] http://www.w3.org/TR/2002/WD-soap12-part2-20020626/#namemap
> [2] http://www.w3.org/TR/2002/WD-soap12-part2-20020626/
> [3] http://www.w3.org/2000/xp/Group/xmlp-lc-issues.html#x270
> [4] http://www.w3.org/2000/xp/Group/xmlp-lc-issues.html#x341
> [5]
Received on Friday, 23 August 2002 19:05:09 UTC
