- From: Grosso, Paul <pgrosso@ptc.com>
- Date: Wed, 20 Jun 2007 16:11:14 -0400
- To: "Martin Duerst" <duerst@it.aoyama.ac.jp>
- Cc: <public-iri@w3.org>, "Richard Ishida" <ishida@w3.org>, "Felix Sasaki" <fsasaki@w3.org>, <www-xml-linking-comments@w3.org>, <public-xml-core-wg@w3.org>, <public-i18n-core@w3.org>
Martin,

The XML Core WG discussed this message of yours during our
telcon today. I'd like to thank you for your input and give
some preliminary responses.

[We have only just now noticed your email at
http://lists.w3.org/Archives/Public/public-iri/2007May/0000
that most of us on the XML Core WG never saw before, so we
have not yet discussed those points.]

[I'm not sure I have permission to cross post to all the
various lists, but I hesitate to remove anyone, so we'll
have to see how this works.]

> -----Original Message-----
> From: Martin Duerst [mailto:duerst@it.aoyama.ac.jp]
> Sent: Tuesday, 2007 June 19 20:24
> To: Grosso, Paul
> Cc: public-iri@w3.org; Richard Ishida; Felix Sasaki;
> www-xml-linking-comments@w3.org; public-xml-core-wg@w3.org;
> public-i18n-core@w3.org
> Subject: RE: Fwd: Re: HRRIs, IRIs, etc

> In the IRI spec, these are excluded:
> ucschar = %xA0-D7FF / %xF900-FDCF / %xFDF0-FFEF
>         / %x10000-1FFFD / %x20000-2FFFD / %x30000-3FFFD
>         / %x40000-4FFFD / %x50000-5FFFD / %x60000-6FFFD
>         / %x70000-7FFFD / %x80000-8FFFD / %x90000-9FFFD
>         / %xA0000-AFFFD / %xB0000-BFFFD / %xC0000-CFFFD
>         / %xD0000-DFFFD / %xE1000-EFFFD
>
> so you have to add them to your list in section 3.

We'll plan to add them.

> - The overall usefulness (seen from an overall W3C or overall IETF
>   standpoint) of having separate definitions, in separate documents,
>   for two essentially extremely closely related protocol elements.
>   [I have proposed to integrate your material into an update of the
>   IRI spec.]

The XML Base PER went out in December, and the XLink 1.1 CR
ended a year ago (July 2006), and these are both awaiting
resolution of this issue.

Both the basic idea and most of the actual wording for what
we are now calling HRRIs currently exist in several Recs
including XML, XLink, XML Base, and maybe others. Our attempt
here was just to pull that wording out of the various specs
and reference a definition in one place. We were hoping to
do this in an expeditious manner.
We discussed the options with our team contact, who discussed
it with W3T, and we agreed that a short RFC was the best approach.

> - The choice of name, which is highly suggestive instead of
>   descriptive, inappropriate on several accounts (for the largest
>   part of URIs/IRIs, HRRIs are only marginally more readable, if at
>   all, and the overall syntax still poses a lot of problems for
>   average human users (http://...).

We had a hard time coming up with a name ourselves, and we'd
consider another name if we can find one more generally
acceptable. We do think that allowing spaces (as is the case
with HRRIs) does improve readability a bit, but we'd be happy
with any name that works.

We had called these XML Resource Identifiers earlier, but
(1) the XRI acronym is already taken and (2) these have
meaning and usefulness outside of XML.

If anyone has suggestions, we're interested in considering them.

> - The overall description. I note e.g. the following:
>   "However, it is often inconvenient for authors to encode
>   these characters."
>   How often? Unless somebody is authoring a lot of XPointers by hand,
>   this can't happen that often (maybe with the exception of the space,
>   but then you discourage that (correctly!) yourself).
>   I suggest to reword "often" to "occasionally". There are similar
>   examples elsewhere.

As you say, the space character is the most common. We would
be happy to tone down or remove this sentence; our motivation
for defining HRRIs is not that they are a good thing but that
they already exist in multiple standards.

> - The classification as a BCP. Procedurally, it's unclear to me why
>   the IETF would classify a protocol element spec as a BCP when the
>   related ones (URI, IRI) are standards track.
>   Content-wise, it's unclear why the IETF would call something a
>   BEST current practice if in earlier discussion, they have clearly
>   preferred to disallow or marginalize this practice (and that was
>   only for spaces and such, not for controls).

I think this may be a "typo". I believe we intended this
to become an RFC. Actually, we don't care what it becomes
as long as it is referenceable from the various W3C
XML-related specs.

> - The security section now mentions the issues with control
>   characters. This should definitely be a bit more specific, and
>   should contain explicit recommendations. I'd write that receivers
>   may want to filter out such characters, or URIs with such
>   characters, and therefore including them in the first place is
>   discouraged.

Most of us in the XML Core WG don't feel as strongly as you
appear to that we need to go on at great length about security
issues, but we are happy to expand this section along the
lines you suggest.

> - You have some advice against using raw spaces ("Also, authors of
>   HRRIs are advised to percent encode space characters themselves,
>   rather than rely on the processor to do so, because spaces are
>   often used to separate HRRIs in a sequence"), but not against
>   others, where similar arguments apply:
>   - tabs and CR/LF are removed/merged/converted to spaces in
>     attribute values (merging also occurs for spaces)
>   - <> are often used to delimit URIs/IRIs
>   - arbitrary controls may trigger some security filter
>   - private use characters are not interoperable
>   - non-characters (the list above) are discouraged in XML itself
>   (not sure this list is complete, but I guess it's getting close)

These are all good points. We will expand the document along
the lines you suggest.

> - The last paragraph of Section 3 is somewhat problematic.
>   In general, it's okay, but the second half of the last sentence
>   ("nor the process of passing a Human Readable Resource Identifier
>   to a process or software component responsible for dereferencing
>   it SHOULD trigger percent encoding") may suggest that resolution
>   interfaces come with three different entry points. I think it
>   would be better to have done this work by the XML side when
>   resolving something.

The above quoted phrase is in the XLink 1.1 CR, but we are not
sure at this time exactly why it is in there. We are discussing
this and will try to figure out what to do about this wording
and let you know.

paul
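
For illustration only, the character classes discussed in this message
can be sketched in Python. This is not the algorithm from the HRRI draft.
The `ucschar` ranges are copied from the ABNF quoted above; the function
names and the choice of which characters to encode (space, tab, CR/LF,
other ASCII controls, and `<` `>`) are assumptions drawn from the bullet
list about raw spaces.

```python
# ucschar ranges as quoted from the IRI spec earlier in this message.
UCSCHAR_RANGES = [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
# Planes 1 through 13: %x10000-1FFFD ... %xD0000-DFFFD.
UCSCHAR_RANGES += [(p << 16, (p << 16) | 0xFFFD) for p in range(1, 14)]
UCSCHAR_RANGES.append((0xE1000, 0xEFFFD))


def is_ucschar(ch: str) -> bool:
    """Return True if the single character falls inside a quoted ucschar range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in UCSCHAR_RANGES)


def percent_encode_risky(text: str) -> str:
    """Percent-encode characters the message advises authors to encode themselves.

    Assumption: the set is space, ASCII controls (including tab, CR, LF),
    DEL, and the angle brackets. Existing '%' signs are left untouched.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0x20 or cp == 0x7F or ch in "<>":
            # Encode each UTF-8 byte as %HH (uppercase hex).
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)
```

An author following the advice would run something like
`percent_encode_risky("my file.xml")` before writing the value into an
attribute, rather than relying on the processor to encode the space.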
Received on Wednesday, 20 June 2007 20:12:15 UTC