Re: simple case of IRIs for Components in WSDL 2.0 from Bijan Parsia on 2005-10-13 (public-ws-desc-comments@w3.org from October 2005)

From: Bijan Parsia <bparsia@isr.umd.edu>
Date: Thu, 13 Oct 2005 15:52:07 -0400
To: Pat Hayes <phayes@ihmc.us>
Cc: public-ws-desc-comments@w3.org, "Henry S. Thompson" <ht@inf.ed.ac.uk>, public-ws-desc-comments-request@w3.org, Jonathan Marsh <jmarsh@microsoft.com>, Arthur Ryman <ryman@ca.ibm.com>, David Orchard <dorchard@bea.com>
Message-Id: <2c1449153fd65c6f265aeb9c4a4d0d12@isr.umd.edu>
On Oct 13, 2005, at 2:33 PM, Pat Hayes wrote:

>> Either you didn't read, or you didn't understand, that there is a  
>> relevant W3C standard for constructing uri fragments, XPointer. It is  
>> extensible, not limited to XML
>
> See below for comments on that. Your view of XPointer does not seem to  
> match its view of itself. I certainly did not read it as a W3C  
> recommendation for constructing ALL uri fragments.

Let me tone my rhetoric down a bit: there are aspects of the language  
of the XPointer spec which do seem to root it firmly in XML. However,  
my main point is twofold:

	1) it isn't restricted to *denoting* bits of XML, afaict from the spec  
and from practice
	2) it's easy to see how to extend it to non-XML mimetypes

So, the key language <http://www.w3.org/TR/xptr-framework/#language>:

"An XPointer processor takes as input an XML resource and a string to  
be used as a pointer (for example, a fragment identifier, with escaping  
reversed, taken from the URI reference that was used to access the  
resource), attempts to evaluate the pointer with respect to the  
resource, and produces as output an identification of subresources, or  
one or more errors."

I think this is still unfortunately constraining. I would prefer that  
it read as follows:

"An XPointer processor takes as input a[snip] resource and a string to  
be used as a pointer (for example, a fragment identifier, with escaping  
reversed, taken from the URI reference that was used to access the  
resource), attempts to evaluate the pointer with respect to the  
resource, and produces as output an identification of [secondary  
]resources, or one or more errors."

In particular, I don't like being restricted to "sub" resources.

As I understand fragments (perhaps wrongly, I'm told), each mimetype  
defines the behavior of its own fragments. I would prefer that  
XPointer schemes be able to define their behavior with regard to  
mimetypes.

Given that XPointer schemes are being used to identify non-xml  
resources (e.g., components instead of infoset items), I believe  
reading the language as allowing for "sub"resources to be non infoset  
items is ok.
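
To make the per-mimetype idea concrete, here is a minimal sketch, not anything taken from the XPointer spec itself. It splits a fragment into scheme-based pointer parts and picks a handler keyed on (scheme, media type). The names pointer_parts, evaluate and HANDLERS are made up for illustration, as is the handler registry. The parser is simplified: it ignores shorthand pointers, nested parentheses and ^-escaping.

```python
import re
from urllib.parse import urldefrag, unquote

# One scheme-based pointer part: SchemeName(SchemeData).
# Simplification: SchemeData may not contain parentheses.
_PART = re.compile(r"\s*([A-Za-z_][\w.\-]*)\(([^()]*)\)")

def pointer_parts(uri):
    """Return the fragment of `uri` as a list of (scheme, data) pairs."""
    fragment = unquote(urldefrag(uri)[1]).strip()  # reverse %-escaping
    parts, pos = [], 0
    while pos < len(fragment):
        match = _PART.match(fragment, pos)
        if match is None:
            raise ValueError(f"not a scheme-based pointer: {fragment!r}")
        parts.append((match.group(1), match.group(2)))
        pos = match.end()
    return parts

# Hypothetical registry: (scheme, media type) -> function(resource, data).
HANDLERS = {}

def evaluate(uri, resource, media_type):
    """Use the first pointer part that has a handler for this media type."""
    for scheme, data in pointer_parts(uri):
        handler = HANDLERS.get((scheme, media_type))
        if handler is not None:
            return handler(resource, data)
    raise LookupError(f"no usable pointer part for {media_type}")
```

With something like this, the same fragment syntax could resolve to a component, a subgraph, or an infoset item, depending on which handler is registered for the media type at hand.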

>> , and reasonable. These are good reasons for using these URIs. They  
>> might not be sufficient reasons to override these other  
>> considerations, but brushing them off isn't helpful.
>
> OK, you are right, I am behind the curve on XPointer. Had I known  
> about XPointer earlier, and understood the stultifying effect it was  
> likely to have, I would have howled earlier and louder.

And I'm not saying that it's a slam dunk. Unusable syntax is worth  
avoiding.

[snip]
>> XPointer is not XML specific, and, in this case, is not being used to  
>> identify bits of XML.
>
> I seem to detect a disconnect here. Your co-authored paper, cited  
> below:
> http://www.cognitiveweb.org/publications/server-side-xpointer-extreme-markup-2004.html
>
> gives as its main reference this:
> [XPTR] XPointer Framework. Ed. Paul Grosso et al. 13 Nov 2002. World  
> Wide Web Consortium. 25 Mar 2003
> http://www.w3.org/TR/xptr-framework/
>
> which I had already checked out, and whose first paragraph reads:
> "This specification defines the XML Pointer Language (XPointer)  
> Framework, an extensible system for XML addressing that underlies  
> additional XPointer scheme specifications. The framework is intended  
> to be used as a basis for fragment identifiers for any resource whose  
> Internet media type is one of text/xml, application/xml,  
> text/xml-external-parsed-entity, or  
> application/xml-external-parsed-entity. Other XML-based media types  
> are also encouraged to use this framework in defining their own  
> fragment identifier languages."
>
> Is it just me, or do I detect a certain, shall I say, XML orientation  
> in that paragraph?  The same document later gives a helpful  
> definition:
>
> "[Definition: XPointer processor ]
> A software component that identifies subresources of an XML resource  
> by applying a pointer to it. This specification defines the behavior  
> of XPointer processors."
>
> I draw your attention to the ninth word. Now, my point is: why should  
> it be that a standard designed for a particular purpose, should be  
> used to decide the design of URIs which may be used for an entirely  
> different purpose?

The spec is annoyingly inconclusive, e.g.:

"""[Definition: application ]
A software component that incorporates or uses an XPointer processor  
because it needs to access XML subresources. The occurrence and usage  
of XPointers, and the behavior to be applied to resources and  
subresources obtained by processing those XPointers, are governed by  
the definition of each application's corresponding data format (which  
could be XML-based or non-XML-based). For example, HTML [HTML] Web  
browsers and XInclude processors are applications that might use  
XPointer processors."""

"XML resource" is never defined, nor are "subresource" or "XML subresource".

And the second paragraph of the intro reads:

"""This specification does not constrain the types of applications that  
utilize URI references to XML resources, nor does it constrain or  
dictate the behavior of those applications once they locate the desired  
information in those resources."""

So, I think it's legit to parse the XML into a non-infoset representation  
(e.g., an RDF graph) and apply your identification algorithm to the RDF  
graph. In fact, there may be no intelligible fragment of XML that  
corresponds to what gets identified.

[snip]
>> XPointers are extensible, composable, and have some other good  
>> features. They are, like URIs, butt ugly, for the most part.
>
> Ugliness I do not care about. Breaking other parsers in widespread use  
> I do care about. I also rather deplore the idea of URI syntax being  
> standardized in such a draconian way for a single, limited, purpose.

I don't think it's a particularly limited *purpose*, and having a  
relatively standard syntax with extensibility (and, of course, you can  
choose not to use XPointer) allows for useful interop. For example, I  
*do* think having a standard way to identify a subgraph of an RDF graph  
is v. useful.

> But OK, I give up. I wish XPointer had not used up what are probably  
> the two most useful structure-denoting characters ever invented,

(Whitespace wins, I think :))

>  but it is too late to fix that lousy decision (which is like deciding  
> to require that all paper only be used to wipe asses, since paper is  
> so good for that purpose.) And more generally there seems to be no way  
> to prevent IRI syntax conventions from absorbing unlimited amounts of  
> the Unicode space, so the rest of us will have to learn to wrap them  
> in syntactic cling-wrap everywhere, or add yet another layer of  
> character escaping. Sigh. Now I have to go back and explain to a  
> different constituency why my bland confidence that URI/IRI syntax was  
> harmless has to be somewhat qualified.

Sorry.

I do think you can say, "mostly harmless". You do always need to have  
an escaping mechanism for full URIs, and just hope that the incidence of  
icky ones is relatively low. Similar problems crop up with  
proliferations of diversely prefixed URIs, as you can get comically  
large sets of prefix mappings (or ones that apply to only one URI).

Cheers,
Bijan.
Received on Thursday, 13 October 2005 19:52:22 UTC