RE: simple case of IRIs for Components in WSDL 2.0

>I think there is a misunderstanding somewhere (either me or Pat).
>Pat seems to think there is a trailing closing 
>paren without a previous matching open paren. 
>Pat wrote:
>So, a trailing close parenthesis
>without a matching open parenthesis is liable to trigger all kinds of
>lexical errors.
>That is not a problem in the XPointer framework 
>since all the parens are balanced. The WSDL URIs 
>have balanced parens.

Sorry, I was being too brief. I realize the 
parens are balanced within the URI itself. But 
consider a parser which is trying to parse some 
notation like LISP or Common Logic Interchange 
Format, in which the parentheses are considered 
to be lexical break characters, and which 
contains embedded URIs as identifiers. Then a URI 
with an adjacent close parenthesis on the right 
will be quite common, as for example in a text 
such as

(cl:text (ex:R ex:a))

If URIs end with closing parentheses, then such a 
parser will be unable to disambiguate, say, the 
URI 'http://ex.badend/(foo)'  from the 
concatenation of the URI 'http://ex.badend/(foo' 
and the closing parenthesis ')'. In practice, 
almost certainly the latter will be what is 
parsed, since the parser will not even seek the 
URI lexical form until the parentheses have been 
detected; but again, many parsers will not detect 
the matching opening parenthesis unless it is the 
first item in a lexical group. So, to adapt the 
above example,

(cl:text (ex:R ex:badend(foo)))

would parse as

<(> <cl:text> <(> <ex:R> <ex:badend(foo> <)> <)> <)>

with a non-matching <close> as the last lexical item.
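
As an illustration only, here is a minimal Python sketch of the kind of lexer described above. It is not taken from any real LISP or Common Logic implementation; the function name tokenize is hypothetical, and the sketch assumes that '(' counts as a break character only at the start of a lexical item, while ')' always ends the current item.

```python
def tokenize(text):
    """Split text into tokens, treating parentheses as break characters.

    Assumption for this sketch: an opening parenthesis is recognised only
    when it starts a lexical item; a closing parenthesis always breaks.
    """
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "()":
            tokens.append(ch)
            i += 1
        else:
            # A name runs until whitespace or a closing parenthesis;
            # an embedded '(' is swallowed into the name.
            j = i
            while j < n and not text[j].isspace() and text[j] != ")":
                j += 1
            tokens.append(text[i:j])
            i = j
    return tokens


tokens = tokenize("(cl:text (ex:R ex:badend(foo)))")
print(tokens)
# ['(', 'cl:text', '(', 'ex:R', 'ex:badend(foo', ')', ')', ')']
print(tokens.count("("), tokens.count(")"))
# 2 3
```

The token stream has two opening and three closing parentheses, so the last close has no partner.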

Of course, there are ways around this: the URIs 
can be enclosed in protective lexical wrappings 
such as double quotes, for example, or users of 
these languages can be required to insert 
whitespace before a lexical-breaking parenthesis. 
But all such ways introduce artificiality and 
awkwardness into what is otherwise a very natural 
and widely used syntactic convention.
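
The double-quote workaround can be sketched in the same spirit. This is a hypothetical Python lexer, again not drawn from any actual implementation, which assumes that a double-quoted run is read as a single name and that quotes cannot be escaped inside it.

```python
def tokenize_quoted(text):
    """Tokenize with parentheses as breaks, but keep "quoted" names whole.

    Assumption for this sketch: a double-quoted run is one token and
    contains no escaped quotes.
    """
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "()":
            tokens.append(ch)
            i += 1
        elif ch == '"':
            # Everything up to the next double quote is a single name,
            # parentheses included.
            j = text.find('"', i + 1)
            if j == -1:
                raise ValueError("unterminated quoted name")
            tokens.append(text[i + 1:j])
            i = j + 1
        else:
            j = i
            while j < n and not text[j].isspace() and text[j] not in '()"':
                j += 1
            tokens.append(text[i:j])
            i = j
    return tokens


print(tokenize_quoted('(cl:text (ex:R "http://ex.badend/(foo)"))'))
# ['(', 'cl:text', '(', 'ex:R', 'http://ex.badend/(foo)', ')', ')']
```

The URI survives intact and the parentheses balance, but only because the writer has wrapped every such URI in quotes.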

Pat Hayes

>Arthur Ryman,
>IBM Software Group, Rational Division
>phone: +1-905-413-3077, TL 969-3077
>assistant: +1-905-413-2411, TL 969-2411
>fax: +1-905-413-4920, TL 969-4920
>mobile: +1-416-939-5063, text:
>Dan Connolly
>10/11/2005 12:58 PM
>Jonathan Marsh
>Pat Hayes, Bijan Parsia, David Orchard, "Henry S. Thompson"
>RE: simple case of IRIs for Components in WSDL 2.0
>On Wed, 2005-10-05 at 13:52 -0700, Jonathan Marsh wrote:
>>  Thanks for your comment.  The WS Description Working Group tracked
>>  this as a Last Call comment LC335 [1].
>>  [1]
>>   The Working Group was unable to find consensus that the shorter form
>>  of component designators would have all the desired characteristics
>>  that led us to the current design. The issue was therefore closed
>>  without action.
>>  We hope that some of the discussion on this list (particularly using
>>  the best-case scenario rather than the worst-case) alleviates some of
>>  your concerns.
>Some of them.
>>  If we don't hear otherwise within two weeks, we will assume this
>>  satisfies your concern.
>I asked around if some nearby folks were satisfied.
>ok if URIs for SPARQL interface etc. ends with paren?
>I got one clear 'no' answer (below). I'm still thinking about whether
>I find the ... #wsdl.interface(SparqlQuery) syntax acceptable.
>Pat Hayes writes:
>>  The problem is that enclosing parens are (pretty much by definition
>>  of 'paren') widely used as textual breaking symbols in lexical
>>  analysis. This is true for NL text in almost all human languages,
>>  mathematical texts, any LISP-based programming language text, almost
>>  all logical notations, etc. So, a trailing close parenthesis
>>  without a matching open parenthesis is liable to trigger all kinds of
>>  lexical errors. It is also, for a similar reason, liable to be
>>  mis-read by a human reader. (I myself find that I see the close
>>  paren, become conscious of the cognitive dissonance, and then have to
>>  visually search for the matching open paren inside the string, which
>>  is not a natural way of reading and intrudes on what ought to be an
>>  unconscious process. This is a psychological hall-mark of a bad
>>  visual design, eg see Don Norman's writings.) And there seems to be
>>  no need to do this brain-damaged thing, since one could adopt a
>>  variety of linking conventions within white-space-free text to
>>  achieve the same intuitive-communication purpose, e.g.
>>  any of which would be readable and lexically harmless.
>>  I would remark more generally that there is a tendency which might be
>>  called glyph-creep, whereby W3C standards implicitly use up symbols
>>  that already have a significant use in the world in general, thereby
>>  forcing people to use unreadable work-arounds. XML's seizure of the
>>  less-than sign and the ampersand is probably the most egregious
>>  example, requiring almost every mathematical text written since 1300
>>  to be re-drafted. Please, do not also take away the parentheses.
>>  Pat
>Dan Connolly, W3C
>D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E

IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell

Received on Wednesday, 12 October 2005 22:06:38 UTC