
Re: XML observation

From: pat hayes <phayes@ihmc.us>
Date: Tue, 29 Jul 2003 02:07:46 -0500
Message-Id: <p06001a3abb4bbc028040@[10.0.100.23]>
To: Martin Duerst <duerst@w3.org>
Cc: w3c-i18n-ig@w3.org, msm@w3.org, w3c-rdfcore-wg@w3.org

>Some more comments on this mail prompted by Graham.
>
>At 19:01 03/07/03 -0500, pat hayes wrote:
>>Thinking about the issue we have been discussing, it occurs to me 
>>that XML has been holding a tiger by the tail and is now getting 
>>bitten, and this debate is a symptom of that.
>>
>>XML started life as a generalized text-markup system, and for that 
>>purpose it is wonderful.
>
>Glad you say so. Please note that what we are interested in, and what
>I have said many times, is exactly this functionality.

OK, and I understand that.  But consider: why is RDF - which is not 
even remotely like text in almost any conceivable way (not meant to 
be read by human eyes, has an abstract graph syntax, no natural or 
canonical linearization, cannot be marked up meaningfully, is not a 
character stream, has no notion of scope or textual extent, has no 
relationship to any human language or dialect) being encoded in XML? 
Rather than, say, as a concept map for human readers and S-expressions 
or similar for machines to transmit to one another?  Why are you and 
we even talking to one another? Answer: because XML is widely viewed 
as much more than a textual markup system: more as the lingua franca 
of all information exchange; and because it is then forced on every 
application, whether it is appropriate or not, without even indeed 
any analysis of its limitations or suitability.

>We don't want
>to make it difficult for people to use XML in other ways, but we don't
>want our stuff to be made too difficult, as it currently is.

I guess I don't really see why the RDF actions have made your life 
any harder. I can see why you might disapprove, on general 
methodological grounds, of what we are doing: but surely it is not 
going to *affect* your efforts in any serious way, is it?

>>But it has been touted and used as something much more than just 
>>text markup: it has been announced as a kind of universal solvent 
>>for transmitting any kind of structure, a universal general-purpose 
>>structure-description system. Unfortunately, several of its 
>>features (most notably the restriction of attribute values to 
>>strings, cf http://www.waterlang.org/doc/trouble_with_xml.htm) are 
>>clearly serious design faults when seen from this more general 
>>point of view.
>
>That article is quite confused on several points

Maybe, but their central critique is right on the mark, seems to me.

>. If they don't like
>XML, they can invent something else, but then they shouldn't use the
>name XML.

Like the rest of us, they have very little choice.

>>But more to the present point, the use of a *language* to describe 
>>structures requires us to clearly distinguish the text of the 
>>description from the thing - the structure - being described. 
>>Making a distinction like this is so second-nature to programmers, 
>>mathematicians, logicians and linguists - in fact anyone who uses 
>>technical languages professionally - that it takes a while in 
>>dealing with XML to realize that XML conspicuously fails to make 
>>it, and that in fact the entire design of XML is predicated on 
>>denying it.
>
>No, XML clearly separates text (character data) and structure (markup),
>but it turns out that it is much easier to handle these things inline than
>separately.

You miss my point, I think. It is the 'inline' aspect I am talking 
about. It is NOT easier to handle these things in-line. In fact this 
is precisely the point: for many purposes, it is IMPOSSIBLE to handle 
them naturally in-line. That is probably why people invented things 
like indirect references, pointers and data-structures in the first 
place, in fact. See http://tap.stanford.edu/xemantics.html for 
another analysis of the problem.
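
To make the in-line point concrete, here is a minimal sketch in Python. 
The three-node graph and the helper names `inline` and `by_reference` 
are hypothetical, invented purely for illustration; nothing here is 
RDF/XML itself. It only shows that a structure with a cycle cannot be 
written as pure nesting, while naming nodes and pointing at them by 
name handles it without trouble.

```python
# Hypothetical three-node graph, written as (subject, predicate, object) triples.
triples = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "knows", "a"),  # closes a cycle back to "a"
]


def inline(node, triples, path=()):
    """Try to write the graph as pure nesting, with no references at all."""
    if node in path:
        # The node would have to contain itself: nesting cannot express this.
        raise ValueError(f"cannot nest {node!r} inside itself")
    children = [
        (p, inline(o, triples, path + (node,)))
        for s, p, o in triples
        if s == node
    ]
    return {node: children}


def by_reference(triples):
    """Name every node once and refer to it by name (an indirect reference)."""
    table = {}
    for s, p, o in triples:
        table.setdefault(s, []).append((p, o))
        table.setdefault(o, [])
    return table


try:
    inline("a", triples)
    nestable = True
except ValueError:
    nestable = False

table = by_reference(triples)
```

The nested form fails as soon as the cycle closes, whereas the table 
form stores each node once and lets edges point wherever they like.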

>>  XML documents describe structure by *displaying* it, in effect: 
>>they *are* the structure they describe. And of course this is 
>>entirely appropriate for a markup language: it is the very essence 
>>of markup that the markup *labels* the text it is the markup 'of'.
>>
>>To put the same point another way, markup is inherently indexical: 
>>what it means depends on where it is. If you write <title>The Way 
>>Things Were</title>, what the enclosing markup says, in effect, is: 
>>'THIS enclosed text is a title'.  The same piece of markup 
>>surrounding some other piece of text will implicitly refer to that 
>>other piece: its meaning - what it is talking *about* - depends on 
>>where in the text the markup occurs. Its location in the text is 
>>part of its meaning; and when it is used with no text to mark up, 
>>simply as a structural description language, this indexicality is 
>>retained in the *descriptive* conventions of the resulting 
>>language: so XML as a structural description convention has a 
>>built-in confusion between describing structure and displaying or 
>>exhibiting it, a built-in ambiguity between being a description and 
>>being a kind of diagram or map, a built-in tendency to confuse use 
>>and mention.
>>
>>This is clearly seen in the discussion we have been having. Martin (view X)
>
>Please stop labeling me as 'view X'. The discrepancies between language
>handling for plain literals and for XML literals is a problem for us
>both in view X and in view G.

It is not a problem for view G.  On the contrary, the examples that 
you find so compelling, and your arguments about what XML users will 
expect, are completely foreign and unnatural seen from view G. One 
would not expect language to be 'handled' between the RDF and the XML 
of the RDF/XML *at all* on view G: this sounds like a category error, 
like you find the character/octet disparity to be a category error.

>>sees a piece of RDF/XML as being a kind of XML text, and the 
>>resulting document as *displaying* the RDF structure in the XML. He 
>>expects that RDF/XML will satisfy the textual scoping mechanisms 
>>which arise naturally in any kind of layout display: in particular, 
>>attributes should apply to all of the items which are in the 
>>*textual* scope of the XML element.
>
>Sorry, no. This is not a general principle. It is just that XML 1.0
>specifies this kind of behavior for xml:lang, and M&S adopted this
>behavior for RDF/XML, and even the post-lastcall version of RDF/XML
>uses this behavior for plain literals.

It does, indeed, but for some people that is at best a compromise 
and at worst an error.
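
For readers who want the scoping rule under discussion spelled out, 
here is a minimal sketch using Python's standard xml.etree.ElementTree. 
The element names (doc, title, blob, p) and the helper 
`effective_langs` are hypothetical, chosen only for illustration. It 
models plain XML 1.0 xml:lang inheritance through textual nesting, with 
xml:lang='' cancelling the inherited value; it is not a description of 
how any RDF/XML parser treats literals.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: element names are made up for this example.
doc = ET.fromstring(
    '<doc xml:lang="en">'
    "<title>The Way Things Were</title>"
    '<blob xml:lang=""><p>opaque</p></blob>'
    "</doc>"
)


def effective_langs(elem, inherited=None, out=None):
    """Map each tag to the language in force for it under textual scoping."""
    out = {} if out is None else out
    # An element's own xml:lang wins; otherwise it inherits from its parent.
    lang = elem.get(XML_LANG, inherited)
    if lang == "":
        # An empty value switches the inherited language off.
        lang = None
    out[elem.tag] = lang
    for child in elem:
        effective_langs(child, lang, out)
    return out


langs = effective_langs(doc)
```

Here title picks up "en" purely from where it sits in the text, while 
blob and everything nested inside it end up with no language at all.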

>
>And let's not forget that even XML Literals inherit xml namespaces
>from the document hierarchy as long as they are visible, and that
>blank node naming/numbering is global to the whole RDF/XML document.
>
>>That is the XML 'structure as textual display' assumption, of 
>>course.  Patrick (view G) sees a structural description language 
>>rendered (in a fairly ad-hoc way) into XML syntax; the actual XML 
>>document is of relatively little importance: on this view, it is 
>>the structure described by the document that defines the 
>>significant, meaningful notions of scope and context.  And the 
>>RDF/XML conventions clearly isolate the XML 'inside' a 
>>parseType-attributed element from the XML surrounding the element, 
>>so it is 'obvious' that the lang tags that may be relevant to the 
>>outer context do not apply to the inner one.
>
>I don't think there is any problem with Patrick's view if that's the
>way he wants to use RDF/XML. We have been informed of a potential
>problem in that case, and we have done our best to solve it, by
>introducing xml:lang=''. We are not denying 'independent blob' usages
>of XML Literals. But we don't think this should be a one-or-the-other-only.
>It's easy to make it possible to have both views live peacefully
>together. It's not a 'view G' versus 'view X', it's a 'view blob'
>vs. a 'view text'.

I wish I could see how to make all sides be peaceably together. That 
hasn't been my experience so far, I have to say :-)

I just checked my files and there are 14 draft versions of the RDF 
datatyping that I alone have written. If you add in the drafts 
written by others that probably amounts to around 30 distinct 
attempts to get some kind - ANY kind - of workable compromise. 
Ultimately, none of these survived the clashing intuitions and 
requirements that were being touted around the table. It's really not 
easy to get solutions that everyone can accept.

>>In my earlier metaphor, Patrick here is the teeth of the tiger. Once 
>>XML is sold, and bought, as a general-purpose structural 
>>description language, and is used as such by professionals who are 
>>familiar with the conventions of such languages, the XML scoping 
>>conventions which are inherited from its role as a markup language 
>>are no longer appropriate: in fact, they are *ludicrous*: they are 
>>like a children's toy in an engineering shop.
>
>If these conventions are inappropriate for such use, they can
>easily be fixed. I don't care whether you call xml:lang='' a children's
>toy or what else, but it is an easy fix.

Well, I agree that it can be fixed, though I don't think it will be 
very easy; and I think it will be a huge change to XML, one that will 
require it to be re-thought from the ground up (and not called "XML", 
probably).  Maybe, with an insane amount of luck, it might come out 
looking a bit like LISP.

>I don't think it's appropriate that the 'professionals' you mention
>act like bullies and think that only what they do is right, and that
>they need to make life difficult for those who have created and
>nurtured their tools.

Tools? I feel like a woodworker who is being forced to use plastic 
chisels.  Either side can talk of the other as bullies.  Being 
obliged to treat inheritance textually feels like being bullied to 
many of us.  Being told that we have to be able to handle language 
tags feels like being bullied.  Being obliged to use markup to 
describe structure feels like being bullied.  Not being allowed to 
use familiar, powerful, thoroughly understood ideas and techniques 
that we have been using for many years, because they don't fit into 
XML's barbaric syntax and ridiculously primitive structural model 
feels like being bullied.

>>Expecting professional programmers to conform to descriptive 
>>conventions defined by text-markup languages is whistling at the 
>>wind.  Programmers have been using more sophisticated scoping 
>>conventions for over half a century; not because they didn't know 
>>better, but because they *needed* to.  You can't display recursion 
>>using indexical markup, for a start.
>>
>>The XML publicists have bitten off more than they know how to chew. 
>>If the result is XML that disobeys the XML 'conventions' and is 
>>unreadable by non-programmers, should anyone be surprised?
>
>It's not the XML publicists that have bitten off more than they know how
>to chew. The publicists have created something they were able to work with
>well. Then others came and used the same thing because they felt it was
>useful. The publicists didn't sell this stuff to the others, it just
>spread all by itself. Putting the blame on the publicists is totally the
>wrong way round.

I didn't mean to sound like a conspiracy theorist; but surely you must 
be aware of the almost fanatical 'XML crusade' phenomenon? I have 
been told several times, often in very strong terms, that to 
criticise XML in any way is futile, that XML cannot be resisted, that 
any application which expects to be taken seriously must be based on 
XML, and so on. Why does the RDF charter *require* that RDF use XML 
as an exchange syntax? Was that the result of careful analysis, or 
was it simply imposed from without, in a kind of reflex reaction to 
the prevailing culture?

>If the 'programmers' don't know the limitations of their
>tools, or can't choose appropriate tools, that still does not give them
>(or you) any right to suddenly take XML away from those who have created
>it and are still using it, in particular if there are quite easy solutions
>to let both sides live together, and hopefully over time learn to work
>together and take advantage of the interaction and overlap.

Nobody is trying to take XML away from anyone. We aren't trying to 
influence the XML standards: we are just struggling to make sense of 
them sufficiently well in order to write our documents to conform to 
them.  God knows, that is work enough.

But to return to the point at issue here, I really haven't seen any 
sign that there is an easy solution to let both sides work together. 
What we have right now is a compromise which nobody likes and nobody 
thinks is elegant; but any movement towards a more straightforward 
solution seems to be sharply more unacceptable to someone. This feels 
to me like patching over the cracks, rather than missing an obvious 
easy solution. And my X/G message was an attempt to suggest where 
some of these cracks might be coming from. I have to say, your 
reaction to it only makes me more convinced that it might be right.

Pat
-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 29 July 2003 03:07:50 EDT
