Re: XML observation

Some more comments on this mail prompted by Graham.

At 19:01 03/07/03 -0500, pat hayes wrote:
>Thinking about the issue we have been discussing, it occurs to me that XML 
>has been holding a tiger by the tail and is now getting bitten, and this 
>debate is a symptom of that.
>
>XML started life as a generalized text-markup system, and for that purpose 
>it is wonderful.

Glad you say so. Please note that what we are interested in, and what
I have said many times, is exactly this functionality. We don't want
to make it difficult for people to use XML in other ways, but we don't
want our stuff to be made too difficult, as it currently is.


>But it has been touted and used as something much more than just text 
>markup: it has been announced as a kind of universal solvent for 
>transmitting any kind of structure, a universal general-purpose 
>structure-description system. Unfortunately, several of its features (most 
>notably the restriction of attribute values to strings, cf 
>http://www.waterlang.org/doc/trouble_with_xml.htm) are clearly serious 
>design faults when seen from this more general point of view.

That article is quite confused on several points. If they don't like
XML, they can invent something else, but then they shouldn't use the
name XML.


>But more to the present point, the use of a *language* to describe 
>structures requires us to clearly distinguish the text of the description 
>from the thing - the structure - being described. Making a distinction 
>like this is so second-nature to programmers, mathematicians, logicians 
>and linguists - in fact anyone who uses technical languages professionally 
>- that it takes a while in dealing with XML to realize that XML 
>conspicuously fails to make it, and that in fact the entire design of 
>XML is predicated on denying it.

No, XML clearly separates text (character data) and structure (markup),
but it turns out that it is much easier to handle these things inline than
separately.


>  XML documents describe structure by *displaying* it, in effect: they 
> *are* the structure they describe. And of course this is entirely 
> appropriate for a markup language: it is the very essence of markup that 
> the markup *labels* the text it is the markup 'of'.
>
>To put the same point another way, markup is inherently indexical: what it 
>means depends on where it is. If you write <title>The Way Things 
>Were</title>, what the enclosing markup says, in effect, is: 'THIS 
>enclosed text is a title'.  The same piece of markup surrounding some 
>other piece of text will implicitly refer to that other piece: its meaning 
>- what it is talking *about* - depends on where in the text the markup 
>occurs. Its location in the text is part of its meaning; and when it is 
>used with no text to mark up, simply as a structural description language, 
>this indexicality is retained in the *descriptive* conventions of the 
>resulting language: so XML as a structural description convention has a 
>built-in confusion between describing structure and displaying or 
>exhibiting it, a built-in ambiguity between being a description and being 
>a kind of diagram or map, a built-in tendency to confuse use and mention.
>
>This is clearly seen in the discussion we have been having. Martin (view X)

Please stop labeling me as 'view X'. The discrepancies between language
handling for plain literals and for XML literals are a problem for us
both in view X and in view G.


>sees a piece of RDF/XML as being a kind of XML text, and the resulting 
>document as *displaying* the RDF structure in the XML. He expects that 
>RDF/XML will satisfy the textual scoping mechanisms which arise naturally 
>in any kind of layout display: in particular, attributes should apply to 
>all of the items which are in the *textual* scope of the XML element.

Sorry, no. This is not a general principle. It is just that XML 1.0
specifies this kind of behavior for xml:lang, and M&S adopted this
behavior for RDF/XML, and even the post-lastcall version of RDF/XML
uses this behavior for plain literals.

And let's not forget that even XML Literals inherit XML namespaces
from the document hierarchy as long as they are visible, and that
blank node naming/numbering is global to the whole RDF/XML document.


>That is the XML 'structure as textual display' assumption, of 
>course.  Patrick (view G) sees a structural description language rendered 
>(in a fairly ad-hoc way) into XML syntax; the actual XML document is of 
>relatively little importance: on this view, it is the structure described 
>by the document that defines the significant, meaningful notions of scope 
>and context.  And the RDF/XML conventions clearly isolate the XML 'inside' 
>a parseType-attributed element from the XML surrounding the element, so it 
>is 'obvious' that the lang tags that may be relevant to the outer context 
>do not apply to the inner one.

I don't think there is any problem with Patrick's view if that's the
way he wants to use RDF/XML. We have been informed of a potential
problem in that case, and we have done our best to solve it, by
introducing xml:lang=''. We are not denying 'independent blob' usages
of XML Literals. But we don't think this has to be either one or the other.
It's easy to make it possible to have both views live peacefully
together. It's not a 'view G' versus 'view X', it's a 'view blob'
vs. a 'view text'.
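
To make the scoping concrete, here is a minimal sketch of how xml:lang
inheritance and the xml:lang='' reset can coexist. It assumes Python's
standard xml.etree.ElementTree; the helper name effective_lang and the
element names doc, title, blob and p are made up for illustration and
do not come from RDF/XML or any specification.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, target):
    """Return the language in effect for `target`, or None if there is none.

    Hypothetical helper: the nearest enclosing xml:lang wins, and an
    empty value (xml:lang="") switches the inherited language off.
    """
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            # An empty string means "no language information here".
            return lang or None
        node = parent_of.get(node)
    return None


# Illustrative document: 'text' usage inherits the outer language,
# 'blob' usage opts out with xml:lang="".
doc = ET.fromstring(
    '<doc xml:lang="en">'
    "<title>The Way Things Were</title>"
    '<blob xml:lang=""><p>opaque content</p></blob>'
    "</doc>"
)

title_lang = effective_lang(doc, doc.find("title"))
blob_lang = effective_lang(doc, doc.find("blob/p"))
```

In this sketch the title picks up the language of the enclosing
element, while the content under the element carrying xml:lang=''
ends up with no language at all, which is what the 'blob' usage needs.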


>In my earlier metaphor, Patrick here is the teeth of the tiger. Once XML is 
>sold, and bought, as a general-purpose structural description language, 
>and is used as such by professionals who are familiar with the conventions 
>of such languages, the XML scoping conventions which are inherited from 
>its role as a markup language are no longer appropriate: in fact, they are 
>*ludicrous*: they are like a children's toy in an engineering shop.

If these conventions are inappropriate for such use, they can
easily be fixed. I don't care whether you call xml:lang='' a children's
toy or whatever else, but it is an easy fix.
I don't think it's appropriate that the 'professionals' you mention
act like bullies and think that only what they do is right, and that
they need to make life difficult for those who have created and
nurtured their tools.


>Expecting professional programmers to conform to descriptive conventions 
>defined by text-markup languages is whistling at the wind.  Programmers 
>have been using more sophisticated scoping conventions for over half a 
>century; not because they didn't know better, but because they *needed* 
>to.  You can't display recursion using indexical markup, for a start.
>
>The XML publicists have bitten off more than they know how to chew. If the 
>result is XML that disobeys the XML 'conventions' and is unreadable by 
>non-programmers, should anyone be surprised?

It's not the XML publicists that have bitten off more than they know how
to chew. The publicists have created something they were able to work with
well. Then others came and used the same thing because they felt it was
useful. The publicists didn't sell this stuff to the others, it just
spread all by itself. Putting the blame on the publicists is totally the
wrong way round. If the 'programmers' don't know the limitations of their
tools, or can't choose appropriate tools, that still does not give them
(or you) any right to suddenly take XML away from those who have created
it and are still using it, in particular if there are quite easy solutions
to let both sides live together, and hopefully over time learn to work
together and take advantage of the interaction and overlap.


Regards,     Martin.

Received on Sunday, 27 July 2003 21:59:49 UTC