
Re: XML observation

From: Martin Duerst <duerst@w3.org>
Date: Fri, 04 Jul 2003 10:17:32 -0400
Message-Id: <4.2.0.58.J.20030704100608.037fa678@localhost>
To: pat hayes <phayes@ihmc.us>, w3c-rdfcore-wg@w3.org

I would like to answer in more detail, and hope to do so next week,
but here are just a few points.

- Yes, the usage of XML for both textual markup and data is confusing
   to many, even many who are working on the specs.
- Please note that for parseType="Literal", we are actually mostly
   dealing with textual markup, not with data, and it is not only not
   adequate, but plain backwards to raise the 'data issue' to suppress
   the conventions that XML has for textual markup.
- I myself see the fact that XML can deal with both text and data
   as a great potential, but a potential that has to be worked on
   quite a bit to be achieved.
- Whereas the XML conventions for real datatypes in many ways can be
   taken as just a notational convention for abstract concepts such as
   'integer' that RDF treats as abstract concepts, in the case of
   XML literals, we are dealing with marked-up text, and so there the
   abstraction we are dealing with is XML, not just the notation.
   (If RDF wanted to create its own abstraction of marked-up
   text, that would be a different thing, but currently it doesn't.)
- XML Schema is confusing in many ways, because it was created by a
   group of people that contained both people with a great deal of
   expertise in textual markup, and people with excellent experience
   in traditional datatypes, but did not spend too much effort to
   combine these things. There is a top-level split into simple and
   complex types, and then two separate worlds. The question 'of which
   simple type is the text content in complex types' was never asked.
   Also, it was not realized that 'string' is very special, because it
   is the simple type for 'everything for which we don't have a type yet'
   and also the type of the actual representation of all the other
   datatypes.
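
The contrast between a notation for an abstract value and marked-up text
can be illustrated with a minimal Python sketch. The helper names
`integer_value` and `xml_literal_value` are hypothetical, and treating the
value of an XML literal as its unmodified markup string is a simplifying
assumption for illustration, not the RDF or XML Schema definition.

```python
def integer_value(lexical: str) -> int:
    """Map a lexical form (a string) to the abstract integer it denotes."""
    return int(lexical.strip())


def xml_literal_value(lexical: str) -> str:
    """Simplifying assumption: for marked-up text, the markup is the value."""
    return lexical


# Different notations, same abstract integer: the string is only a notation.
assert integer_value("042") == integer_value("42")

# Different markup around the same characters: the markup is not disposable.
assert xml_literal_value("<em>hi</em>") != xml_literal_value("<b>hi</b>")
```

For the integer, the string can be discarded once the value is known; for
the marked-up text, there is nothing else to fall back on.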


Regards,    Martin.


At 19:01 03/07/03 -0500, pat hayes wrote:
>Thinking about the issue we have been discussing, it occurs to me that XML 
>has been holding a tiger by the tail and is now getting bitten, and this 
>debate is a symptom of that.
>
>XML started life as a generalized text-markup system, and for that purpose 
>it is wonderful. But it has been touted and used as something much more 
>than just text markup: it has been announced as a kind of universal 
>solvent for transmitting any kind of structure, a universal 
>general-purpose structure-description system. Unfortunately, several of 
>its features (most notably the restriction of attribute values to strings, 
>cf http://www.waterlang.org/doc/trouble_with_xml.htm) are clearly serious 
>design faults when seen from this more general point of view.  But more to 
>the present point, the use of a *language* to describe structures requires 
>us to clearly distinguish the text of the description from the thing - the 
>structure - being described. Making a distinction like this is so 
>second-nature to programmers, mathematicians, logicians and linguists - in 
>fact anyone who uses technical languages professionally - that it takes a 
>while in dealing with XML to realize that XML conspicuously fails to make 
>it, and that in fact the entire design of XML is predicated on 
>denying it. XML documents describe structure by *displaying* it, in 
>effect: they *are* the structure they describe. And of course this is 
>entirely appropriate for a markup language: it is the very essence of 
>markup that the markup *labels* the text it is the markup 'of'.
>
>To put the same point another way, markup is inherently indexical: what it 
>means depends on where it is. If you write <title>The Way Things 
>Were</title>, what the enclosing markup says, in effect, is: 'THIS 
>enclosed text is a title'.  The same piece of markup surrounding some 
>other piece of text will implicitly refer to that other piece: its meaning 
>- what it is talking *about* - depends on where in the text the markup 
>occurs. Its location in the text is part of its meaning; and when it is 
>used with no text to mark up, simply as a structural description language, 
>this indexicality is retained in the *descriptive* conventions of the 
>resulting language: so XML as a structural description convention has a 
>built-in confusion between describing structure and displaying or 
>exhibiting it, a built-in ambiguity between being a description and being 
>a kind of diagram or map, a built-in tendency to confuse use and mention.
>
>This is clearly seen in the discussion we have been having. Martin (view 
>X) sees a piece of RDF/XML as being a kind of XML text, and the resulting 
>document as *displaying* the RDF structure in the XML. He expects that 
>RDF/XML will satisfy the textual scoping mechanisms which arise naturally 
>in any kind of layout display: in particular, attributes should apply to 
>all of the items which are in the *textual* scope of the XML 
>element.  That is the XML 'structure as textual display' assumption, of 
>course.  Patrick (view G) sees a structural description language rendered 
>(in a fairly ad-hoc way) into XML syntax; the actual XML document is of 
>relatively little importance: on this view, it is the structure described 
>by the document that defines the significant, meaningful notions of scope 
>and context.  And the RDF/XML conventions clearly isolate the XML 'inside' 
>a parseType-attributed element from the XML surrounding the element, so it 
>is 'obvious' that the lang tags that may be relevant to the outer context 
>do not apply to the inner one.
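
The two readings described in the paragraph above can be sketched in
Python. This is only an illustration under stated assumptions: the sample
document, the example.org namespace, and the function `effective_lang`
with its `isolate_literals` flag are hypothetical, and the sketch takes no
position on which reading the RDF specifications prescribe.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_PARSETYPE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}parseType"

# Hypothetical sample: xml:lang is set on the outermost element only.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/" xml:lang="en">
  <rdf:Description rdf:about="http://example.org/book">
    <ex:title rdf:parseType="Literal"><em>The Way Things Were</em></ex:title>
  </rdf:Description>
</rdf:RDF>"""


def effective_lang(root, target_tag, isolate_literals):
    """Return the xml:lang in effect at the first element named target_tag.

    isolate_literals=False: plain XML textual scoping (view X).
    isolate_literals=True: content of a parseType="Literal" element does
    not inherit xml:lang from the surrounding document (view G).
    """
    def walk(elem, inherited):
        lang = elem.get(XML_LANG, inherited)
        if elem.tag == target_tag:
            return True, lang
        passed_down = lang
        if isolate_literals and elem.get(RDF_PARSETYPE) == "Literal":
            passed_down = None  # cut the inner XML off from the outer scope
        for child in elem:
            found, result = walk(child, passed_down)
            if found:
                return True, result
        return False, None

    return walk(root, None)[1]


root = ET.fromstring(DOC)
print(effective_lang(root, "em", isolate_literals=False))  # en
print(effective_lang(root, "em", isolate_literals=True))   # None
```

The same document yields a different language for the inner `em` element
depending only on whether scope is taken from the text or from the
described structure.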
>
>In my earlier metaphor, Patrick here is the teeth of the tiger. Once XML is 
>sold, and bought, as a general-purpose structural description language, 
>and is used as such by professionals who are familiar with the conventions 
>of such languages, the XML scoping conventions which are inherited from 
>its role as a markup language are no longer appropriate: in fact, they are 
>*ludicrous*: they are like a children's toy in an engineering 
>shop.  Expecting professional programmers to conform to descriptive 
>conventions defined by text-markup languages is whistling at the 
>wind.  Programmers have been using more sophisticated scoping conventions 
>for over half a century; not because they didn't know better, but because 
>they *needed* to.  You can't display recursion using indexical markup, for 
>a start.
>
>The XML publicists have bitten off more than they know how to chew. If the 
>result is XML that disobeys the XML 'conventions' and is unreadable by 
>non-programmers, should anyone be surprised?
>
>Pat Hayes
>--
>---------------------------------------------------------------------
>IHMC    (850)434 8903 or (650)494 3973   home
>40 South Alcaniz St.    (850)202 4416   office
>Pensacola                       (850)202 4440   fax
>FL 32501                        (850)291 0667    cell
>phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 4 July 2003 10:17:41 EDT
