Re: What are Semantics? (Was: Serving generic XML) from Elliotte Rusty Harold on 2002-08-19 (www-tag@w3.org from August 2002)

From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Date: Mon, 19 Aug 2002 16:47:39 -0400
To: www-tag@w3.org, www-style@w3.org
Message-Id: <p04330108b9870611d03e@[192.168.254.4]>
At 1:29 PM -0700 8/19/02, Kynn Bartlett wrote:

>Well, Elliotte seems to be quite new to the idea of XML for some
>reason, which makes you wonder just how big his nutshells are.  I read
>his post, and for some reason he is ranking _all_ XML as "MOST", which
>only holds true if the UA has knowledge of the XML document's meaning.
>Without that, you're talking about arbitrary XML WITH NO UA KNOWLEDGE
>OF THE SPECIFIC MARKUP LANGUAGE -- and of course, that's what we have
>been talking about all along.

Right there is where we disagree. I believe semantics is in the 
document even WITH NO UA KNOWLEDGE OF THE SPECIFIC MARKUP LANGUAGE. 
In fact, I'll go further: there's meaning in the document even if 
there's no user agent. You do not have to have prior agreement or 
understanding in order to usefully process documents.  It may help to 
have such agreement, but it is not a sine qua non.

>If I write something in an arbitrary XML language, why yes, I can
>have intimate knowledge of what it means.  <singer>Madonna</singer>
>is indeed very sensible _TO ME_, the author.  I decide that it is
>very semantically rich.
>
>However, once I send it out to someone else, unless you have the
>Rosetta stone to interpret what it means, it's just markup around
>text.  It is no longer semantically rich, unless I make the
>fundamental XML error of assuming that I can infer appropriate
>meanings from the element names.  Which isn't how XML works, and
>anyone telling you that it's the case really needs to take a step
>back and figure this whole thing out.

No, that is precisely how XML works. You can usefully infer something 
from those names. You cannot infer everything, but that is not at all 
the same thing as inferring nothing. Names have meanings. Those 
meanings depend on context, and may be more or less difficult to 
extract depending on local circumstances, but a normal XML document 
in which the names are all randomly shuffled (such that equal names 
remain equal) is not the same document, and it no longer carries the 
same meaning. Software and people infer meaning from names all the 
time. This process is not error free, but the smarter the software 
and people are, the better the job they do.
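
As a rough illustration of the shuffling thought experiment above, here is a minimal Python sketch. The function name shuffle_names, the seed parameter, and the sample document (including the "title" element and its text) are hypothetical and not from the original discussion. It permutes the distinct element names so that equal names stay equal; the tree structure and character data survive unchanged, but the names no longer hint at what the content is.

```python
import random
import xml.etree.ElementTree as ET


def shuffle_names(xml_text, seed=None):
    """Permute the distinct element names of a document.

    Equal names remain equal after the shuffle, so the tree shape and
    the text are untouched; only the hints carried by the names change.
    Returns the rewritten XML and the old-name -> new-name mapping.
    """
    root = ET.fromstring(xml_text)
    names = sorted({elem.tag for elem in root.iter()})
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)
    mapping = dict(zip(names, shuffled))
    for elem in root.iter():
        elem.tag = mapping[elem.tag]
    return ET.tostring(root, encoding="unicode"), mapping


# Hypothetical sample document, loosely based on the <singer> example.
sample = (
    "<album>"
    "<singer>Madonna</singer>"
    "<title>Music</title>"
    "<singer>Madonna</singer>"
    "</album>"
)

if __name__ == "__main__":
    shuffled_text, mapping = shuffle_names(sample, seed=1)
    print(mapping)
    print(shuffled_text)
```

A parser sees the same structure before and after the shuffle. A reader, human or software, who leans on the names would now be misled, for example by "Madonna" sitting inside an element called "title".
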

It is not a binary, either-or question. There is a continuum of 
meaning and semantics. You seem to believe that anything less than 
precisely defined and agreed-upon-in-advance semantics leads to no 
meaning at all. I, on the other hand, see a fuzzy world where 
semantics depend on context and where meanings shift from one person 
to the next, and yet meaning still exists.

>And it is certainly not a solution to tell people with disabilities
>that they should read source code in order to discern the meaning
>of Web content.
>
>LESS-THAN SINGER GREATER-THAN MADONNA LESS-THAN SLASH SINGER
>GREATER-THAN -- I mean, what the heck?

No, they should get the same style sheet everybody else gets. And 
their software should be smart enough to adapt it for their 
abilities. Of course, eventually they may encounter documents without 
style sheets, just as I and most XML developers do every day. Then 
they too will need to read the source code to figure out what it 
likely means. If they are unable to do so, they may need to 
communicate further with the sender, ask others for help, do some 
research on Google, or something else. But none of this has anything 
to do with their vision or hearing or other senses. It's the exact 
same thing I and other XML developers have to do. It is the way 
information works in the messy, fuzzy world we live in.

>You could probably construct something vaguely like IE's structured
>view of unstyled XML -- the open and close plus/minus thing -- but
>even then you are not conveying meaning, just structure.  The user
>is able to "guess" at the structure if they're lucky, but I can't
>see anyone seriously proposing that the Web needs to consist of
>randomly named nested trees?
>

There you go again. Nobody is suggesting *RANDOMLY* named nested 
trees. In practice, tree content is not randomly named. It is 
named with more or less care, and with correspondingly more or less 
semantic content. But it is certainly not random.
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML in a  Nutshell, 2nd Edition (O'Reilly, 2002)          |
|              http://www.cafeconleche.org/books/xian2/              |
|  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+
Received on Monday, 19 August 2002 17:07:07 UTC