Re: XHTML2: Proposal for total separation of semantics from structure from Sjoerd Visscher on 2005-08-25 (www-html@w3.org from August 2005)

From: Sjoerd Visscher <sjoerd@w3future.com>
Date: Fri, 26 Aug 2005 00:22:05 +0200
To: www-html@w3.org
Message-ID: <430E448D.5050105@w3future.com>

Junk Account wrote:
> As far as "structure" goes, this means simply an enumeration of the
> parts, in such a way as to form a hierarchy. It does not say anything
> about what each of those parts might actually mean.
> It does not say if it is a paragraph, a piece of code, a lyric, the
> abstract of a scientific paper, a movie, a picture, or whatever else.
> It is simply a hierarchy. A tree.
> "What each part means" is the semantics.

I've always considered titles, paragraphs, tables and lists to be 
structure. They give the content no meaning. Structure *is* semantics, 
but only up to a certain level. Structure is all the meaning which is 
useful without actually knowing what the text means. For example, I can create 
a ToC of a properly structured Japanese document, without knowing any 
Japanese.
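
To make the ToC point concrete, here is a minimal sketch in Python. It assumes the document marks its titles with h1..h6 elements (XHTML 2.0's section/h nesting would need a slightly different walk); the class name, function name and sample markup are made up for illustration. The code never looks at what the heading text says, only at where it sits in the structure.

```python
from html.parser import HTMLParser


class TocBuilder(HTMLParser):
    """Collect (level, text) pairs for h1..h6 without interpreting the text."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.toc = []
        self._level = None   # heading level currently open, if any
        self._buf = []       # text pieces seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._buf = []

    def handle_data(self, data):
        if self._level is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._level is not None:
            self.toc.append((self._level, "".join(self._buf).strip()))
            self._level = None


def build_toc(markup):
    """Return the list of (level, title) pairs found in the markup."""
    parser = TocBuilder()
    parser.feed(markup)
    parser.close()
    return parser.toc


# Hypothetical sample document; the script does not need to understand it.
sample = (
    "<h1>日本の歴史</h1><p>本文</p>"
    "<h2>江戸時代</h2><p>本文</p>"
    "<h2>明治時代</h2><p>本文</p>"
)

for level, title in build_toc(sample):
    print("  " * (level - 1) + title)
```

The indentation in the printed outline comes purely from the heading level, which is the sense in which structure is meaning you can use without reading the text.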

> The per-capita income of Kenya for 1988 is probably there, somewhere in the net.
> Can you get it in just one search?

Actually, almost. I first tried "The per-capita income of Kenya for 
1988", but that got me only documents with recent information, so I 
added "history". The third page of that query tells me that it is 370 
current US Dollars.

> Even if you got more or less decent results....could you (or even a
> search engine, programmatically) extract just that microcontent? Or
> would I need to load and read whole pages looking for the relevant two
> lines in each?

Apparently Google can, again almost. If you search for "The per-capita 
income of Kenya", Google responds with "Kenya — GDP - Per Capita: $ 1,100".

> If microcontent is not programmatically extractable, furthermore, then
> I'd have to do some screen scraping to be able to reuse that content.

Which is probably what Google does. They are doing something special 
with the CIA World Factbook.

> Shouldn't we be providing ways to hook extensible meaning at all
> levels (including the elemental one), in order to facilitate such a
> thing?
> 
> Are we providing extensible means to mark up microcontent semantically?
>
> To add semantic functionality to the web probably requires whole
> languages, with concepts such as "is-part-of", inheritance, and even more
> complex relationships. And being such a vast thing, likely requires
> extensibility inherently.
> If you ask me, I'd separate. In advance.

I think XHTML 2.0 already does a very good job, with the Metainformation 
and Role Modules. It just needs to take the extra step, and remove the 
old semantics-only elements (address, code, quote, kbd, etc.).
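
As a rough sketch of what that could look like for the microcontent question: assuming generic elements carry a role attribute holding a space-separated list of values, a consumer can pull out the pieces it cares about without any scraping heuristics. The "ex:" role names, the function name and the sample markup below are hypothetical, and the namespace declarations are left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: structure-only elements, meaning attached via role.
doc = """<html>
  <body>
    <p>GDP per capita: <span role="ex:gdpPerCapita">1100</span></p>
    <p>Contact: <span role="ex:address ex:contact">Nairobi</span></p>
  </body>
</html>"""


def find_by_role(xml_text, role):
    """Return the text of every element whose role attribute lists `role`."""
    root = ET.fromstring(xml_text)
    matches = []
    for element in root.iter():
        roles = element.get("role", "").split()
        if role in roles:
            matches.append("".join(element.itertext()).strip())
    return matches


print(find_by_role(doc, "ex:gdpPerCapita"))
print(find_by_role(doc, "ex:address"))
```

The element names stay purely structural here; everything the consumer needs to know about meaning travels in the attribute, so new meanings can be added without new elements.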

-- 
Sjoerd Visscher
http://w3future.com/weblog/

Received on Thursday, 25 August 2005 22:22:22 UTC