Re: What are Semantics? (Was: Serving generic XML)

From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Date: Mon, 19 Aug 2002 18:04:43 -0400
Message-Id: <p04330110b9871b47cbf1@[]>
To: www-tag@w3.org

At 2:27 PM -0700 8/19/02, Paul Prescod wrote:

>This is an AI-hard problem. Sometimes the heading is the largest text,
>but not always. Sometimes it is at the top, but not always. Sometimes
>the heading starts at the left, but not always. Sometimes the text of
>the document wraps *around* the heading.

I didn't say it was easy, but I do think it's solvable.

>If you think it can be done, do it. You'll make millions on a "Word to
>structured text converter." You'll also prove that the whole "separation
>of presentation from structure" movement was wrong-headed. Why separate
>them explicitly if the computer can do it after the fact? We can ship
>around PDFs and Word documents as "structured text".

I think it can be done. What the market for it is, I don't know. I
don't think I personally have the skills or resources to do this
right now, but I've seen things done that are close enough to this
problem that I strongly suspect it's possible.

| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
|          XML in a  Nutshell, 2nd Edition (O'Reilly, 2002)          |
|              http://www.cafeconleche.org/books/xian2/              |
|  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
Received on Monday, 19 August 2002 18:20:52 UTC
