Re: Error handling in XML from Digitome Ltd. on 1997-04-19 (w3c-sgml-wg@w3.org from April 1997)

From: Digitome Ltd. <digitome@iol.ie>
Date: Sat, 19 Apr 1997 15:06:55 +0100
To: w3c-sgml-wg@w3.org
Message-Id: <199704191430.PAA23581@mail.iol.ie>

[Peter Murray-Rust]
>
>But people are already starting to think of robotic applications of XML.  Here
>we can see distributed servers with different components.  If a user agent
>(= robot) gets a document (=data to be processed) and this is not WF, then
>it cannot reliably do anything further with that information. 

Does that not depend on the robot? I mean, if I want a robot to trawl the
net for
documents older than 1 year with < 1000 bytes containing the 
phrase "love poem" the robot does not need to concern itself with WF.

> Guessing
>what the author really meant is a recipe for disaster - we've all seen that
with
>HTML.
I  dunno. People seem to want software that guesses. See my previous post re
graphing spreadsheet data!

> Remember that we are already designing into XML the ability to carry
>large collections of links which have to be precise.  These will (I assume) be
>routinely processed by machines, not humans. 

Machines that do this sort of stuff should reject non WF docs. This is fine.
The problem
I see is if this is mandated in the XML spec. I cannot see how this can be
enforced
anyway. No-one can stop my Perl scripts from forging on despite WF errors!

I can see a lot of sense in mandating that assumptions made by user agents
 should be available to users however.

>I can see the value of trying to recover document content *for human 
>consumption*, but not for machine consumption.
Again I think it depends on the task the machine is performing.

>
>[I hope it didn't come across that I have anything other than total admiration
>for James Clark's software, as I have stated before.  Indeed I couldn't be 
>using SGML if it didn't exist.]

Not at all. Like you I probably would not have got my head around SGML if it
were not
for James' tools. I was simply making the point that the likes of

    nsgmls foo.sgm | grep -c "^(BAR$"

can be a useful thing to do even if foo.sgm markup contains errors.

Sean

Received on Saturday, 19 April 1997 10:30:26 UTC