Re: Error handling in XML from Peter Murray-Rust on 1997-04-20 (w3c-sgml-wg@w3.org from April 1997)

From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
Date: Sun, 20 Apr 1997 15:04:58 GMT
To: w3c-sgml-wg@w3.org
Message-Id: <5848@ursus.demon.co.uk>
In message <199704201322.OAA17452@curia.ucc.ie> Peter Flynn writes:
[...]
> 
> Yes, or alternatively we could try to persuade editor authors to get
> XML file creation done up so well that invalid instances are a rarity.
> 
> It should not be impossible to write an editor which would let you
> create elements on the fly and guarantee a well-formed file, complete
> with links to the stylesheet (which acts in effect as a DTD for the editor).
> Think of an AF editor.

Agreed.  This was one of the major problems of HTML - no editor.  Remember of
course that an editor is very restricted unless it can read files written by 
another editor.  That was impossible (until recently) in HTML.  There is no
reason which it shouldn't be possible at the start for XML.  

The language isn't yet finalised - we all agree that.  There is no reason 
why we shouldn't have a simple set of free tools for XML on final launch day.
(Something a bit like the Amaya browser editor for HTML).  Then people have
a zero-cost entry touchstone.  If it barfs on a file, it's saying something.
There will be enough people who listen to make it worthwhile.  Those people
who CARE about information will be supportive of such a tool.

[...]
> We run the same risk with XML. But I think that's inevitable.
> I'll stick my neck out right now and say I suspect that XML
> files will never generally be either well-formed or valid. The vast
> majority of them will contain the same stew as HTML: only a few people
> will be creating valid or WF files. But that shouldn't stop us trying.
> 
> Users want to see real, concrete benefits _immediately_, if they
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> are to turn aside from the Avernus of HTML. 

facilis descensus  ... (actually it's not too bad down there)

Real concrete benefits _immediately_.  This is the key.  The real concrete 
benefits are proper management of information.  The problem is, as we all know,
it's not trivial and it doesn't come at the flick of a switch.  So the 
tension is between doing it properly and doing it now.

In selling XML to an HTML-oriented community, you have to think of the benefits.
We all have different views of what those are, and I think it's very important
to start collating them now.  Here are some at random:
	- XML systems will interoperate (HTML don't)
	- XML allows structured data (HTML doesn't however hard you try)
	- XML allows information to be transmitted precisely (HTML does not)
	- XML allows a robust LINK system to be created (HTML doesn't)
	- automatic systems can be built around XML (but not round HTML).
The downside:
	- XML covers an infinity of approaches, HTML just one
	- ANY of the positive sides of XML above involve someone really
		understanding what's going on.  If you don't understand 
		structured information, you can't create it.  If you don't
		understand hyperlinks you can't create a hyperlinked system
		Of course people will build user-friendly tools to hide this
		from most people, but it won't happen oevrnight.
	- XML requires a modest amount of time to understand what's going on
 
What are the benefits of getting lots of non-WF XML textual documents on the 
WWW with non-WF links?  I'm not very clear, but I don't work in this area.
As I see it you will have to work hard to show people that it's an advantage 
over HTML.

What are the benefits of getting non-textual WF documents on the WWW with
WF and self-consistent link bases?  Personally I think this is enormous.
Maybe it's not glamorous, but it ought to make a major difference to search
engines and indexers (leaving aside the chemistry, interactive orgcharts,
automatic payment, authoring for technical publications, etc. etc.)


> 
> >SGML/XMLers have, quite understandably a number of gripes with html but lets
> >face facts. The thing is a runaway success. 
> 
> Absolutely. But it's not HTML which is the runaway success, it's just
> the interface.

IMO an equally important step was httpd, which gave individuals the chance to 
publish without having understand client-server technology or get permission
from their IT department.  XML can bring publishing very close to the authors,
and indeed that's one of my motivations.  It's tragic to see the process:
	- scientist collects data *digitally* from a machine
	- scientist *types* it manually into spreadsheet or whatever
	- Journal editor requests it on *paper*
	- scientist prints it out
	- publisher *rekeys* in some other format and then throws away the
		disk [sic]
	- data are redrawn as a graph (i.e. lines of *carbon on cellulose*)
	- reader *manually* measures data off imperfect copy (ruler, xv or 
		whatever)
	- reader *types* data into reader's system 

This *must* be a major market for XML (substitute medic, architectect, 
stockbroker, etc. for scientist).  It may not be the largest, but it must be
BIG :-)  And I would hope that in that large market, percision will matter
from a business point of view...

	P.

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
Received on Sunday, 20 April 1997 10:51:33 UTC