Re: error recovery

From: David Carlisle <davidc@nag.co.uk>
Date: Sun, 19 Feb 2012 19:01:52 +0000
Message-ID: <4F414720.2070808@nag.co.uk>
CC: W3C XML-ER Community Group <public-xml-er@w3.org>
On 19/02/2012 18:13, Liam R E Quin wrote:
> It's this requirement I want
> to see taken account of, and not forgotten, of course.  I'd like the Web
> browser to be able to tell the user, "there was an "a" element on line
> 96 with no end tag, and an end-tag for a "b" element on line 4015 with
> no start tag" or something like that.


I suspect that requirement may be too hard.

Anne mentioned that his current draft, by which I assume he means

http://code.google.com/p/xml5/source/browse/trunk/specification/Overview.html

flags some things as (non-fatal) parse errors. It does, and that seems
reasonable, but the places it flags as errors are not necessarily the
places where an XML 1.x parser would flag an error.

For example, unquoted attributes such as

<foo a=1>

are, as far as I can see, parsed without error (attribute value
unquoted state), and this seems reasonable to me. It's a useful feature
of XML parsing that applications can process the data without knowing
whether " or ' is used for attributes. If XML-ER allows unquoted
attributes as well, then for some uses at least it would presumably be
just as useful for the application not to know which syntactic form was
used.
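
As a small illustration of that point, here is a sketch using Python's
standard xml.etree.ElementTree (my choice for the example, not something
the draft refers to). The two quoting styles give the application the same
data, while a strict XML 1.x parser rejects the unquoted form outright:

```python
import xml.etree.ElementTree as ET

# Two XML 1.0 documents that differ only in the attribute quote character.
double_quoted = ET.fromstring('<foo a="1"/>')
single_quoted = ET.fromstring("<foo a='1'/>")

# The application sees the same element name and attribute mapping;
# nothing in the tree records which quote character was used.
same_tree = (double_quoted.tag, double_quoted.attrib) == (
    single_quoted.tag,
    single_quoted.attrib,
)

# A strict XML 1.x parser treats the unquoted form as a fatal error.
try:
    ET.fromstring("<foo a=1/>")
    unquoted_rejected = False
except ET.ParseError:
    unquoted_rejected = True

print(same_tree, unquoted_rejected)
```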

If you mandate that the unquoted attribute be reported as such, then
either we need to spec an API that can report such details alongside the
returned tree, or the marking would need to be in-line in the returned
tree. The latter would, I think, be unfortunate, as it would mean that
the trees from <foo a=1>, <foo a="1"> and <foo a='1'> would not be the same.
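
To make the first option concrete, here is a rough sketch of what reporting
"alongside the tree" could look like. Everything in it is hypothetical:
parse_start_tag, ParseResult and the wording of the note are invented for
illustration, and the regular expressions handle only a single simple start
tag, not the tokenizer states in the draft.

```python
import re
from typing import NamedTuple


class ParseResult(NamedTuple):
    name: str
    attributes: dict
    notes: list  # side-channel diagnostics, kept out of the tree data


# Hypothetical, deliberately tiny patterns for one start tag.
_TAG = re.compile(r"<\s*([A-Za-z_][\w.-]*)([^>]*?)/?>")
_ATTR = re.compile(
    r"""([A-Za-z_][\w.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>/]+))"""
)


def parse_start_tag(text: str) -> ParseResult:
    match = _TAG.fullmatch(text.strip())
    if match is None:
        raise ValueError("not a start tag: %r" % text)
    name, rest = match.group(1), match.group(2)
    attributes, notes = {}, []
    for attr in _ATTR.finditer(rest):
        key, double_q, single_q, bare = attr.groups()
        if bare is not None:
            # Recoverable problem: record it, but keep the value as normal.
            notes.append("unquoted value for attribute %r" % key)
            attributes[key] = bare
        else:
            attributes[key] = double_q if double_q is not None else single_q
    return ParseResult(name, attributes, notes)


results = [parse_start_tag(s) for s in ("<foo a=1>", '<foo a="1">', "<foo a='1'>")]
```

All three results carry the same name and attributes; only the notes list
of the first one differs, so an application that ignores the notes cannot
tell the three inputs apart.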

One can already define a SAX parser for (say) a CSV text file and
process the result as if it were XML in most XML pipelines. The XML-ER
parser could be seen in a similar way: it just parses non-XML in a way
that exposes an XML-compatible tree. One doesn't have to say how the
input is not XML, just as it isn't useful to say how the CSV file is
not XML.
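
For anyone who hasn't seen the CSV trick, a minimal sketch in Python follows.
The function name csv_to_sax and the element names table, row and cell are
made up for the example; the SAX classes are the ones in the standard
library. The downstream handler only ever sees ordinary SAX events and has
no idea the input was never XML:

```python
import csv
import io
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def csv_to_sax(csv_text: str, handler: ContentHandler) -> None:
    """Fire SAX events for CSV input; element names are illustrative only."""
    no_attrs = AttributesImpl({})
    handler.startDocument()
    handler.startElement("table", no_attrs)
    for row in csv.reader(io.StringIO(csv_text)):
        handler.startElement("row", no_attrs)
        for cell in row:
            handler.startElement("cell", no_attrs)
            handler.characters(cell)
            handler.endElement("cell")
        handler.endElement("row")
    handler.endElement("table")
    handler.endDocument()


# Any SAX consumer will do; XMLGenerator simply serialises the events.
out = io.StringIO()
csv_to_sax("a,b\n1,2\n", XMLGenerator(out, encoding="utf-8"))
xml_text = out.getvalue()
print(xml_text)
```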

David