RE: Draft - Fixup or Full XML Parser

From: David Lee <David.Lee@marklogic.com>
Date: Tue, 21 Feb 2012 07:34:09 -0800
To: Anne van Kesteren <annevk@opera.com>, "public-xml-er@w3.org" <public-xml-er@w3.org>
Message-ID: <EB42045A1F00224E93B82E949EC6675E16ADC5F688@EXCHG-BE.marklogic.com>

What I'm getting at is a matter of specification simplicity and separation of concerns.
Certainly one way to implement fixup is by retrofitting or integrating an existing XML parser. But do we need to put that in the specification?
And certainly fixup would require a kind of "parsing", but it need not be a full XML parser. It might be, or it might not.
Think of the "Tidy" program. The input is gobbledygook and the output is valid HTML. The Tidy program may be implemented as a kind of HTML parser, but maybe not.
Maybe it's a text-based parser that never instantiates a true DOM. As long as it follows the rules, it's valid. Is it necessary, or even desirable, for the specification to say it has to be a "Parser", rather than giving the set of rules that must be followed to take the input data model (which I believe has to be text if it allows non-well-formed input) and produce an output data model which is well-formed XML (again, without requiring a particular physical representation like a tree)?
What I see is a slippery slope that mixes implementation with specification. I think it would be very helpful to not *require* that an implementation of XML ER be a fully compliant XML parser, but rather to define only the part that produces well-formed XML and then defer (spec-wise) to an XML parser.
Again, an implementation could combine both, and probably would, but by staying out of the XML parsing world we don't have to drag all the XML parsing specs into this one.
And it would leave implementers more freedom, for example to produce a standalone XML ER program or filter which does not claim to be an "XML Parser". 
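
To make the separation concrete, here is a rough sketch in Python of what I mean by a standalone text-to-text filter. The rules in it (close unclosed elements, drop stray end tags, escape bare "&" and "<") are made up purely for illustration and are not the rules in the draft. The names fixup and escape_text are hypothetical. It ignores comments, CDATA, processing instructions and malformed attributes, and it assumes the input has a single root element. The point is only that the fixup step never builds a tree, and that parsing is left to an ordinary XML parser afterwards.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative rule set only; NOT the XML-ER draft's rules.
# Matches a start tag, end tag, or empty-element tag.
TOKEN = re.compile(r"<(/?)([A-Za-z_:][\w:.-]*)([^<>]*?)(/?)>")
# An "&" that does not begin a predefined entity or a numeric character reference.
BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_text(text: str) -> str:
    """Escape bare '&' and any '<' that was not recognised as a tag."""
    return BARE_AMP.sub("&amp;", text).replace("<", "&lt;")


def fixup(text: str) -> str:
    """Text in, XML text out. Only a stack of open element names is kept."""
    out = []
    open_names = []
    pos = 0
    for m in TOKEN.finditer(text):
        out.append(escape_text(text[pos:m.start()]))
        pos = m.end()
        closing, name, _attrs, self_closing = m.groups()
        if closing:
            if name in open_names:
                # Close everything opened since the matching start tag.
                while open_names:
                    top = open_names.pop()
                    out.append(f"</{top}>")
                    if top == name:
                        break
            # A stray end tag with no matching start tag is dropped.
        else:
            out.append(m.group(0))  # attributes are passed through unrepaired
            if not self_closing:
                open_names.append(name)
    out.append(escape_text(text[pos:]))
    # Close whatever is still open at end of input.
    while open_names:
        out.append(f"</{open_names.pop()}>")
    return "".join(out)


if __name__ == "__main__":
    repaired = fixup("<a><b>x & y</a>")
    print(repaired)  # <a><b>x &amp; y</b></a>
    # Step two is a separate concern: any conforming XML parser will do.
    root = ET.fromstring(repaired)
    print(root.tag, root[0].text)
```

Nothing in fixup claims to be an XML parser. It could be shipped as a command-line filter, and the spec for it would only need to state the rewriting rules and say that the output is handed to an XML parser.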

David Lee
Lead Engineer
MarkLogic Corporation
Phone: +1 650-287-2531
Cell:  +1 812-630-7622

> -----Original Message-----
> From: Anne van Kesteren [mailto:annevk@opera.com]
> Sent: Tuesday, February 21, 2012 10:19 AM
> To: public-xml-er@w3.org; David Lee
> Subject: Re: Draft - Fixup or Full XML Parser
> On Tue, 21 Feb 2012 14:16:48 +0100, David Lee <David.Lee@marklogic.com>
> wrote:
> > My personal opinion is that the XML ER should be specced as the fixup
> > parser only and not presume that it is a full XML parser.  I think this
> > will save us a lot of work, and provide more value.
> > Comments? Objections? Am I out in left field?
> How would you envision this "fixup" to work? What you describe sounds like
> 1. determine whether it needs fixup; 2. fixup; 3. parse. The alternate
> approach is just parse, which seems somewhat more straightforward.
> --
> Anne van Kesteren
> http://annevankesteren.nl/

Received on Wednesday, 22 February 2012 12:56:31 UTC
