- From: Jeni Tennison <jeni@jenitennison.com>
- Date: Tue, 28 Feb 2012 19:27:05 +0000
- To: "public-xml-er@w3.org Community Group" <public-xml-er@w3.org>
- Cc: David Lee <David.Lee@marklogic.com>, Derek Read <derek.read@justsystems.com>, David Carlisle <davidc@nag.co.uk>
Yes, we discussed this briefly in the lunch queue at XML Prague.

I think that specifying schema-less error recovery that simply generates a tree should be the first step. It might be that it's then possible to specify a further schema-aware stage that performs better error recovery for nodes that are (somehow) marked with parse errors. Or it might be that it needs a whole other specification.

Jeni

On 28 Feb 2012, at 19:02, Derek Read wrote:

> Strongly agree: "I suspect that a XML version of fixup cannot do nearly
> as well as HTML5 without a schema."
>
> I think if we agree on that then the spec will basically fork at this
> juncture:
>
> 1) When a schema is available the following assumptions and logic can be
> followed...
> 2) When a document is well-formed (no schema available) the following
> /different/ logic applies...
>
> Derek Read
> Program Manager, XMetaL
>
>
> -----Original Message-----
> From: David Lee [mailto:David.Lee@marklogic.com]
> Sent: Tuesday, February 28, 2012 10:56 AM
> To: Jeni Tennison; David Carlisle
> Cc: public-xml-er@w3.org Community Group
> Subject: RE: David's less simple example
>
>>
>> I am told that, similarly, MarkLogic (and I assume other ingesters) perform
>> fixup (in their case based on the DTD/schema for the XML). I know that John
>> Cowan has similarly worked on similar algorithms in the past.
>>
>
> I'd like to comment on the above assumption about MarkLogic but probably
> shouldn't ...
>
> But ...
> I suggest that a primary reason that HTML5 and Tidy etc. can do as good
> a job as they do is precisely because they have the equivalent of a
> schema. So they 'know' that say <br> should be <br/> and other such
> niceties. I suspect that a XML version of fixup cannot do nearly as
> well as HTML5 without a schema.
>
> -----------------------------------------------------------------------------
> David Lee
> Lead Engineer
> MarkLogic Corporation
> dlee@marklogic.com
> Phone: +1 650-287-2531
> Cell: +1 812-630-7622
> www.marklogic.com

--
Jeni Tennison
http://www.jenitennison.com
Received on Tuesday, 28 February 2012 19:27:30 UTC