- From: Derek Read <derek.read@justsystems.com>
- Date: Tue, 28 Feb 2012 11:02:11 -0800
- To: "David Lee" <David.Lee@marklogic.com>, "Jeni Tennison" <jeni@jenitennison.com>, "David Carlisle" <davidc@nag.co.uk>
- Cc: <public-xml-er@w3.org>
Strongly agree: "I suspect that a XML version of fixup cannot do nearly as well as HTML5 without a schema."

I think if we agree on that then the spec will basically fork at this juncture:

1) When a schema is available the following assumptions and logic can be followed...
2) When a document is well-formed (no schema available) the following /different/ logic applies...

Derek Read
Program Manager, XMetaL

-----Original Message-----
From: David Lee [mailto:David.Lee@marklogic.com]
Sent: Tuesday, February 28, 2012 10:56 AM
To: Jeni Tennison; David Carlisle
Cc: public-xml-er@w3.org Community Group
Subject: RE: David's less simple example

>
> I am told that, similarly, MarkLogic (and I assume other ingesters) perform
> fixup (in their case based on the DTD/schema for the XML). I know that John
> Cowan has similarly worked on similar algorithms in the past.
>

I'd like to comment on the above assumption about MarkLogic but probably shouldn't ...

But ... I suggest that a primary reason that HTML5 and Tidy etc. can do as good a job as they do is precisely because they have the equivalent of a schema. So they 'know' that say <br> should be <br/> and other such niceties.

I suspect that a XML version of fixup cannot do nearly as well as HTML5 without a schema.

-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
dlee@marklogic.com
Phone: +1 650-287-2531
Cell: +1 812-630-7622
www.marklogic.com

This e-mail and any accompanying attachments are confidential. The information is intended solely for the use of the individual to whom it is addressed. Any review, disclosure, copying, distribution, or use of this e-mail communication by others is strictly prohibited. If you are not the intended recipient, please notify us immediately by returning this message to the sender and delete all copies. Thank you for your cooperation.
Received on Tuesday, 28 February 2012 19:03:00 UTC