- From: Derek Read <derek.read@justsystems.com>
- Date: Tue, 28 Feb 2012 10:05:43 -0800
- To: "Innovimax W3C" <innovimax+w3c@gmail.com>, "George Cristian Bina" <george@oxygenxml.com>
- Cc: "David Carlisle" <davidc@nag.co.uk>, <public-xml-er@w3.org>
- Message-ID: <BECDDDED92C3B949A38F5BC4BF56D21F04B20571@van-mail.jena.local>
For interest's sake, given the original problem...

<math><one<two<three</one><__two></tree></math>

XMetaL will "fix" it as follows when opening the document in "well formed" mode (no DTD/XSD provided):

<math><one><two><three/></two></one><__two/></math>

It also displays the following in the "validation log" (note that each of the errors is clickable and takes you to that node, so there is context here without it actually being included as text in the error):

* Bad start tag. Expected ">".
* Bad start tag. Expected ">".
* Bad start tag. Expected ">".
* Implied missing end-tag </three>
* Implied missing end-tag </two>
* Ignoring end-tag </tree>
* Implied missing end-tag </__two>

Note that when a DTD or XSD Schema is available the results will be different because a lot of things can be implied from the schema's rules.

Derek Read
Program Manager, XMetaL

From: innovimax@gmail.com [mailto:innovimax@gmail.com] On Behalf Of Innovimax W3C
Sent: Tuesday, February 28, 2012 9:14 AM
To: George Cristian Bina
Cc: David Carlisle; public-xml-er@w3.org Community Group
Subject: Re: David's less simple example

George,

That's not exactly what I got with Oxygen 13.1. How can we double-check this?

Mohamed

On Tue, Feb 28, 2012 at 5:33 PM, George Cristian Bina <george@oxygenxml.com> wrote:

In the oXygen Outline view the fragment

<math><one<two<three</one><two></tree></math>

will be equivalent to

<math><one><two><three></three></two></one><two></two></math>

Formatted for readability that will be:

<math>
  <one>
    <two>
      <three/>
    </two>
  </one>
  <two></two>
</math>

The </tree> tag will actually be ignored, but it still separates any text nodes before and after it.
Best Regards,
George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

On 2/28/12 6:09 PM, Innovimax W3C wrote:

David,

It looks like XML5 gives a slightly different result (the name of the tag contains an illegal "<"):

http://quuz.org/xml5/play?source=%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E

Mohamed

On Tue, Feb 28, 2012 at 4:49 PM, David Carlisle <davidc@nag.co.uk> wrote:

I think the simple example won't really distinguish systems that "fix up" markup, as they will all pretty much just close the stack of open elements and give the same result.

To distinguish things a bit it's worth looking at something a bit less like well-formed XML, say

<math><one<two<three</one><two></tree></math>

Using <math> as an outer element has the advantage that you can test with an html5 parser (the <math> puts html5 in its "foreign content" xml-like mode, where /> means what it is supposed to mean). One desirable property of XML-ER would be that it wasn't totally unlike the behaviour of HTML5 on such content.

Using V.nu's parser you can see the result of parsing the above:

http://livedom.validator.nu/?%3C!DOCTYPE%20html%3E%0A%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E

Removing the html head and body implied in the html context results in a parse tree of

<math><oneU00003CtwoU00003CthreeU00003C one=""><two></two></oneU00003CtwoU00003CthreeU00003C></math>

which is what it is. I don't think it matters too much what the parse tree is. That is, I don't think it's worth trying to argue about any meaning implied by the original markup.
The important thing is that html5 specifies a deterministic algorithm that returns a tree. Unless there is some overwhelming objection, I think XML-ER should return the same tree. (To be honest I haven't checked what Anne's draft spec would make of this yet).

David

____________________________________________________________________________
The Numerical Algorithms Group Ltd is a company registered in England and Wales with company number 1249803. The registered office is: Wilkinson House, Jordan Hill Road, Oxford OX2 8DR, United Kingdom.

This e-mail has been scanned for all viruses by Star. The service is powered by MessageLabs.
____________________________________________________________________________

--
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 9 52 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 €
Received on Tuesday, 28 February 2012 18:06:34 UTC
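
The "close the stack of open elements" style of fix-up discussed in this thread can be sketched in a few lines of Python. This is only an illustration written for this note, not the algorithm used by XMetaL, oXygen, XML5 or the HTML5 parser (which, as shown in the thread, produces a different tree). The function name recover, the regular expression and the exact log wording are assumptions; attributes, comments and entities are deliberately not handled.

```python
import re

# A tag is "<", an optional "/", a name, an optional "/" and an OPTIONAL ">".
# Making ">" optional lets "<one<two" be read as two start tags,
# each with an implied ">" (hypothetical, simplified tokenizer).
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)\s*(/?)>?")


def recover(source):
    """Return (well_formed_markup, log) for a tag-soup string."""
    out = []     # pieces of the repaired serialization
    stack = []   # names of the elements that are currently open
    log = []     # human-readable notes about each repair
    pos = 0
    for m in TAG.finditer(source):
        # Text between tags is kept; any stray "<" in it is escaped.
        out.append(source[pos:m.start()].replace("<", "&lt;"))
        pos = m.end()
        closing, name, empty = m.group(1), m.group(2), m.group(3)
        if not m.group(0).endswith(">"):
            kind = "end" if closing else "start"
            log.append(f'Bad {kind} tag. Expected ">".')
        if not closing:
            if empty:
                out.append(f"<{name}/>")
            else:
                out.append(f"<{name}>")
                stack.append(name)
        elif name in stack:
            # Close everything opened after the matching start tag.
            while stack[-1] != name:
                implied = stack.pop()
                log.append(f"Implied missing end-tag </{implied}>")
                out.append(f"</{implied}>")
            stack.pop()
            out.append(f"</{name}>")
        else:
            # No open element with this name: drop the end tag.
            log.append(f"Ignoring end-tag </{name}>")
    out.append(source[pos:].replace("<", "&lt;"))
    # At end of input, close whatever is still open.
    while stack:
        implied = stack.pop()
        log.append(f"Implied missing end-tag </{implied}>")
        out.append(f"</{implied}>")
    return "".join(out), log


fixed, notes = recover("<math><one<two<three</one><two></tree></math>")
print(fixed)
for note in notes:
    print("*", note)
```

Run on David's example, this sketch yields the same element structure George reports for the oXygen Outline view, and a log with the same shape as the XMetaL one quoted above. That agreement says nothing about how either product is implemented.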