- From: Derek Read <derek.read@justsystems.com>
- Date: Tue, 28 Feb 2012 10:05:43 -0800
- To: "Innovimax W3C" <innovimax+w3c@gmail.com>, "George Cristian Bina" <george@oxygenxml.com>
- Cc: "David Carlisle" <davidc@nag.co.uk>, <public-xml-er@w3.org>
- Message-ID: <BECDDDED92C3B949A38F5BC4BF56D21F04B20571@van-mail.jena.local>
For interest's sake, given the original problem...
<math><one<two<three</one><__two></tree></math>
XMetaL will "fix" it as follows when opening the document in "well formed" mode (no DTD/XSD provided):
<math><one><two><three/></two></one><__two/></math>
It also displays the following in the "validation log". Each error is clickable and takes you to the corresponding node, so context is available even though it is not included in the error text:
* Bad start tag. Expected ">".
* Bad start tag. Expected ">".
* Bad start tag. Expected ">".
* Implied missing end-tag </three>
* Implied missing end-tag </two>
* Ignoring end-tag </tree>
* Implied missing end-tag </__two>
Note that when a DTD or XSD Schema is available the results will be different because a lot of things can be implied from the schema's rules.
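The log above is consistent with a simple stack-based recovery, sketched below in Python. This is an illustration only, not XMetaL's actual algorithm. It assumes the tokenizer has already split `<one<two<three` into three start tags (the three "Bad start tag" recoveries); the `recover` and `serialize` names and the token format are hypothetical.

```python
def recover(tokens):
    """Balance a list of ("start" | "end", name) tokens.

    Returns (balanced_tokens, log). Unmatched end tags are dropped and
    still-open elements are closed implicitly.
    """
    stack, out, log = [], [], []
    for kind, name in tokens:
        if kind == "start":
            stack.append(name)
            out.append(("start", name))
        elif name in stack:
            # Close every element opened after the one being ended.
            while stack[-1] != name:
                implied = stack.pop()
                log.append("Implied missing end-tag </%s>" % implied)
                out.append(("end", implied))
            stack.pop()
            out.append(("end", name))
        else:
            # No open element has this name, so drop the end tag.
            log.append("Ignoring end-tag </%s>" % name)
    # End of input: close whatever is still open.
    while stack:
        implied = stack.pop()
        log.append("Implied missing end-tag </%s>" % implied)
        out.append(("end", implied))
    return out, log


def serialize(tokens):
    """Write tokens back out as markup (empty elements are not collapsed)."""
    return "".join("<%s%s>" % ("/" if k == "end" else "", n) for k, n in tokens)


# Assumed token stream for the test input, after start-tag repair.
tokens = [
    ("start", "math"), ("start", "one"), ("start", "two"), ("start", "three"),
    ("end", "one"), ("start", "__two"), ("end", "tree"), ("end", "math"),
]
balanced, log = recover(tokens)
print(serialize(balanced))
print("\n".join(log))
```

Under these assumptions the sketch yields the same four end-tag messages, in the same order, as the log above, and a tree equivalent to the "fixed" document with empty elements written out in full.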
Derek Read
Program Manager, XMetaL
From: innovimax@gmail.com [mailto:innovimax@gmail.com] On Behalf Of Innovimax W3C
Sent: Tuesday, February 28, 2012 9:14 AM
To: George Cristian Bina
Cc: David Carlisle; public-xml-er@w3.org Community Group
Subject: Re: David's less simple example
George,
That's not exactly what I got with Oxygen 13.1. How can we double-check this?
Mohamed
On Tue, Feb 28, 2012 at 5:33 PM, George Cristian Bina <george@oxygenxml.com> wrote:
In the oXygen Outline view the fragment
<math><one<two<three</one><two></tree></math>
will be equivalent to
<math><one><two><three></three></two></one><two></two></math>
Formatted for readability that will be:
<math>
<one>
<two>
<three/>
</two>
</one>
<two></two>
</math>
The </tree> tag is actually ignored, but it still separates any text nodes that come before and after it.
Best Regards,
George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
On 2/28/12 6:09 PM, Innovimax W3C wrote:
David,
It looks like XML5 gives a slightly different result (the tag name
contains an illegal "<")
http://quuz.org/xml5/play?source=%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E
Mohamed
On Tue, Feb 28, 2012 at 4:49 PM, David Carlisle <davidc@nag.co.uk> wrote:
I think the simple example won't really distinguish systems that "fix
up" markup as they will all pretty much just close the stack of open
elements and give the same result.
To distinguish things a bit it's worth looking at something a bit
less like well formed XML, say
<math><one<two<three</one><two></tree></math>
Using <math> as an outer element has the advantage that you can test
with an html5 parser (the <math> puts html5 in its "foreign content"
xml-like mode where /> means what it is supposed to mean). One desirable
property of XML-ER would be that it wasn't totally unlike the behaviour
of HTML5 on such content.
Using V.nu's parser you can see the result of parsing the above:
http://livedom.validator.nu/?%3C!DOCTYPE%20html%3E%0A%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E
removing the html head and body implied in the html context results in a
parse tree of
<math><oneU00003CtwoU00003CthreeU00003C
one=""><two></two></oneU00003CtwoU00003CthreeU00003C></math>
which is what it is. I don't think it matters too much what the parse
tree is. That is, I don't think it's worth trying to argue about any
meaning implied by the original markup. The important thing is that
html5 specifies a deterministic algorithm that returns a tree. Unless
there is some overwhelming objection, I think XML-ER should return the
same tree. (To be honest I haven't checked what Anne's draft spec would
make of this yet).
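
A rough Python sketch of why the HTML5 tree looks like that. The function names `scan_start_tag` and `coerce_name` are hypothetical, and both functions are heavy simplifications: the first only mimics the tokenizer's rule that a tag name ends at whitespace, "/" or ">" (so "<" stays in the name) and handles bare attribute names only; the second approximates the HTML5 "coercion to an infoset" escape, which replaces a character not allowed in an XML name with "U" plus six uppercase hex digits, using an ASCII-only check for name characters.

```python
def scan_start_tag(markup):
    """Scan one start tag, loosely following the HTML5 'tag name' state.

    Simplification: attribute values are not parsed, only bare names.
    Returns (tag_name, attributes).
    """
    assert markup.startswith("<")
    i = 1
    name = []
    # Only whitespace, "/" or ">" end the tag name, so a stray "<" is kept.
    while i < len(markup) and markup[i] not in " \t\n\f/>":
        name.append(markup[i].lower())
        i += 1
    attrs = {}
    current = []
    # Anything left before ">" is read as attribute names with empty values.
    while i < len(markup) and markup[i] != ">":
        ch = markup[i]
        if ch in " \t\n\f/":
            if current:
                attrs["".join(current)] = ""
                current = []
        else:
            current.append(ch.lower())
        i += 1
    if current:
        attrs["".join(current)] = ""
    return "".join(name), attrs


def coerce_name(name):
    """Escape characters that cannot appear in an XML name as UXXXXXX.

    Assumption: ASCII letters, digits, "_", ".", ":" and "-" are allowed and
    non-ASCII characters are left alone; the real XML Name production is
    stricter (for example about the first character).
    """
    out = []
    for ch in name:
        if ch.isalnum() or ch in "_.:-" or ord(ch) > 0x7F:
            out.append(ch)
        else:
            out.append("U%06X" % ord(ch))
    return "".join(out)


tag, attrs = scan_start_tag("<one<two<three</one>")
print(tag, attrs)
print(coerce_name(tag))
```

With the first tag of the test input, the sketch reads `one<two<three<` as the element name and `one` as an attribute with an empty value, which is the shape of the tree shown above once the name is escaped.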
David
--
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 9 52 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 €
Received on Tuesday, 28 February 2012 18:06:34 UTC