Re: David's less simple example

From: George Cristian Bina <george@oxygenxml.com>
Date: Tue, 28 Feb 2012 18:33:57 +0200
Message-ID: <4F4D01F5.40208@oxygenxml.com>
To: Innovimax W3C <innovimax+w3c@gmail.com>
CC: David Carlisle <davidc@nag.co.uk>, "public-xml-er@w3.org Community Group" <public-xml-er@w3.org>
In the oXygen Outline view the fragment

<math><one<two<three</one><two></tree></math>

will be equivalent to

<math><one><two><three></three></two></one><two></two></math>

Formatted for readability, that is:

<math>
   <one>
     <two>
       <three/>
     </two>
   </one>
   <two></two>
</math>

The </tree> tag is effectively ignored, but it still splits any text
that appears before and after it into separate text nodes.
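
As a rough illustration of that recovery (not oXygen's actual implementation, which this message does not describe), the sketch below assumes three rules: a "<" inside an unfinished start tag ends that tag and begins a new one, an end tag closes every open element up to and including the innermost one with that name, and an end tag with no matching open element is dropped. The names TAG and recover are hypothetical, and text content and self-closing tags are left out, so the text-splitting effect of </tree> is not modelled.

```python
import re
import xml.etree.ElementTree as ET

# Assumed tokenization: a tag is "<", an optional "/", a name, and an
# optional ">". The name stops at ">" or at the next "<".
TAG = re.compile(r"<(/?)([^<>/\s]+)[^<>]*>?")


def recover(source: str) -> ET.Element:
    """Build an element tree from badly formed markup (illustrative only)."""
    root = None
    stack = []  # currently open elements, outermost first
    for match in TAG.finditer(source):
        closing, name = match.group(1), match.group(2)
        if not closing:
            # Start tag: attach to the innermost open element, if any.
            if stack:
                element = ET.SubElement(stack[-1], name)
            else:
                element = ET.Element(name)
            if root is None:
                root = element
            stack.append(element)
        elif any(open_el.tag == name for open_el in stack):
            # End tag with a match: close everything down to that element.
            while stack.pop().tag != name:
                pass
        # Otherwise the end tag matches nothing open and is ignored.
    if root is None:
        raise ValueError("no tags found")
    return root


fragment = "<math><one<two<three</one><two></tree></math>"
print(ET.tostring(recover(fragment), encoding="unicode",
                  short_empty_elements=False))
```

Under these assumed rules the fragment above serializes to the same one-line form given earlier in this message.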

Best Regards,
George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

On 2/28/12 6:09 PM, Innovimax W3C wrote:
> David,
>
> It looks like XML5 gives a slightly different result (the tag name
> contains an illegal "<")
>
> http://quuz.org/xml5/play?source=%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E
>
> Mohamed
>
> On Tue, Feb 28, 2012 at 4:49 PM, David Carlisle <davidc@nag.co.uk
> <mailto:davidc@nag.co.uk>> wrote:
>
>
>     I think the simple example won't really distinguish systems that "fix
>     up" markup as they will all pretty much just close the stack of open
>     elements and give the same result.
>
>     To distinguish things a bit it's worth looking at something a bit
>     less like well formed XML, say
>
>     <math><one<two<three</one><two></tree></math>
>
>     Using <math> as an outer element has the advantage that you can test
>     with an html5 parser (the <math> puts html5 in its "foreign content"
>     xml-like mode where /> means what it is supposed to mean). One desirable
>     property of XML-ER would be that it wasn't totally unlike the behaviour
>     of HTML5 on such content.
>
>     Using V.nu's parser you can see the result of parsing the above:
>
>     http://livedom.validator.nu/?%3C!DOCTYPE%20html%3E%0A%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E
>
>     removing the html head and body implied in the html context results in a
>     parse tree of
>
>     <math><oneU00003CtwoU00003CthreeU00003C
>     one=""><two></two></oneU00003CtwoU00003CthreeU00003C></math>
>
>
>     which is what it is. I don't think it matters too much what the parse
>     tree is. That is, I don't think it's worth trying to argue about any
>     meaning implied by the original markup. The important thing is that
>     html5 specifies a deterministic algorithm that returns a tree. Unless
>     there is some overwhelming objection, I think XML-ER should return the
>     same tree. (To be honest I haven't checked what Anne's draft spec would
>     make of this yet).
>
>     David
>
>
>
>
> --
> Innovimax SARL
> Consulting, Training & XML Development
> 9, impasse des Orteaux
> 75020 Paris
> Tel : +33 9 52 475787
> Fax : +33 1 4356 1746
> http://www.innovimax.fr
> RCS Paris 488.018.631
> SARL au capital de 10.000 
Received on Tuesday, 28 February 2012 16:34:29 GMT