
Re: David's less simple example

From: George Cristian Bina <george@oxygenxml.com>
Date: Tue, 28 Feb 2012 21:16:19 +0200
Message-ID: <4F4D2803.1080202@oxygenxml.com>
To: Innovimax W3C <innovimax+w3c@gmail.com>
CC: David Carlisle <davidc@nag.co.uk>, "public-xml-er@w3.org Community Group" <public-xml-er@w3.org>

Hi Mohamed,

The tree structure is in the Outline view. If you click on a node, the
content of that node, including its tags, is highlighted in the text page,
so you can see where an element ends (even if the end tag is not there).

So basically the tree structure that you see in the Outline is

math
   one
     two
       three
   two

the content for "math" (including tags) is
   <math><one<two<three</one><two></tree></math>
the content for "one" is
   <one<two<three</one>
the content for "two" is
   <two<three
the content for "three" is
   <three
the content for the second "two" is
   <two></tree>
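
For anyone who wants to reproduce that outline outside the editor, here is a
minimal sketch in Python. It is not oXygen's actual algorithm; it only assumes
three simplified recovery rules that happen to give the same tree for this
input: a "<" seen before the current tag is closed starts a new tag, an end
tag closes every open element up to its nearest match, and an end tag with no
open match is ignored. The names TAG, recover_tree and outline are made up for
the illustration, and text, attributes and self-closing tags are not handled.

```python
import re

# A tag starts at "<" and its name runs up to the next "<", ">", "/" or
# whitespace, so "<one<two" yields two start tags (assumed rule, see above).
TAG = re.compile(r"<(/?)([^<>/\s]+)[^<>]*>?")

def recover_tree(source):
    """Build a list of (name, children) nodes from malformed markup."""
    root = ("#document", [])
    stack = [root]
    for match in TAG.finditer(source):
        closing, name = match.group(1), match.group(2)
        if not closing:
            # Start tag: attach to the innermost open element and open it.
            node = (name, [])
            stack[-1][1].append(node)
            stack.append(node)
        elif any(open_node[0] == name for open_node in stack[1:]):
            # End tag with an open match: close everything down to it.
            while stack.pop()[0] != name:
                pass
        # End tag without an open match (for example </tree>): ignored.
    # Anything still open at end of input is implicitly closed.
    return root[1]

def outline(nodes, depth=0):
    """Return the tree as indented lines, two spaces per level."""
    lines = []
    for name, children in nodes:
        lines.append("  " * depth + name)
        lines.extend(outline(children, depth + 1))
    return lines

print("\n".join(outline(recover_tree(
    "<math><one<two<three</one><two></tree></math>"))))
```

Run on the fragment above, this prints the same five-node outline: math, with
one (containing two, containing three) and a second two as its children.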

Best Regards,
George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

On 2/28/12 7:14 PM, Innovimax W3C wrote:
> George,
>
> That's not exactly what I got with Oxygen 13.1. How can we double-check
> this?
>
> Mohamed
>
> On Tue, Feb 28, 2012 at 5:33 PM, George Cristian Bina
> <george@oxygenxml.com> wrote:
>
>     In the oXygen Outline view the fragment
>
>     <math><one<two<three</one><two></tree></math>
>
>     will be equivalent to
>
>     <math><one><two><three></three></two></one><two></two></math>
>
>     Formatted for readability that will be:
>
>     <math>
>       <one>
>         <two>
>           <three/>
>         </two>
>       </one>
>       <two></two>
>     </math>
>
>     The </tree> tag is actually ignored, but it still splits any text
>     nodes that come before and after it.
>
>     Best Regards,
>     George
>     --
>     George Cristian Bina
>     <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
>     http://www.oxygenxml.com
>
>
>     On 2/28/12 6:09 PM, Innovimax W3C wrote:
>
>         David,
>
>         It looks like XML5 gives a slightly different result (the name of
>         the tag contains an illegal "<")
>
>         http://quuz.org/xml5/play?source=%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E
>
>         Mohamed
>
>         On Tue, Feb 28, 2012 at 4:49 PM, David Carlisle
>         <davidc@nag.co.uk> wrote:
>
>
>             I think the simple example won't really distinguish systems
>             that "fix up" markup, as they will all pretty much just close
>             the stack of open elements and give the same result.
>
>             To distinguish things a bit, it's worth looking at something
>             a bit less like well-formed XML, say
>
>             <math><one<two<three</one><two></tree></math>
>
>             Using <math> as an outer element has the advantage that you
>             can test with an html5 parser (the <math> puts html5 in its
>             "foreign content" xml-like mode, where /> means what it is
>             supposed to mean). One desirable property of XML-ER would be
>             that it wasn't totally unlike the behaviour of HTML5 on such
>             content.
>
>             Using V.nu's parser you can see the result of parsing the above:
>
>             http://livedom.validator.nu/?%3C!DOCTYPE%20html%3E%0A%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E
>
>             Removing the html head and body implied in the html context
>             results in a parse tree of
>
>             <math><oneU00003CtwoU00003CthreeU00003C one=""><two></two></oneU00003CtwoU00003CthreeU00003C></math>
>
>
>             which is what it is. I don't think it matters too much what
>             the parse tree is. That is, I don't think it's worth trying to
>             argue about any meaning implied by the original markup. The
>             important thing is that html5 specifies a deterministic
>             algorithm that returns a tree. Unless there is some
>             overwhelming objection, I think XML-ER should return the same
>             tree. (To be honest, I haven't checked what Anne's draft spec
>             would make of this yet.)
>
>             David
>
>
>
>
>
>         --
>         Innovimax SARL
>         Consulting, Training & XML Development
>         9, impasse des Orteaux
>         75020 Paris
>         Tel : +33 9 52 475787
>         Fax : +33 1 4356 1746
>         http://www.innovimax.fr
>         RCS Paris 488.018.631
>         SARL au capital de 10.000 
>
>
>
>
> --
> Innovimax SARL
> Consulting, Training & XML Development
> 9, impasse des Orteaux
> 75020 Paris
> Tel : +33 9 52 475787
> Fax : +33 1 4356 1746
> http://www.innovimax.fr
> RCS Paris 488.018.631
> SARL au capital de 10.000 
Received on Tuesday, 28 February 2012 19:16:47 GMT
