
Re: David's less simple example (was: Marcos simple sample)

From: Innovimax W3C <innovimax+w3c@gmail.com>
Date: Tue, 28 Feb 2012 17:09:53 +0100
Message-ID: <CAAK2GfFcs78CX5erVr_tLmNBfped2KvCSivQB3uBMP6oAmtOeQ@mail.gmail.com>
To: David Carlisle <davidc@nag.co.uk>
Cc: "public-xml-er@w3.org Community Group" <public-xml-er@w3.org>
David,

It looks like XML5 gives a slightly different result (the tag name
contains an illegal "<"):

http://quuz.org/xml5/play?source=%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E

Mohamed

On Tue, Feb 28, 2012 at 4:49 PM, David Carlisle <davidc@nag.co.uk> wrote:

>
> I think the simple example won't really distinguish systems that "fix
> up" markup as they will all pretty much just close the stack of open
> elements and give the same result.
>
> To distinguish them, it's worth looking at something a bit less like
> well-formed XML, say
>
> <math><one<two<three</one><two></tree></math>
>
> Using <math> as an outer element has the advantage that you can test
> with an html5 parser (the <math> puts html5 in its "foreign content"
> xml-like mode, where /> means what it is supposed to mean). One desirable
> property of XML-ER would be that it wasn't totally unlike the behaviour
> of HTML5 on such content.
>
> Using V.nu's parser you can see the result of parsing the above:
>
> http://livedom.validator.nu/?%3C!DOCTYPE%20html%3E%0A%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E
>
> Removing the html, head and body elements implied by the html context
> results in a parse tree of
>
> <math><oneU00003CtwoU00003CthreeU00003C one=""><two></two></oneU00003CtwoU00003CthreeU00003C></math>
>
>
> which is what it is. I don't think it matters too much what the parse
> tree is. That is, I don't think it's worth trying to argue about any
> meaning implied by the original markup. The important thing is that
> html5 specifies a deterministic algorithm that returns a tree. Unless
> there is some overwhelming objection, I think XML-ER should return the
> same tree. (To be honest I haven't checked what Anne's draft spec would
> make of this yet).
>
> David
>
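The odd element name in the parse tree quoted above can be reproduced with a rough sketch of one rule from the HTML5 tokenizer: once a start tag name has begun, it runs until whitespace, "/" or ">", so a further "<" becomes part of the name. This is not the HTML5 algorithm itself (it ignores attributes, ASCII lowercasing, end tags and tree construction), and the function names are hypothetical. The coercion step handles only "<" and simply mirrors the U00003C spelling seen in the quoted tree.

```python
def scan_start_tag_name(text: str, pos: int) -> str:
    """Return the name of the start tag whose "<" is at text[pos].

    Simplified from the HTML5 "tag name" tokenizer state: the name ends at
    tab, LF, FF, space, "/", ">" or end of input. A later "<" does not end it.
    """
    first = text[pos + 1 : pos + 2]
    if text[pos : pos + 1] != "<" or not (first.isascii() and first.isalpha()):
        raise ValueError("not positioned at a start tag")
    end = pos + 1
    while end < len(text) and text[end] not in "\t\n\f />":
        end += 1
    return text[pos + 1 : end]


def coerce_name(name: str) -> str:
    """Spell "<" as U00003C, as in the quoted tree (only "<" is handled here)."""
    return name.replace("<", "U00003C")


source = "<math><one<two<three</one><two></tree></math>"
raw_name = scan_start_tag_name(source, source.index("<one"))
print(raw_name)               # one<two<three<
print(coerce_name(raw_name))  # oneU00003CtwoU00003CthreeU00003C
```

The scan stops at the "/" of "</one>", which is why the text "one" that follows ends up as an attribute (one="") rather than as part of an end tag.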



-- 
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 9 52 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 
Received on Tuesday, 28 February 2012 16:10:22 GMT