RE: David's less simple example

From: Derek Read <derek.read@justsystems.com>
Date: Tue, 28 Feb 2012 10:05:43 -0800
Message-ID: <BECDDDED92C3B949A38F5BC4BF56D21F04B20571@van-mail.jena.local>
To: "Innovimax W3C" <innovimax+w3c@gmail.com>, "George Cristian Bina" <george@oxygenxml.com>
Cc: "David Carlisle" <davidc@nag.co.uk>, <public-xml-er@w3.org>

For interest's sake, given the original problem...

<math><one<two<three</one><__two></tree></math>

 

XMetaL will "fix" it as follows when opening the document in "well formed" mode (no DTD/XSD provided):

<math><one><two><three/></two></one><__two/></math>

 

It also displays the following in the "validation log" (each error is clickable and takes you to the corresponding node, so the context is available without being included in the error text):

* Bad start tag. Expected ">".

* Bad start tag. Expected ">".

* Bad start tag. Expected ">".

* Implied missing end-tag </three>

* Implied missing end-tag </two>

* Ignoring end-tag </tree>

* Implied missing end-tag </__two>

 

Note that when a DTD or XSD Schema is available the results will be different because a lot of things can be implied from the schema's rules.
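
For anyone who wants to experiment, the fix-up and log above are consistent with a simple "stack of open elements" recovery. The following is only a minimal sketch under that assumption, not XMetaL's actual implementation: the names `recover` and `serialize` are made up for illustration, text content is ignored, and a "<" inside a start tag is simply treated as ending that tag.

```python
import re

# "<", optional "/", a name, then an optional ">". A "<" also ends the name,
# so "<one<two" is read as the two start tags "<one" and "<two".
TAG = re.compile(r"<(/?)([^<>/\s]+)(>?)")


def serialize(node):
    """Serialize a node; elements without children become <name/>."""
    inner = "".join(serialize(child) for child in node["children"])
    if node["name"] is None:  # synthetic root, not an element
        return inner
    if not inner:
        return "<%s/>" % node["name"]
    return "<%s>%s</%s>" % (node["name"], inner, node["name"])


def recover(markup):
    """Return (well-formed markup, log) for tag-only input."""
    root = {"name": None, "children": []}
    stack = [root]
    log = []
    for match in TAG.finditer(markup):
        closing, name, gt = match.groups()
        if not closing:
            if not gt:
                log.append('Bad start tag. Expected ">".')
            node = {"name": name, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif any(open_el["name"] == name for open_el in stack[1:]):
            # Close everything opened after the matching element.
            while stack[-1]["name"] != name:
                log.append("Implied missing end-tag </%s>" % stack.pop()["name"])
            stack.pop()
        else:
            # No open element of that name: drop the end tag.
            log.append("Ignoring end-tag </%s>" % name)
    while len(stack) > 1:  # close whatever is still open at end of input
        log.append("Implied missing end-tag </%s>" % stack.pop()["name"])
    return serialize(root), log


fixed, log = recover("<math><one<two<three</one><__two></tree></math>")
print(fixed)
for entry in log:
    print("*", entry)
```

With the input quoted at the top of this message, the sketch prints the same markup and the same seven log entries, in the same order. With a DTD or schema available, a real editor could make different choices, for example about which end-tags may be implied.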

 

Derek Read

Program Manager, XMetaL

 

 

From: innovimax@gmail.com [mailto:innovimax@gmail.com] On Behalf Of Innovimax W3C
Sent: Tuesday, February 28, 2012 9:14 AM
To: George Cristian Bina
Cc: David Carlisle; public-xml-er@w3.org Community Group
Subject: Re: David's less simple example

 

George,

 

That's not exactly what I got with Oxygen 13.1. How can we double-check this?

 

Mohamed

On Tue, Feb 28, 2012 at 5:33 PM, George Cristian Bina <george@oxygenxml.com> wrote:

In the oXygen Outline view the fragment

<math><one<two<three</one><two></tree></math>

will be equivalent to

<math><one><two><three></three></two></one><two></two></math>

Formatted for readability that will be:

<math>
 <one>
   <two>
     <three/>
   </two>
 </one>
 <two></two>
</math>

The </tree> tag is actually ignored, but it still separates any text nodes that come before and after it.

Best Regards,
George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com




On 2/28/12 6:09 PM, Innovimax W3C wrote:

	David,
	
	It looks like XML5 gives a slightly different result (the name of the
	tag contains illegal "<")
	
	http://quuz.org/xml5/play?source=%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E

	
	Mohamed
	
	On Tue, Feb 28, 2012 at 4:49 PM, David Carlisle <davidc@nag.co.uk> wrote:
	
	
	   I think the simple example won't really distinguish systems that "fix
	   up" markup as they will all pretty much just close the stack of open
	   elements and give the same result.
	
	   To distinguish things a bit it's worth looking at something a bit
	   less like well formed XML, say
	
	   <math><one<two<three</one><two></tree></math>
	
	   Using <math> as an outer element has the advantage that you can test
	   with an html5 parser (the <math> puts html5 in its "foreign content"
	   xml-like mode, where /> means what it is supposed to mean). One desirable
	   property of XML-ER would be that it wasn't totally unlike the behaviour
	   of HTML5 on such content.
	
	   Using V.nu's parser you can see the result of parsing the above:
	
	   http://livedom.validator.nu/?%3C!DOCTYPE%20html%3E%0A%3Cmath%3E%3Cone%3Ctwo%3Cthree%3C%2Fone%3E%3Ctwo%3E%3C%2Ftree%3E%3C%2Fmath%3E
	
	   removing the html head and body implied in the html context results in a
	   parse tree of
	
	   <math><oneU00003CtwoU00003CthreeU00003C
	   one=""><two></two></oneU00003CtwoU00003CthreeU00003C></math>
	
	
	   which is what it is. I don't think it matters too much what the parse
	   tree is. That is, I don't think it's worth trying to argue about any
	   meaning implied by the original markup. The important thing is that
	   html5 specifies a deterministic algorithm that returns a tree. Unless
	   there is some overwhelming objection, I think XML-ER should return the
	   same tree. (To be honest I haven't checked what Anne's draft spec would
	   make of this yet).
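
To see where an element name containing "U00003C" can come from, here is a rough sketch of two steps as I understand the HTML5 rules. It is a simplification and not the validator.nu code: in the tokenizer's tag name state only whitespace, "/" and ">" end a name, and when the tree is coerced to XML, characters that are not allowed in names may be replaced by "U" plus six hexadecimal digits. The helper names `read_tag_name` and `coerce_name` are hypothetical, and the ASCII-only set of allowed name characters is an assumption made to keep the example short.

```python
import re


def read_tag_name(markup, start):
    """Read a start-tag name that begins at index `start` (just after "<").

    Only whitespace, "/" and ">" end the name, so "<" is consumed as an
    ordinary name character.
    """
    end = start
    while end < len(markup) and markup[end] not in " \t\n\f/>":
        end += 1
    return markup[start:end]


def coerce_name(name):
    """Replace characters outside a simplified ASCII name set with Uxxxxxx."""
    return re.sub(r"[^A-Za-z0-9_.\-]",
                  lambda m: "U%06X" % ord(m.group()), name)


source = "<math><one<two<three</one><two></tree></math>"
name = read_tag_name(source, source.index("<one") + 1)
print(name)               # one<two<three<
print(coerce_name(name))  # oneU00003CtwoU00003CthreeU00003C
```

The name only stops at the "/" of "</one>", which is why the remaining "one" ends up as an attribute with an empty value rather than as an end tag.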
	
	   David
	
	
	
	
	
	--
	Innovimax SARL
	Consulting, Training & XML Development
	9, impasse des Orteaux
	75020 Paris
	Tel : +33 9 52 475787
	Fax : +33 1 4356 1746
	http://www.innovimax.fr

	RCS Paris 488.018.631
	SARL au capital de 10.000 €





 


Received on Tuesday, 28 February 2012 18:06:34 GMT
