- From: Stephen D Green <stephengreenubl@gmail.com>
- Date: Thu, 27 Dec 2012 14:31:18 +0000
- To: James Clark <jjc@jclark.com>
- Cc: public-microxml@w3.org
- Message-ID: <CAA0AChXudW_WDR2RtDC427Oc0c0QNuC4DL5egA1WPwkGfWyGGA@mail.gmail.com>
If you are going to have a schema which is like a list of XPath (subset?) expressions, it might be important to somehow ensure that

1) there is some sense of 100% coverage of the MicroXML with the expressions (where needed, unless the schema is partial)
2) there is some way to eliminate duplication: some kind of canonicity of expressions, say, such that no two expressions say the same thing

For 2) it might be worth trying to ensure that there are as few ways as possible (as close as possible to exactly one way) to express any particular constraint. Then duplicate logic will show up as duplicate expressions. I would think a strict, perhaps minimal, subset of XPath might be a way to achieve this. I am guessing it would have a preference for the more succinct shorthand ways to say something.

However, it does get complicated to say something simple like

count(//form//form)=0

or something like

count(//form)>=0

even with the shorthand, so an even shorter shorthand might be needed, as has already been implied, e.g. dropping the 'count()' and the leading '//' and perhaps replacing the '=0' or '>=0' with something like the Kleene characters you have in DTDs.

If the choice to use Kleene characters like * and + is made, then it might be best to combine MicroXPaths with other entities on one line, so I suggest identifying separators like those I mentioned recently on XML-Dev in a similar discussion:

http://lists.xml.org/archives/xml-dev/201212/msg00058.html

There I suggested using the XML-illegal characters like ampersand and less-than so that line endings can be avoided (in case they are needed as part of the actual expressions). Then you could have something like

//form&+<//form//form&-

or even, more abbreviated (more implicit assumptions):

form&+<form//form&-

to say that a form element can be included (anywhere) but cannot have a descendant element named 'form'.
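To make the idea a little more concrete, here is a rough Python sketch of how such a line might be split into statements and checked for duplicate expressions. It is only an illustration under my own assumptions: the names Statement, parse_schema and duplicates are hypothetical, a leading '//' is treated as implied (so the long and abbreviated spellings normalise to the same path), and no meaning is assigned to the cardinality characters themselves.

```python
from collections import Counter
from typing import List, NamedTuple


class Statement(NamedTuple):
    path: str         # MicroXPath-like expression, e.g. "form//form"
    cardinality: str  # cardinality marker as written, e.g. "+" or "-"


def parse_schema(text: str) -> List[Statement]:
    """Split on '<' into statements, then on the last '&' into path and marker."""
    statements = []
    for raw in text.split("<"):
        raw = raw.strip()
        if not raw:
            continue
        path, sep, marker = raw.rpartition("&")
        if not sep or not path or not marker:
            raise ValueError(f"malformed statement: {raw!r}")
        # Assumption: a leading '//' is implicit, so drop it to get one spelling.
        if path.startswith("//"):
            path = path[2:]
        statements.append(Statement(path, marker))
    return statements


def duplicates(statements: List[Statement]) -> List[str]:
    """Return every normalised path that is constrained more than once."""
    counts = Counter(s.path for s in statements)
    return [path for path, n in counts.items() if n > 1]


if __name__ == "__main__":
    long_form = parse_schema("//form&+<//form//form&-")
    short_form = parse_schema("form&+<form//form&-")
    print(long_form == short_form)                    # both spellings normalise alike
    print(duplicates(parse_schema("form&+<//form&-")))  # same path stated twice
```

With only one spelling left per path after normalisation, two statements about the same path show up as a plain string match, which is the kind of duplicate detection I have in mind for point 2).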
(The & separates the MicroXPath-esque expression from the Kleene cardinality character and the < separates one such combined statement from the next.)

Having just two (or perhaps three) parts to a statement, and having such a limited subset that as near as possible to exactly one way exists to state the same thing, helps to assure that there can be a clear determination of what constitutes as close as possible to 100% coverage of a MicroXML instance.

----
Stephen D Green

On 19 December 2012 04:16, Liam R E Quin <liam@w3.org> wrote:
> On Tue, 2012-12-18 at 16:49 +0700, James Clark wrote:
> > Here's an idea I was playing around with a while ago. It relates to the
> > PossibleChildren property John mentioned.
> >
> > Imagine a really, really simple schema language that
> >
> > - uses a non-XML syntax;
>
> I'm not sure I want to do that. Why should I need a second parser when
> I've already got microXML and it's supposed to be perfect for this sort
> of thing? If not MicroXML, why not JSON?
>
> > p !/ p
> >
> > A p element must not have a p child element.
>
> If you're really going to invent an expression language, !(p / p) is at
> least a little clearer. Or, not(p/p) and use a subset of XPath.
>
> Or, almost examplotron-style,
>
> <p><not><p></not></p>
>
> I know CSS selectors have also been mentioned. But they are complex and
> hopelessly non-general and ad-hoc, and tend to hard-wire knowledge of
> HTML rather too easily.
>
> Liam
>
> --
> Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
> Pictures from old books: http://fromoldbooks.org/
Received on Thursday, 27 December 2012 14:32:06 UTC