Re: A really micro schema language from Stephen D Green on 2012-12-27 (public-microxml@w3.org from December 2012)

From: Stephen D Green <stephengreenubl@gmail.com>
Date: Thu, 27 Dec 2012 14:47:44 +0000
To: James Clark <jjc@jclark.com>
Cc: public-microxml@w3.org
Message-ID: <CAA0AChWpT9JH72U8qSBLybGxN1G-_b2rr5zaKpm=x6oxKAhLmA@mail.gmail.com>
Note, in the XML-Dev list posting, I did suggest how
using these illegal characters as separators can allow
the schema language to be extended to actually provide
a very simple yet powerful data model. Of course,
including Kleene-like characters for cardinality means
the data model needs a rethink from that suggested.
Basically the data model combines the MicroXPath
expression with the value of the text in the instance.
A rethink might entail something like allowing the
XPath to include the value like this, say:

for <foo><bar>text</bar></foo>

maybe write the data model as:

/foo>/foo/bar/text()&text

and the schema as:

/foo>/foo/bar&+
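
A minimal sketch, in Python, of how one-line statements like the two
above might be split into their parts. It is only an illustration under
stated assumptions: the function name parse_statements is hypothetical,
both '<' and '>' are treated as statement separators (the examples in
this thread use each), '&' separates the path from whatever follows it
(a cardinality character or a text value), and the paths themselves are
assumed never to contain '<', '>' or '&'.

```python
import re


def parse_statements(line):
    """Split a one-line schema or data model into (path, marker) pairs.

    Assumption: '<' and '>' separate statements, and the first '&' in a
    statement separates the path from its marker. A statement with no
    '&' gets a marker of None.
    """
    statements = []
    for chunk in re.split(r"[<>]", line):
        if not chunk:
            continue  # skip empty pieces, e.g. from a trailing separator
        path, sep, marker = chunk.partition("&")
        statements.append((path, marker if sep else None))
    return statements


schema = parse_statements("/foo>/foo/bar&+")
model = parse_statements("/foo>/foo/bar/text()&text")
```

Under these assumptions the schema and the data model come out in the
same shape, a list of (path, marker) pairs, which is one way of seeing
why the two are hard to tell apart.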


A thought: This makes a schema and data model hard
to tell apart. But then again, are they conceptually that
different? A data model could be seen as an extremely
strict schema for which there is only one valid instance!


Another thought: What about datatypes?? (Another
separator needed?)


----
Stephen D Green


On 27 December 2012 14:31, Stephen D Green <stephengreenubl@gmail.com> wrote:

> If you are going to have a schema which is like a list
> of XPath (subset?) expressions: It might be important
> to somehow ensure that
>
> 1) there is some sense of 100% coverage of the MicroXML
> with the expressions (where needed - unless the schema
> is partial)
> 2) there is some way to eliminate duplication - some kind
> of canonicity of expressions, say, such that no two
> expressions say the same thing
>
> For 2) it might be worth trying to ensure that there are as
> few ways as possible (as close as possible to exactly one
> way) to express any particular constraint. Then duplicate
> logic will show as duplicate expressions.
>
> I would think a strict, perhaps minimal subset of XPath
> might be a way to achieve this. Guessing it would have
> a preference for the more succinct shorthand ways to
> say something. However, it does get complicated to say
> something simple like count(//form//form)=0
> or something like count(//form)>=0, even with the shorthand
> so an even shorter shorthand might be needed, as has
> already been implied, e.g. dropping the 'count()' and the
> leading '//' and perhaps replacing the '=0' or '>=0' with
> something like the Kleene characters you have in DTDs.
> If the choice to use Kleene characters like * and + is made
> then it might be best to combine MicroXPaths with other
> entities on one line, so I suggest using separators like those
> I mentioned recently on XML-Dev in a similar discussion:
> http://lists.xml.org/archives/xml-dev/201212/msg00058.html
> There I suggested using the XML-illegal
> characters like ampersand and less-than so that line-endings
> can be avoided (in case they are needed as part of the
> actual expressions). Then you could have something like
>
> //form&+<//form//form&-
>
> or even, more abbreviated (more implicit assumptions):
>
> form&+<form//form&-
>
> to say that a form element can be included (anywhere)
> but cannot have a descendant element named 'form'.
> (The & separates the MicroXPath-esque expression from
> the Kleene cardinality character and the < separates one
> such combined statement from the next.)
>
> Having just two (or perhaps three) parts to a statement,
> and having such a limited subset that as near as possible to
> exactly one way exists to state the same thing, helps
> to assure that there can be a clear determination of what
> constitutes as close as possible to 100% coverage of
> a MicroXML instance.
>
> ----
> Stephen D Green
>
>
> On 19 December 2012 04:16, Liam R E Quin <liam@w3.org> wrote:
>
>> On Tue, 2012-12-18 at 16:49 +0700, James Clark wrote:
>> > Here's an idea I was playing around with a while ago.  It relates to the
>> > PossibleChildren property John mentioned.
>> >
>> > Imagine a really, really simple schema language that
>> >
>> > - uses a non-XML syntax;
>>
>> I'm not sure I want to do that. Why should I need a second parser when
>> I've already got microXML and it's supposed to be perfect for this sort
>> of thing? If not MicroXML, why not JSON?
>>
>> >  p !/ p
>> >
>> > A p element must not have a p child element.
>>
>> If you're really going to invent an expression language, !(p / p) is at
>> least a little clearer. Or, not(p/p) and use a subset of XPath.
>>
>> Or, almost examplotron-style,
>>
>>   <p><not><p/></not></p>
>>
>> I know CSS selectors have also been mentioned. But they are complex and
>> hopelessly non-general and ad-hoc, and tend to hard-wire knowledge of
>> HTML rather too easily.
>>
>> Liam
>>
>> --
>> Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
>> Pictures from old books: http://fromoldbooks.org/
>>
>>
>>
>
Received on Thursday, 27 December 2012 14:48:32 UTC