Re: data model from Stephen D Green on 2012-10-01 (public-microxml@w3.org from October 2012)

From: Stephen D Green <stephengreenubl@gmail.com>
Date: Mon, 1 Oct 2012 09:26:12 +0100
To: John Cowan <cowan@mercury.ccil.org>
Cc: David Lee <David.Lee@marklogic.com>, Maik Stührenberg <maik.stuehrenberg@uni-bielefeld.de>, "public-microxml@w3.org" <public-microxml@w3.org>
Message-ID: <CAA0AChWdjT2+aZOEUL-2qFi8iadv9nV5D0EqSsXCPfcaNbOwQw@mail.gmail.com>
Small correction:

"microXML - Xyz Data Model specification (where Xyz might be 'Tree'
or 'Hierarchical' or even 'Compound' or just something
like 'Level 1')."

'Compound' should be 'Compounds' of course.

I do like that 'Compounds' data model paper from Cambridge
<http://www.cl.cam.ac.uk/research/security/dendros/compounds-poster.pdf>
I note it is described as a 'next generation' data model so this implies
there might (or will) be further, future data models (especially if you
subscribe to the oft-quoted saying 'the future is longer than the past').
I reckon microXML might be successful enough to still be around when
another data model comes to the fore and might need to survive it
gracefully. We can look back at XML history to learn better how to look
forward, can't we?
----
Stephen D Green



On 1 October 2012 08:46, Stephen D Green <stephengreenubl@gmail.com> wrote:

> Say I want to have a more specialised parser: Perhaps all
> I want is a parser to evaluate a particular XPath or small set
> of XPath expressions and I want a minimally sized compiled
> library to do my parsing just for that purpose and no more.
> It would be nice to have a microXPath expression language
> which has as few as possible ways to represent a single XPath
> expression. Given such a beast, I'd like to be able to create
> ad hoc parsers specialised to a given microXPath expression
> evaluation - highly optimised for performance and compiled
> size. I'd like microXML to allow such a parser to be conformant
> without it having to include unnecessary code. I'd actually
> prefer that conformance not even care about what the abstract
> data model is. Even if I want a conformant parser which can
> evaluate any or all possible microXPath expressions on any
> microXML document, I'd like that parser not to have to conform
> to a particular data model because that might increase the
> parser's cost of development, size and complexity and reduce its
> performance.
>
> Another possible reason to couple the abstract data model more
> loosely to the microXML spec (which I regard as most useful
> in its specification of a syntax for microXML documents) is in
> the matter of conformance testing. I'm not convinced (yet) that
> you can test conformance of the parser's abstract data model
> since this is likely to be invisible, internal, private rather than
> visible, external, public (to my comparative naivety, I admit).
>
> I'd like to see so-called 'test assertions' for microXML for
> conformance and interoperability testing and in producing
> these, I suspect, it might be found that aspects of the present
> spec's conformance clauses for parsers cannot be expressed
> as testable test assertions (or that such assertions might rely
> on human reading of the code base of a parser and so make
> fully automated testing of conformance based on such test
> assertions too expensive or impracticable).
>
> One suggestion I could make is to call the present spec, if
> the above doesn't get acceptance as enough reason to
> change it to any greater degree, something like microXML
> - Xyz Data Model specification (where Xyz might be 'Tree'
> or 'Hierarchical' or even 'Compound' or just something
> like 'Level 1'). This would 1) indicate that there might follow
> some specs for other data models - and leave room for such
> and 2) mean that a conforming parser need only claim
> conformance to this particular data model. Better, I think,
> might be to add to the conformance section either a
> placeholder note or a conformance clause to cater for more
> specialised microXML parsers (such as my description above
> of a parser optimised for evaluating general or specific XPath
> expressions on a microXML document).
> ----
> Stephen D Green
>
>
>
> On 28 September 2012 16:49, John Cowan <cowan@mercury.ccil.org> wrote:
>
>> Stephen D Green scripsit:
>>
>> > Haven't there already been several different abstract data models
>> > put forward for XML?
>>
>> Yes, but XML is a complex standard and there are lots of things which
>> might be of interest.  The XML Infoset is an attempt to give standard
>> names to some of those things, though there are plenty more which are
>> left out.  The PSVI could be used to report DTD information, but nobody
>> does.
>>
>> MicroXML is so trivial that it's not very interesting to provide
>> alternative data models.  You could, for example, leave out attributes,
>> but it's simpler just to ignore them if you don't care about them.
>> Similarly, you could report on lexical minutiae, but there are only a
>> few: single vs. double quotes and whether character references are used
>> are the only ones I can think of.
>>
>> > Can't we have parsers for MicroXML which support a variety of data
>> > models?
>>
>> In principle, I suppose, but to what purpose?  MicroLark supports push
>> parsing (SAX-style), pull parsing (StAX-style), and tree building, but
>> only one data model, namely that there is one element object for each
>> element in the document, and it contains a name (a string), an attribute
>> map from names to strings, and a sequence of children which are either
>> strings or element objects, all of which must be reported.
>>
>> > I also came across mention of 'compounds' as an alternative
>> > abstract data model for XML - may a parser not implement such if
>> > it wants to claim to be conformant?
>>
>> The MicroXML data model is a simple subset of the compound model.
>> To represent MicroXML in the obvious way, you'd have two kinds of
>> compounds, element compounds and textual compounds.  An element compound
>> has a STRING representing the element name, a TAG marking it as meta,
>> a DIRECTORY mapping attribute names (textual compounds) to attribute
>> values (also textual compounds), a KEY SET containing all the keys in the
>> DIRECTORY, and a LIST consisting of the children.  A textual compound
>> has a STRING representing the text, a TAG marking it as a text string,
>> and an empty DIRECTORY, KEY SET, and LIST.  So a parser reporting these
>> compounds would fully instantiate the MicroXML data model.
>>
>> <http://www.cl.cam.ac.uk/research/security/dendros/compounds-poster.pdf>
>> gives a brief explanation of these terms.
>>
>> --
>> John Cowan  cowan@ccil.org   http://www.ccil.org/~cowan
>> Dievas dave dantis; Dievas duos duonos          --Lithuanian proverb
>> Deus dedit dentes; deus dabit panem             --Latin version thereof
>> Deity donated dentition;
>>   deity'll donate doughnuts                     --English version by Muke Tever
>> God gave gums; God'll give granary              --Version by Mat McVeagh
>>
>
>
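The data model described in the quoted message above (one element object per element, holding a name, an attribute map from names to strings, and a sequence of children that are strings or element objects) and its mapping onto compounds can be sketched roughly as follows. This is only an illustration under stated assumptions, not MicroLark's API or the Cambridge model's definition: the names Element, Compound, text_compound and to_compound are hypothetical, the TAG values "meta" and "text" are plain strings chosen for the sketch, and DIRECTORY keys are kept as plain strings rather than textual compounds for simplicity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Element:
    """One element object per element in the document (hypothetical class)."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    children: List[Union[str, "Element"]] = field(default_factory=list)


@dataclass
class Compound:
    """Hypothetical rendering of a compound: STRING, TAG, DIRECTORY, KEY SET, LIST."""
    string: str
    tag: str
    # Simplification: keys are plain strings here; in the quoted description
    # both keys and values of the DIRECTORY are textual compounds.
    directory: Dict[str, "Compound"] = field(default_factory=dict)
    key_set: frozenset = frozenset()
    children: List["Compound"] = field(default_factory=list)


def text_compound(text: str) -> Compound:
    """A textual compound: the text, a text TAG, and empty DIRECTORY, KEY SET, LIST."""
    return Compound(string=text, tag="text")


def to_compound(node: Union[str, Element]) -> Compound:
    """Convert a string or Element into the corresponding compound, recursively."""
    if isinstance(node, str):
        return text_compound(node)
    directory = {name: text_compound(value) for name, value in node.attributes.items()}
    return Compound(
        string=node.name,
        tag="meta",
        directory=directory,
        key_set=frozenset(directory),  # all the keys in the DIRECTORY
        children=[to_compound(child) for child in node.children],
    )


# Example tree for: <p class="note">Hello, <em>world</em></p>
doc = Element("p", {"class": "note"}, ["Hello, ", Element("em", {}, ["world"])])
compound = to_compound(doc)
```

Because dataclasses compare by value, two trees built this way can be checked with plain equality, which is one possible way such a model could be made externally observable for conformance testing.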
Received on Monday, 1 October 2012 08:27:05 UTC