Re: data model from James Clark on 2012-10-01 (public-microxml@w3.org from October 2012)

From: James Clark <jjc@jclark.com>
Date: Mon, 1 Oct 2012 15:52:26 +0700
To: stephengreenubl@gmail.com
Cc: John Cowan <cowan@mercury.ccil.org>, David Lee <David.Lee@marklogic.com>, Maik Stührenberg <maik.stuehrenberg@uni-bielefeld.de>, "public-microxml@w3.org" <public-microxml@w3.org>
Message-ID: <CANz3_EazHRPDhBuCN6a1pPNOmnUJzApY2DybXpm-9Xafc4=QVA@mail.gmail.com>
You are quite wrong about conformance.  MicroXML's data model is a big help
for conformance testing: it enables conformance testing to be much more
thorough and concrete.

I also remain confident that conformance to the data model does not impose
any unnecessary burden on implementations.
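
A minimal sketch of why a data model makes a conformance test concrete,
assuming the model described further down this thread (an element has a
name, an attribute map from names to strings, and a sequence of children
that are strings or elements).  The Element class and the check() helper
are hypothetical names used only for illustration; they are not part of
the spec or of any existing test suite, and a real harness would obtain
the "actual" tree from the parser under test.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Element:
    """One element in the data model: name, attributes, ordered children."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    children: List[Union[str, "Element"]] = field(default_factory=list)


def check(actual: Element, expected: Element) -> bool:
    """A test passes only if the reported tree equals the expected tree."""
    return actual == expected


# Expected model for the document: <p class="x">Hello <b>world</b></p>
expected = Element("p", {"class": "x"}, ["Hello ", Element("b", {}, ["world"])])

# Stand-ins for what two hypothetical parsers might report for that input.
good = Element("p", {"class": "x"}, ["Hello ", Element("b", {}, ["world"])])
bad = Element("p", {}, ["Hello ", Element("b", {}, ["world"])])  # attribute dropped
```

Here check(good, expected) holds and check(bad, expected) does not, so each
test case reduces to an input document plus an expected tree, with no need
to read the parser's source.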

James

On Mon, Oct 1, 2012 at 2:46 PM, Stephen D Green
<stephengreenubl@gmail.com> wrote:

> Say I want to have a more specialised parser: Perhaps all
> I want is a parser to evaluate a particular XPath or small set
> of XPath expressions and I want a minimally sized compiled
> library to do my parsing just for that purpose and no more.
> It would be nice to have a microXPath expression language
> which has as few as possible ways to represent a single XPath
> expression. Given such a beast, I'd like to be able to create
> ad hoc parsers specialised to a given microXPath expression
> evaluation - highly optimised for performance and compiled
> size. I'd like microXML to allow such a parser to be conformant
> without it having to include unnecessary code. I'd actually
> prefer that conformance not even care about what the abstract
> data model is. Even if I want a conformant parser which can
> evaluate any or all possible microXPath expressions on any
> microXML document, I'd like that parser not to have to conform
> to a particular data model because that might increase the
> parser's cost of development, size and complexity and reduce its
> performance.
>
> Another possible reason to couple the abstract data model more
> loosely to the microXML spec (which I regard as most useful
> in its specification of a syntax for microXML documents) is
> the matter of conformance testing. I'm not convinced (yet) that
> you can test conformance of the parser's abstract data model,
> since this is likely to be invisible, internal, private rather than
> visible, external, public (I admit my comparative naivety here).
>
> I'd like to see so-called 'test assertions' for microXML for
> conformance and interoperability testing and in producing
> these, I suspect, it might be found that aspects of the present
> spec's conformance clauses for parsers cannot be expressed
> as testable test assertions (or that such assertions might rely
> on human reading of the code base of a parser and so make
> fully automated testing of conformance based on such test
> assertions too expensive or impracticable).
>
> One suggestion I could make is to call the present spec, if
> the above doesn't get acceptance as enough reason to
> change it to any greater degree, something like microXML
> - Xyz Data Model specification (where Xyz might be 'Tree'
> or 'Hierarchical' or even 'Compound' or just something
> like 'Level 1'). This would 1) indicate that there might follow
> some specs for other data models - and leave room for such
> and 2) mean that a conforming parser need only claim
> conformance to this particular data model. Better, I think,
> might be to add to the conformance section either a
> placeholder note or a conformance clause to cater for more
> specialised microXML parsers (such as my description above
> of a parser optimised for evaluating general or specific XPath
> expressions on a microXML document).
> ----
> Stephen D Green
>
>
>
> On 28 September 2012 16:49, John Cowan <cowan@mercury.ccil.org> wrote:
>
>> Stephen D Green scripsit:
>>
>> > Haven't there already been several different abstract data models
>> > put forward for XML?
>>
>> Yes, but XML is a complex standard and there are lots of things which
>> might be of interest.  The XML Infoset is an attempt to give standard
>> names to some of those things, though there are plenty more which are
>> left out.  The PSVI could be used to report DTD information, but nobody
>> does.
>>
>> MicroXML is so trivial that it's not very interesting to provide
>> alternative data models.  You could, for example, leave out attributes,
>> but it's simpler just to ignore them if you don't care about them.
>> Similarly, you could report on lexical minutiae, but there are only a
>> few: single vs. double quotes and whether character references are used
>> are the only ones I can think of.
>>
>> > Can't we have parsers for MicroXML which support a variety of data
>> > models?
>>
>> In principle, I suppose, but to what purpose?  MicroLark supports push
>> parsing (SAX-style), pull parsing (StAX-style), and tree building, but
>> only one data model, namely that there is one element object for each
>> element in the document, and it contains a name (a string), an attribute
>> map from names to strings, and a sequence of children which are either
>> strings or element objects, all of which must be reported.
>>
>> > I also came across mention of 'compounds' as an alternative
>> > abstract data model for XML - may a parser not implement such if
>> > it wants to claim to be conformant?
>>
>> The MicroXML data model is a simple subset of the compound model.
>> To represent MicroXML in the obvious way, you'd have two kinds of
>> compounds, element compounds and textual compounds.  An element compound
>> has a STRING representing the element name, a TAG marking it as meta,
>> a DIRECTORY mapping attribute names (textual compounds) to attribute
>> values (also textual compounds), a KEY SET containing all the keys in the
>> DIRECTORY, and a LIST consisting of the children.  A textual compound
>> has a STRING representing the text, a TAG marking it as a text string,
>> and an empty DIRECTORY, KEY SET, and LIST.  So a parser reporting these
>> compounds would fully instantiate the MicroXML data model.
>>
>> <http://www.cl.cam.ac.uk/research/security/dendros/compounds-poster.pdf>
>> gives a brief explanation of these terms.
>>
>> --
>> John Cowan  cowan@ccil.org   http://www.ccil.org/~cowan
>> Dievas dave dantis; Dievas duos duonos          --Lithuanian proverb
>> Deus dedit dentes; deus dabit panem             --Latin version thereof
>> Deity donated dentition;
>>   deity'll donate doughnuts                     --English version by Muke Tever
>> God gave gums; God'll give granary              --Version by Mat McVeagh
>>
>
>
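
The compound representation John Cowan describes in the quoted message can
be sketched roughly as follows.  This is an illustration under stated
assumptions, not the compounds specification: the Compound class and the
text()/element() helpers are hypothetical names, the DIRECTORY is assumed
to map attribute names to attribute values, and its keys are simplified to
plain strings rather than textual compounds.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class Compound:
    string: str   # STRING: the element name, or the text itself
    tag: str      # TAG: "meta" for an element, "text" for a text string
    # DIRECTORY: attribute name -> attribute value (a textual compound).
    # Simplification: keys are plain str here, not textual compounds.
    directory: Dict[str, "Compound"] = field(default_factory=dict)
    # LIST: the children, in document order.
    children: List["Compound"] = field(default_factory=list)

    @property
    def key_set(self) -> FrozenSet[str]:
        """KEY SET: all keys present in the DIRECTORY."""
        return frozenset(self.directory)


def text(s: str) -> Compound:
    """Textual compound: empty DIRECTORY, KEY SET, and LIST."""
    return Compound(s, "text")


def element(name: str, attrs: Dict[str, str], children: List[Compound]) -> Compound:
    """Element compound built from a name, attributes, and child compounds."""
    return Compound(name, "meta", {k: text(v) for k, v in attrs.items()}, list(children))


# <p class="x">Hello</p> expressed as compounds.
doc = element("p", {"class": "x"}, [text("Hello")])
```

Every piece of the MicroXML data model (name, attributes, ordered children)
has a place in this structure, which is the sense in which a parser
reporting compounds would still instantiate the model.
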
Received on Monday, 1 October 2012 08:53:15 UTC