Re: Intent of ER-XML

On 2/26/2012 6:26 PM, David Lee wrote:
> But how do we solve this magical goal of a 'drop in replacement' for an XML parser ?
> I suggest it is impractical to do this by defining a "Processor".
> But rather by defining the abstract rules (in whatever meta-language or model you like).

Let me support these abstract rules with a concrete example :)

I build a streaming parser/serializer that takes EDI in on one side and outputs
XML on the other, and takes XML in and outputs EDI. It allows XML tools, which
are rich, plentiful, and powerful, to manipulate EDI without ever knowing it's
EDI.

But in order to do this, I've got to support Streams, Readers, Writers, StAX, SAX,
DOM, etc.

The only realistic way I can imagine an XML-ER processor being defined in a useful
and portable way would be kind of like my EDI engines. A list of rules that take
in something that isn't XML (XML-ish, if you will), and turn it into XML.

Now, maybe I implement that by bolting an XML-ER parser to an XML interface, or
maybe I implement it as a pre-parser that takes a stream and delivers a stream
that a bona-fide XML parser inhales. I would imagine that I should be able to use
the XML-ER spec either way, no?
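
To make the pre-parser option concrete, here is a minimal sketch in Python. It is
not taken from any XML-ER draft: the single repair rule (escaping a bare "&") and
the names RULES, pre_parse, and parse_xmlish are assumptions made up for
illustration. It also works on whole strings rather than true streams, and it
ignores CDATA sections and comments.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule (an assumption, not from any XML-ER draft): an "&" that
# does not begin an entity or character reference is escaped as "&amp;".
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Replace each bare '&' with '&amp;', leaving existing references alone."""
    return _BARE_AMP.sub("&amp;", text)


# The "list of rules": each rule maps XML-ish text to text closer to XML.
RULES = [escape_bare_ampersands]


def pre_parse(xmlish):
    """Apply every rule in order and return the repaired text."""
    for rule in RULES:
        xmlish = rule(xmlish)
    return xmlish


def parse_xmlish(xmlish):
    """Pre-parse, then hand the result to an ordinary, unmodified XML parser."""
    return ET.fromstring(pre_parse(xmlish))
```

The same RULES list could just as well be applied inside a parser; the point of
the sketch is only that the rules are stated separately from whatever consumes
them.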

> This still doesn't give us a 'drop in' replacement for an XML Processor, but what it does give us is
>
> A) A consistent statement of how an XML Processor can support "XML-ER" rules
> B) The ability for an implementer to implement such rules however they want,  either by
> retrofitting their existing parser or by writing a new one but still have well defined semantics.
> C) The ability to write a pre-processor which feeds into existing processors.   This is unlikely to be ideal for performance, but it's extremely useful, especially for new specs.  E.g. this is how C++ was originally created.  It was done as a preprocessor for C.  It wasn't particularly efficient, but it allowed the language to get into the hands of developers to play with, which then encouraged vendors to start writing 'native' C++ processors.
> It also allowed the early implementations to be vastly simpler as they only had to do the "C++" stuff, and could hand off to existing mature implementations parts that were not C++ specific, like linkers, assemblers, assembly code generation, optimizers etc.
>
> This train of logic leads me to be inclined to not wanting to define a "Processor" either.
> I suggest it's vastly more work, less useful, and less likely to be adopted than if we define a more abstract set of rules for how to map a set of inputs to well-formed output, in the form of abstract data types. (There is no requirement for a serialization format for this 'output'; rather, it is specifically defined so that a common, but by no means *only*, use case would be an XML Processor/parser that just implements these additional rules within its own framework.)
>
> That doesn't mean that one could not write a "Processor" that implements these rules.  In fact I suspect initial implementations would - but what benefit in chaining ourselves into that requirement ?

+1.


>
>
>
> -----------------------------------------------------------------------------
> David Lee
> Lead Engineer
> MarkLogic Corporation
> dlee@marklogic.com
> Phone: +1 650-287-2531
> Cell:  +1 812-630-7622
> www.marklogic.com
>
>> -----Original Message-----
>> From: David Carlisle [mailto:davidc@nag.co.uk]
>> Sent: Sunday, February 26, 2012 5:34 PM
>> To: public-xml-er@w3.org
>> Subject: Re: Intent of ER-XML
>>
>> On 26/02/2012 22:21, Noah Mendelsohn wrote:
>>> but fwiw my intuition is that the layering of the specifications
>>> would be better if we first documented the mapping from input to
>>> output, without describing in detail any particular piece of
>>> software that might implement such a mapping.
>>
>> Maybe there is a terminology clash somewhere, as I would say that the
>> current draft meets that description. (If you view DOM references in a
>> sufficiently abstract way). It basically defines a mapping from an input
>> string of unicode characters to an abstract tree representation. It
>> doesn't (or need not) define any API to interact with that tree
>> (although if the tree is described using the DOM there is an obvious
>> mapping to the DOM API).
>>
>> What David Lee was (I think) asking for is something more, a mapping
>> defined from an input string to the string representation of a document
>> matching the productions in the XML 1.x spec that should work somehow
>> without needing a full parser being specified (and, presumably run) on
>> the input stream. I don't have any philosophical objection to such a
>> system (I often edit xml without putting it through a full xml parse
>> with Emacs lisp or perl or whatever) but in this case I can't imagine
>> how it would work as any way I can imagine fixing up the result tree
>> involves finding out what was wrong with the input tree by parsing it.
>>
>> David
>>
>

Received on Monday, 27 February 2012 03:14:06 UTC