Re: Proposed extension of data model

Is it okay that your "MicroXML parser" can produce a data model that
does not correspond to any valid MicroXML document?  Is the plan to
have such a model serialize to invalid MicroXML that your parser will
parse back to the same data model?

Are you going to create a BNF for the language that your parser
handles?  I know that document ::= [#x0-#x10FFFF]* is a BNF for the
language that your parser handles, but it would be awesome to have one
that suggests what choices your parser makes as it parses.

The reason I ask is that if multiple people write error-handling
parsers, it'd be fantastic if they gave the same results for the same
input.  I'd like to add error handling to my Haskell parser.
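
To make the question concrete, here is a rough Haskell sketch of the
data model and of a check for the restrictions James lists below.  All
of the names are mine, not from the spec.  The character predicates
are simplified ASCII-only stand-ins for the nameStartChar, nameChar
and char productions, so treat it as an illustration, not a
conforming implementation.

```haskell
import Data.Char (isAlpha, isAscii, isDigit)

-- An element is a name, an attributes map and a content list.
-- The association list is a stand-in for the attributes map.
data Element = Element
  { elemName    :: String
  , elemAttrs   :: [(String, String)]
  , elemContent :: [Content]
  } deriving (Show, Eq)

-- A content list member is either character data or a child element.
data Content = Chars String | Child Element
  deriving (Show, Eq)

-- ASSUMPTION: ASCII-only approximation of the nameStartChar production.
isNameStartChar :: Char -> Bool
isNameStartChar c = isAscii c && (isAlpha c || c == '_')

-- ASSUMPTION: ASCII-only approximation of the nameChar production.
isNameChar :: Char -> Bool
isNameChar c = isNameStartChar c || isDigit c || c `elem` "-."

-- ASSUMPTION: crude approximation of the char production; it only
-- rejects C0 control characters other than tab, LF and CR.
isDataChar :: Char -> Bool
isDataChar c = c `elem` "\t\n\r" || c >= ' '

-- A name item is non-empty, starts with a nameStartChar and continues
-- with nameChars.
validName :: String -> Bool
validName []       = False
validName (c : cs) = isNameStartChar c && all isNameChar cs

-- True when the model satisfies the restrictions that hold for a
-- model produced by parsing a MicroXML document.
satisfiesMicroXml :: Element -> Bool
satisfiesMicroXml (Element name attrs content) =
  validName name && all validAttr attrs && all validContent content
  where
    validAttr (k, v)       = validName k && k /= "xmlns" && all isDataChar v
    validContent (Chars s) = all isDataChar s
    validContent (Child e) = satisfiesMicroXml e
```

An error-handling parser could return an Element for which
satisfiesMicroXml is False, which is the case I am asking about.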

Thanks,
     Chris

On Sun, Nov 18, 2012 at 2:32 AM, James Clark <jjc@jclark.com> wrote:
> On Sun, Nov 18, 2012 at 10:22 AM, John Cowan <cowan@mercury.ccil.org> wrote:
>>
>> James Clark scripsit:
>>
>> > I would like to change the Data Model section of the spec to
>> > separate out the aspects of the data model that are purely there
>> > as a result of the syntactic constraints of MicroXML.
>>
>> Fair enough.  In that case, I propose that the following text in Section 2
>> of the current draft:
>>
>>     A name item is a non-empty string. The first character in the
>>     string MUST match the production nameStartChar, and any subsequent
>>     characters MUST match the production nameChar. In addition, a name
>>     item occurring as a key in an attributes map MUST not be xmlns.
>>
>>     Any character occurring in the value of an attributes map or as
>>     a member of a content list MUST match the production char.
>>
>> be reduced to the first sentence only.
>
>
> I was thinking of something along the following lines.
>
>> Note that this data model is intended to be useful not just for MicroXML.
>> For example, it can also be used to represent XML or HTML documents.
>> When the data model results from parsing a MicroXML document, it will
>> satisfy the following restrictions:
>> - the first character in a name item will match the production
>>   nameStartChar;
>> - the second and subsequent characters in a name item will match the
>>   production nameChar;
>> - a name item occurring as a key in an attributes map will not be xmlns;
>> - a character occurring in the value of an attributes map or as a member
>>   of a content list will match the production char.
>
>
> James

Received on Monday, 19 November 2012 09:05:54 UTC