Re: MicroXML parser in JavaScript from James Clark on 2012-09-24 (public-microxml@w3.org from September 2012)

From: James Clark <jjc@jclark.com>
Date: Mon, 24 Sep 2012 17:04:31 +0700
To: John Cowan <cowan@mercury.ccil.org>
Cc: public-microxml@w3.org
Message-ID: <CANz3_EaJYM-Fhckjs-U8qqx6FX2KbWFdeJX_9+ogUEEortyNFg@mail.gmail.com>

On Mon, Sep 24, 2012 at 3:10 PM, John Cowan <cowan@mercury.ccil.org> wrote:

>
> I fed the MicroLark test suite, which is derived from the W3C XML test
> suite, through it.  All the "good" files are parsed and all but two of
> the "bad" files generate errors.  What I don't know yet is if the "good"
> files are parsed correctly.
>

Thanks for doing this. I started working on a test suite for my parser. I'm
including the JSON output for the data model in the test cases.

I think it would be useful for this CG to collaboratively create and
maintain a test suite.  We can use the W3C DVCS system for this (I believe
every CG member has commit rights).  The first step would be to agree on
what format to use.  I think it ought to include a way to check that good
files are parsed correctly.  The alternative to JSON syntax for the data
model would be to define Canonical MicroXML.

I started using the following format.  I'm representing each case as a JSON
object.  A "good" test case looks like this:

    {
        "id": "0001",
        "comment": "The most basic conforming MicroXML document",
        "source": "<doc></doc>",
        "result": ["doc",{},[]]
    }

A "bad" test case omits the "result" member.  The test suite is a JSON
array of such objects.
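
To make the format concrete, here is a minimal sketch of a harness that could consume such a suite. It assumes a parser entry point that takes the source string and either returns the data model or throws on a non-conforming document. The names runSuite, deepEqual and toyParse are hypothetical, toyParse is only a stand-in for a real parser, and case "0002" is a made-up bad case for illustration.

```javascript
// Structural equality for JSON-like values (strings, arrays, plain objects).
function deepEqual(a, b) {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    (k) => Object.prototype.hasOwnProperty.call(b, k) && deepEqual(a[k], b[k])
  );
}

// Runs every case and returns the ids of the cases that misbehaved.
// A case with a "result" member is "good": parsing must succeed and match.
// A case without one is "bad": parsing must throw.
function runSuite(suite, parse) {
  const failures = [];
  for (const testCase of suite) {
    const expectGood = Object.prototype.hasOwnProperty.call(testCase, "result");
    let actual;
    let threw = false;
    try {
      actual = parse(testCase.source);
    } catch (e) {
      threw = true;
    }
    const passed = expectGood
      ? !threw && deepEqual(actual, testCase.result)
      : threw;
    if (!passed) failures.push(testCase.id);
  }
  return failures;
}

// Hypothetical stand-in parser: accepts only "<name></name>" with ASCII names.
function toyParse(source) {
  const m = /^<([A-Za-z_][A-Za-z0-9_]*)><\/\1>$/.exec(source);
  if (!m) throw new Error("not well-formed");
  return [m[1], {}, []];
}

const suite = [
  {
    id: "0001",
    comment: "The most basic conforming MicroXML document",
    source: "<doc></doc>",
    result: ["doc", {}, []],
  },
  // Made-up bad case: no "result" member, so the parser must reject it.
  { id: "0002", comment: "Unclosed start-tag", source: "<doc>" },
];

const failures = runSuite(suite, toyParse);
console.log(
  failures.length === 0 ? "all cases passed" : "failed: " + failures.join(", ")
);
```

Comparing structurally rather than by serialized string means the order of attribute names in the expected object does not matter.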

> The two "bad" files in question are both of the form "<?A/>", where
> ? represents a character which is a nameChar but not a nameStartChar.
> The specific characters are #x47 and #x300.

Thanks.  I found a typo in the regex I was using for
nameChar/nameStartChar. (#x47 is Latin Capital Letter G, so I think you
must have meant some other character.)
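
A quick way to check individual code points is sketched below. It is not the regex from my parser. It assumes the name character classes are those of XML 1.0 (Fifth Edition) with the colon left out, and the names NAME_START, NAME_EXTRA and classify are hypothetical.

```javascript
// Character class bodies, assumed to follow XML 1.0 (Fifth Edition)
// NameStartChar / NameChar, minus ":".
const NAME_START =
  "A-Za-z_\\u{C0}-\\u{D6}\\u{D8}-\\u{F6}\\u{F8}-\\u{2FF}" +
  "\\u{370}-\\u{37D}\\u{37F}-\\u{1FFF}\\u{200C}-\\u{200D}" +
  "\\u{2070}-\\u{218F}\\u{2C00}-\\u{2FEF}\\u{3001}-\\u{D7FF}" +
  "\\u{F900}-\\u{FDCF}\\u{FDF0}-\\u{FFFD}\\u{10000}-\\u{EFFFF}";
// Characters allowed in a name but not as its first character.
const NAME_EXTRA = "\\-.0-9\\u{B7}\\u{300}-\\u{36F}\\u{203F}-\\u{2040}";

const nameStartChar = new RegExp("^[" + NAME_START + "]$", "u");
const nameChar = new RegExp("^[" + NAME_START + NAME_EXTRA + "]$", "u");

// Classifies a single code point against the two classes above.
function classify(codePoint) {
  const ch = String.fromCodePoint(codePoint);
  if (nameStartChar.test(ch)) return "nameStartChar";
  if (nameChar.test(ch)) return "nameChar only";
  return "neither";
}

console.log(classify(0x47));  // "nameStartChar" (Latin Capital Letter G)
console.log(classify(0x300)); // "nameChar only" (Combining Grave Accent)
```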

James

Received on Monday, 24 September 2012 10:05:24 UTC