- From: James Clark <jjc@jclark.com>
- Date: Mon, 24 Sep 2012 17:04:31 +0700
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: public-microxml@w3.org
- Message-ID: <CANz3_EaJYM-Fhckjs-U8qqx6FX2KbWFdeJX_9+ogUEEortyNFg@mail.gmail.com>
On Mon, Sep 24, 2012 at 3:10 PM, John Cowan <cowan@mercury.ccil.org> wrote:

> I fed the MicroLark test suite, which is derived from the W3C XML test
> suite, through it. All the "good" files are parsed and all but two of
> the "bad" files generate errors. What I don't know yet is if the "good"
> files are parsed correctly.

Thanks for doing this.

I started working on a test suite for my parser. I'm including the JSON output for the data model in the test cases.

I think it would be useful for this CG to collaboratively create and maintain a test suite. We can use the W3C DVCS system for this (I believe every CG member has commit rights).

The first step would be to agree on what format to use. I think it ought to include a way to check that good files are parsed correctly. The alternative to JSON syntax for the data model would be to define Canonical MicroXML.

I started using the following format. I'm representing each case as a JSON object. A "good" test case looks like this:

{
  "id": "0001",
  "comment": "The most basic conforming MicroXML document",
  "source": "<doc></doc>",
  "result": ["doc",{},[]]
}

A "bad" test case omits the "result" member. The test suite is a JSON array of such objects.

> The two "bad" files in question are both of the form "<?A/>", where
> ? represents a character which is a nameChar but not a nameStartChar.
> The specific characters are #x47 and #x300.

Thanks. I found a typo in the regex I was using for nameChar/nameStartChar. (#x47 is Latin Capital Letter G, so I think you must have meant some other character.)

James
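A minimal sketch of how a harness might consume the test-suite format described in the message, written in Python. Everything beyond the "id", "comment", "source" and "result" members is an assumption made for illustration: the `run_suite` and `toy_parse` names, the convention that a parser raises `ValueError` on non-conforming input, and the "bad" case with id "0002" are all hypothetical and are not taken from any actual parser or suite.

```python
import json


def run_suite(suite_json, parse):
    """Run every case in the suite and return a list of (id, passed) pairs.

    `parse` is a hypothetical callable: it returns the JSON data model for
    conforming input and raises ValueError for anything else.
    """
    outcomes = []
    for case in json.loads(suite_json):
        try:
            actual = parse(case["source"])
        except ValueError:
            # An error is the right outcome only for a "bad" case,
            # that is, one with no "result" member.
            passed = "result" not in case
        else:
            # A "good" case must parse and match the expected data model.
            passed = "result" in case and actual == case["result"]
        outcomes.append((case["id"], passed))
    return outcomes


def toy_parse(source):
    # Stand-in for a real MicroXML parser: accepts exactly one document.
    if source == "<doc></doc>":
        return ["doc", {}, []]
    raise ValueError("not conforming")


# Case 0001 follows the example in the message; case 0002 is invented here.
SUITE = """[
  {"id": "0001", "comment": "The most basic conforming MicroXML document",
   "source": "<doc></doc>", "result": ["doc", {}, []]},
  {"id": "0002", "comment": "Unclosed element (hypothetical bad case)",
   "source": "<doc>"}
]"""

if __name__ == "__main__":
    for case_id, passed in run_suite(SUITE, toy_parse):
        print(case_id, "pass" if passed else "FAIL")
```

Comparing the parsed value against "result" is what lets such a harness check that "good" files are parsed correctly, not merely accepted.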
Received on Monday, 24 September 2012 10:05:24 UTC