- From: Jo Rabin <jrabin@mtld.mobi>
- Date: Mon, 12 Mar 2007 07:05:56 -0400
- To: "Sean Owen" <srowen@google.com>
- Cc: <public-mobileok-checker@w3.org>
Hi

Further comments in line

Jo

> -----Original Message-----
> From: Sean Owen [mailto:srowen@google.com]
> Sent: 08 March 2007 22:14
> To: Jo Rabin
> Cc: public-mobileok-checker@w3.org
> Subject: Re: Requirements for mobileOK reference checker
>
> All sounds good to me. A few comments in line.
>
> On 3/7/07, Jo Rabin <jrabin@mtld.mobi> wrote:
>
> > [4.2] Input to the checker will be specified by URI [should we consider
> > a literal string as well? given that the checker needs to check most
> > external references, would this in fact be useful? Yes, if the tests
> > relating to external references are skipped, or if the base uri can be
> > supplied]
>
> I think it would be nice to accept a string (well, really byte
> sequence). Without HTTP headers, some tests will definitely fail
> though. So one would have to accept HTTP headers. So I start to
> question how useful it is.

I think the key point is that it would be nice to offer a 'tools' interface. i.e. try to encourage people to submit stuff _before_ committing to their server. I know there are other ways of doing this and I know that this will be far from complete but nonetheless think it could be useful.

>
> > [4.3] The checker will be written in Java and provide a programmatic
> > interface with bindings initially to Java [and SOAP?]. [Are we going to
> > write something other than Javadoc by way of documentation/design?]
>
> Javadoc is good.
> I had had in mind that this would be a Java implementation which could
> be embedded in, say, Tomcat/Axis to expose them as a SOAP service, but
> that that is a separate project.

I wonder how much of an obstacle to porting the 'the code is the documentation' approach presents?

>
> > [4.4] The checker development project will not develop a user interface
> > except as necessary for testing it, but the use case of its deployment
> > in a human request / response environment should be borne in mind.
> > Specifically this should not be seen as a project to create the W3C
> > mobileOK checker.
>
> I agree, but with the understanding that the very next project should
> be to make it the new backend of validator.w3.org/mobile
>

Well, that is up to Dom / W3C isn't it?

> > [4.5] The checker will create an intermediate document that makes
> > available for inspection all details of retrievals, validation and other
> > pre-processing required in order to carry out the tests. The format of
> > this intermediate document will be specified separately, and will use
> > existing representations [like RDF/HTTP] where possible.
> > [and per resolution of 26 Feb from an API perspective this needs also to
> > be available as DOM or SAX-wise or as a Java class?]
>
> I think the results should be available as a DOM (and thus as a
> document), and also in a native Java class representation.

We need to decide what the primary representation is, I think. I'd prefer this to be documented as a Schema and have a mapping from the schema to Java native class rather than vice versa.

>
> > [4.5.2] To allow processing of mal-formed and invalid primary input
> > documents (those that are the subject of the test, rather than resources
> > that are referenced) the pre-processing will provide a 'cleaned up'
> > version [whose xml header and Doctype declaration at least, will need to
> > have some magic performed to allow inclusion in the middle of the
> > document] and that the nature of the clean-up needs to be explicit and
> > not implementation dependent [i.e. using Tidy is all very well, but it
> > is opaque in its operation; from this pov, perhaps we should look more
> > closely at Dom's suggestion of http://home.ccil.org/~cowan/XML/tagsoup/
> > which (I think) operates on the basis of explicit rules which can be
> > captured and repeated]
>
> This is a tough call to me... if you tidy a doc then you are testing
> something different than what you really got. You wouldn't want to
> pass the cleaned-up doc when the raw one would fail.
>
> The idea is, I imagine, to fail the raw document but additionally say,
> oh, if you cleaned it up a bit here's some more results you could get.
> Nice idea.

Yes, it should definitely FAIL on hard FAILs. Yes, this is really to answer the point in mobileOK about giving maximum possible info to developer. i.e. to try to prevent them fixing problem 1, rechecking, fixing problem 2, rechecking and so on. It will be imperfect whatever we do, of course.

>
> I wonder if it makes sense to consider this external -- you're free to
> tidy your doc before passing it through if you want. Or consider it an
> option -- run the test in lenient mode? I am torn on whether the
> complexity and confusion is worth it for documents that already can't
> even get their markup right.

Per above

>
> > [4.5.5] HTTP parameters and their values should be recorded in a
> > normalised form as well as being recorded in their original form.
>
> Headers? or why do we care about URI parameters?
> if headers what is the normalized form like? I think we should report
> and test on the real header value.

Sorry, I did not mean parameters. I meant headers, as you correctly inferred. My point is that it makes post-processing easier if you always report Content-Type as that and not as content-type if that is what the server actually returned. Equally, if we are to use HTTP-in-RDF then we'd want to know what had been transformed in order to arrive at the processed RDF representation.

>
> > [4.7] It must be possible to add tests without recompiling the checker.
>
> Yes, well I thought the idea is that the implementation should
> externalize enough information that external entities can reuse that
> information to write more tests. I don't imagine one would extend the
> implementation by actually modifying it.
>

Well that would be one way of meeting the requirement :-)

> > [4.8] It must be possible to replace sub-components (such as remote
> > validation steps) by configuration option.
>
> What does this mean, just that there needs to be some configurable
> behavior? I agree though want to be careful that a PASS means
> something clear -- not "PASS, but if you set this option" but
> definitely "PASS"

PASS is always conditional on the processing you have done. If you use 'ropey-old-validator-that-barfs-on-the-wrong-stuff' then you have a different meaning of PASS than if you use 'industry-standard-and-most-up-to-date-validator'. So I think this is why the validation steps need to be named, reported on and open to configuration.

Jo
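
To illustrate the [4.5.5] point about recording each header in both its original and a normalised form, here is a minimal sketch in Java. It is only one possible reading, not an agreed design: the class names RecordedHeader and HeaderNormalizer are hypothetical, and the normalisation rule (capitalise the first letter of each hyphen-separated token, lower-case the rest) is an assumption chosen because HTTP header names are case-insensitive. The value is deliberately left untouched, so tests can still run against what the server really sent.

```java
/** One HTTP header as received, plus a normalised spelling of its name (hypothetical class). */
final class RecordedHeader {
    final String originalName;   // exactly as the server sent it, e.g. "content-type"
    final String normalizedName; // assumed canonical spelling, e.g. "Content-Type"
    final String value;          // recorded as received, not transformed

    RecordedHeader(String originalName, String value) {
        this.originalName = originalName;
        this.normalizedName = HeaderNormalizer.normalizeName(originalName);
        this.value = value;
    }
}

/** Hypothetical helper; the capitalisation rule below is an assumption, not a standard. */
final class HeaderNormalizer {
    private HeaderNormalizer() {}

    /** Upper-case the first character of each hyphen-separated token, lower-case the rest. */
    static String normalizeName(String name) {
        StringBuilder out = new StringBuilder(name.length());
        boolean startOfToken = true;
        for (char c : name.trim().toCharArray()) {
            out.append(startOfToken ? Character.toUpperCase(c) : Character.toLowerCase(c));
            startOfToken = (c == '-');
        }
        return out.toString();
    }
}
```

With this rule, new RecordedHeader("content-type", "text/html") keeps "content-type" as the original name and reports "Content-Type" as the normalised one. Names with unusual conventional casing (ETag, for instance) would come out as "Etag", so a real implementation might prefer a lookup table of known header names.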
Received on Monday, 12 March 2007 11:06:14 UTC