
RE: Requirements for mobileOK reference checker

From: Jo Rabin <jrabin@mtld.mobi>
Date: Tue, 13 Mar 2007 09:48:45 -0400
Message-ID: <815E07C915F39742A29E5587B3A7FA192A08D8D9@lk0-cs0.int.link2exchange.com>
To: "Sean Owen" <srowen@google.com>
Cc: <public-mobileok-checker@w3.org>

> -----Original Message-----
> From: Sean Owen [mailto:srowen@google.com]
> Sent: 13 March 2007 04:30
> To: Jo Rabin
> Cc: public-mobileok-checker@w3.org
> Subject: Re: Requirements for mobileOK reference checker

> > Yes, this is really to answer the point in mobileOK about giving as
> > much info as possible to the developer, i.e. to try to prevent them
> > fixing problem 1, rechecking, fixing problem 2, rechecking and so on.
> > It will be imperfect whatever we do, of course.
> I agree. I suppose I'd like to introduce this functionality as an
> explicit option -- lenient or strict mode with a default to strict? --
> to avoid confusion about what the outcome really is.
Right - that seems fine in principle - but needs to be considered in the
context of where the intermediate document sits in the processing, and
what optionality on executing tests is provided aside from this. 
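
A minimal sketch of how such a mode switch might look. It assumes one
reading of the proposal: "strict" stops at the first failing test, while
"lenient" runs every test so the developer sees all problems in one pass.
The names CheckResult, run_checks and overall_pass are hypothetical and
not part of any mobileOK specification or checker code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CheckResult:
    """Outcome of one named test (hypothetical structure)."""
    name: str
    passed: bool
    message: str = ""


def run_checks(document: str,
               checks: List[Callable[[str], CheckResult]],
               strict: bool = True) -> List[CheckResult]:
    """Run the checks in order.

    Assumption: in strict mode (the default) processing stops at the
    first failure; in lenient mode every check runs regardless.
    """
    results: List[CheckResult] = []
    for check in checks:
        result = check(document)
        results.append(result)
        if strict and not result.passed:
            break
    return results


def overall_pass(results: List[CheckResult]) -> bool:
    """The outcome is PASS only if every executed check passed."""
    return all(r.passed for r in results)
```

Under this reading the mode changes only how much is reported. A
document with any failing test is a FAIL in both modes.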

> > Sorry, I did not mean parameters. I meant headers, as you correctly
> > inferred. My point is that it makes post-processing easier if you
> > report Content-Type as that and not as content-type if that is what
> > the server actually returned.
> >
> > Equally, if we are to use HTTP-in-RDF then we'd want to know what has
> > been transformed in order to arrive at the processed RDF.
> Ah OK. I agree. One should know about the before-and-after form for
> normalizations that may have a non-trivial effect. One way is to include
> both the complete original body and headers in the output in some way.
> If the only info we're losing is case in header names (and they aren't
> case sensitive, right?) then maybe including the original HTTP headers
> is unnecessary. Maybe there are other possible normalizations I'm not
> aware of.
I think we should look at this very carefully. 

It seems [RFC2616 4.2] that field names of HTTP headers are not case
sensitive. That is often, though not universally, true of field-values.

I didn't spot anything in HTTP-in-RDF that talks about an approach to
normalisation of field values - clearly it would be extremely useful
when processing the values to know, for example, that "nocache" has been
normalised to lower case. It would also be convenient to know that white
space has been normalised.
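
A minimal sketch of recording both the original and the normalised form
of a header, together with a list of the normalisations applied. The
HeaderRecord structure and the normalisation labels are hypothetical and
are not taken from HTTP-in-RDF. The value's case is deliberately left
alone, since not all field values are case insensitive.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HeaderRecord:
    """One header in both its original and normalised forms (hypothetical)."""
    original_name: str
    name: str                        # lower-cased field name
    original_value: str
    value: str                       # whitespace-normalised field value
    normalisations: Tuple[str, ...]  # labels of the changes actually made


def normalise_header(raw_name: str, raw_value: str) -> HeaderRecord:
    applied: List[str] = []

    # Field names are case insensitive, so lower-casing loses no meaning.
    name = raw_name.lower()
    if name != raw_name:
        applied.append("name-lowercased")

    # Naive whitespace handling: trims the ends and collapses runs to a
    # single space. It does not treat quoted strings specially.
    value = " ".join(raw_value.split())
    if value != raw_value:
        applied.append("value-whitespace-collapsed")

    return HeaderRecord(raw_name, name, raw_value, value, tuple(applied))
```

A post-processor can then work from name and value, and consult
original_name, original_value and normalisations when it needs to know
what the server actually sent.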

From the point of view of making the intermediate document useful to
extension processors, I think that the field values should be parsed
into their components, where they have structure.

For example, I'm thinking that it would be nice if the pre-processing
took something like:

Accept: text/html;q=1.0, */*;q=0.01

and parsed it into something along the lines of

<header name="accept"><imt q="1.0">text/html</imt><imt

And again

Content-Type: application/xhtml+xml;charset=UTF-8

And parsed it into 

<header name="content-type"><imt

I know that is not HTTP-in-RDF - can it be extended to do that?
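
A minimal sketch of parsing a structured field value into its
components, using the Accept and Content-Type examples above. The
function name parse_media_ranges and its return shape are hypothetical.
The splitting is naive: it assumes no commas or semicolons inside quoted
strings, so it is not a full implementation of the HTTP grammar.

```python
from typing import Dict, List, Tuple


def parse_media_ranges(field_value: str) -> List[Tuple[str, Dict[str, str]]]:
    """Split a value such as 'text/html;q=1.0, */*;q=0.01' into
    (media type, parameters) pairs."""
    parsed: List[Tuple[str, Dict[str, str]]] = []
    for item in field_value.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()  # type and subtype are case insensitive
        if not media_type:
            continue  # skip empty list elements
        params: Dict[str, str] = {}
        for param in parts[1:]:
            key, sep, val = param.partition("=")
            if sep:
                # Parameter names are case insensitive; values are kept as sent.
                params[key.strip().lower()] = val.strip().strip('"')
        parsed.append((media_type, params))
    return parsed


accept = parse_media_ranges("text/html;q=1.0, */*;q=0.01")
content_type = parse_media_ranges("application/xhtml+xml;charset=UTF-8")
```

Each pair could then be serialised as one element of the intermediate
document, in whatever vocabulary is eventually chosen.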

> > > Yes, well I thought the idea is that the implementation should
> > > externalize enough information that external entities can reuse
> > > the information to write more tests. I don't imagine one would
> > > extend the implementation by actually modifying it.
> > >
> > Well that would be one way of meeting the requirement :-)
> Requirement met then, check.

Er, not so fast. Meeting the requirement is necessary but not

> > > What does this mean, just that there needs to be some configurable
> > > behavior? I agree though want to be careful that a PASS means
> > > something clear -- not "PASS, but if you set this option" but
> > > definitely "PASS"
> >
> > PASS is always conditional on the processing you have done. If you use
> > 'ropey-old-validator-that-barfs-on-the-wrong-stuff' then you have a
> > different meaning of PASS than if you use
> > 'industry-standard-and-most-up-to-date-validator'. So I think this is
> > why the validation steps need to be named, reported on and open to
> > configuration.
> I'm still not convinced on this one --
> True, and this is why this implementation and all it depends on are
> hopefully bug-free. 

And made of hen's teeth? :-)

> If an implementation has a bug vis-a-vis the spec,
> it needs to be fixed. Fixing it by letting someone swap it out on
> their own is less than ideal.
> If two implementations differ but neither is clearly wrong, one
> consults the reference implementation if it's important. So I think
> that using the reference parser is a good place to start -- the W3C
> one? but again letting you pick your reference implementation parser
> is letting you mix your own reference implementation of mobileOK, and
> I feel it's important to have One Clear Reference Implementation.
> Otherwise you have "mobileOK-based-on-Xerces" and
> "mobileOK-based-on-W3C-parser" and permutations thereof.

Yes, I am struggling with this too. I think it would be worth looking at
some scenarios. Like if I claim mobileOK today based on today's version
of the checker and then tomorrow you update the checker in a way that
makes me FAIL, that might reflect badly on me if someone wasn't able to
verify that I had made my claim in good faith.

It's clear to me that the reference checker may be wrong. And if it is,
then I need to be able to say that my claim is made in good faith
because as far as I am concerned my XHTML is perfectly valid and I am
not going to make it wrong just to get it through the reference checker.
So if I say to the reference checker this is Valid-a-la-Xerces it ought
to be able to verify that. I think.

It's possible that this is something that could be resolved by a clearer
definition of 'Valid' in mobileOK. I suspect, though, that this may be a
slippery concept. And rather than trying to boil the philosophical ocean
on this, a pragmatic definition may be a better route.

Received on Tuesday, 13 March 2007 13:48:57 UTC
