Comments on WD-HTTP-in-RDF-20070301

Hi

A number of us from the BPWG are working on a mobileOK checker. We're keen
to use HTTP-in-RDF as part of the representation of what we are calling the
'intermediate document' - i.e. a document that results from the
pre-processing of the resource under test, which involves retrieving linked
relevant resources and testing them for validity.

In the course of working to define that document format, a number of issues
have arisen which may be of interest to you.

I should point out that these views are mine and not those of the BPWG or
the checker group.

Thanks, and hope it's useful

Jo

1. Is it in scope of this work to represent a request that fails outside of
HTTP - e.g. the response is not valid HTTP, or the TCP connection fails for
some reason or another? It would be convenient to have a consistent
representation of success and failure cases.

2. It would be useful to timestamp requests and responses.

3. I understand that there is an extension mechanism in HTTP for the request
method. Is this modelled in this specification?

4. It's potentially useful to record both the absolute URI used in a request
and the relative URI that was used to form it - e.g. when checking links
from an HTML document.

5. I'm not clear as to what normalisation is pre-supposed on the contents of
the various header field values. For our purposes it would be useful to have
those values in a normal form, where possible. Equally it would be useful,
for audit purposes, to have a literal representation of the unprocessed
headers.

6. It would be useful for those header field values that have structure to
be represented so that their components are exposed in a way that allows
easy access via XPath expressions.

7. It's a little inconvenient to have two different representations for
Headers. Is it an error to use an additionalHeader object where a specific
object could have been used?

8. You provide a linkage between a request and its response - it might be
useful to provide also a linkage between a response and a request (either
the one that it relates to, or more interestingly, perhaps, a request that
was triggered by a redirect or by following a link within the body).

9. On the representation of the Body
a. When you say 'XML?' in the flow chart, do you mean that the content-type
indicates XML or do you mean that the content is well-formed XML?
b. If the content is XML delivered with the content type text/html then is
this considered XML?
c. Isn't there the possibility that a malformed document would break this
document when included?
d. Is there an issue with the use of a CDATA section? What happens if the
data itself contains a CDATA section?
e. Is there an issue with transparency of data - in that if the body itself
contains the literal string </http:body> does this cause a problem?

10. HTTP Response code - what should one do if the response code is not one
of those enumerated?

11. It would be useful to record the size of the headers and the body.

12. I just realised that the connection structure is intended for modelling
the requests on a keep-alive connection ... the order of the requests is
significant, I suppose?
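
To illustrate point 4, here is a minimal sketch of the kind of record I have
in mind, keeping the reference as written alongside the absolute URI it
resolves to. The function name, the dictionary keys and the example.org URIs
are made up for illustration and are not taken from the draft.

```python
from urllib.parse import urljoin


def resolve_link(base_uri, reference):
    """Keep the reference as it appeared in the document together with
    the absolute URI obtained by resolving it against the base URI."""
    return {
        "reference": reference,                    # as written in the HTML
        "absolute": urljoin(base_uri, reference),  # as used in the request
    }


# Hypothetical link found in a hypothetical HTML document.
record = resolve_link("http://example.org/docs/index.html", "../img/logo.png")
print(record["reference"], "->", record["absolute"])
```

Recording both values would let a checker report the link in the form the
author wrote it while still identifying the request that was actually made.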

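To illustrate points 9d and 9e, here is a minimal sketch of one common
workaround, which is not something the draft describes: split every "]]>" in
the body across two adjacent CDATA sections. The helper name and the
namespace URI below are placeholders of mine, not the draft's actual values.

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(text):
    """Wrap text in CDATA, splitting each "]]>" across two sections so
    that the terminator never appears inside a single section."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# A body that contains both a CDATA section and the literal closing tag.
body = "<p>x</p><![CDATA[inner]]></http:body>"

# "urn:example:http" is a placeholder namespace, not the one in the draft.
doc = ('<http:body xmlns:http="urn:example:http">'
       + wrap_in_cdata(body)
       + "</http:body>")

# The parser joins the adjacent sections back into the original text.
recovered = ET.fromstring(doc).text
print(recovered == body)
```

This only addresses character data that is legal in XML. A body containing
bytes or control characters that XML does not allow would still need some
other encoding.
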
Received on Wednesday, 14 March 2007 16:10:47 UTC