Re: Comments on WD-HTTP-in-RDF-20070301

Hi Jo,

Thank you for your comments on our internal draft. We were about to
publish this document as an updated Working Draft in the next few days,
so don't be surprised if you see it pop up. We will address your
comments either before or after publication. Please find some initial
responses to your comments inline below:


Jo Rabin wrote:
> 1. Is it in scope of this work to represent when a request fails outside of
> HTTP - e.g. the response is not valid HTTP, or the TCP connection fails for
> some reason or another. It would be convenient to have a consistent
> representation of success and failure cases.

The current scope is strictly focused on recording the request/response 
messages rather than making any statements about them (such as "valid", 
"successful", "failed", etc.). Can you elaborate a use case scenario to 
help us consider such a scope extension?


> 2. It would be useful to timestamp requests and responses.

This also pushes the scope a little, even though it seems useful. I'll
bring this back to the group.


> 3. I understand that there is an extension mechanism in HTTP for the request
> method. It this modelled in this specification?

Do you mean for providing additional headers or something else? It would
be good if you could give a pointer to an RFC or similar.


> 4. It's potentially useful to record both the absolute URI used in a request
> and the relative URI that was used to form it - e.g. when checking links
> from an HTML document.

Currently we only record whatever was sent to/from the server without
any transformation or interpretation. Even an expansion such as this is
left to the application level.
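
As a rough illustration of what "left to the application level" could
look like, here is a minimal Python sketch in which a link checker keeps
both the link as found and the absolute URI derived from it. The
function name and the example URIs are made up for illustration; nothing
here is part of the draft.

```python
from urllib.parse import urljoin


def expand_link(base_uri: str, link: str) -> dict:
    """Keep the link as found in the document and its absolute form."""
    # urljoin resolves `link` against `base_uri` following the usual
    # relative-reference resolution rules.
    return {"relative": link, "absolute": urljoin(base_uri, link)}


# Hypothetical example: a link found in http://example.org/docs/page.html
record = expand_link("http://example.org/docs/page.html", "../img/logo.png")
print(record["relative"], "->", record["absolute"])
```

The application would then record only the absolute URI in the request
data and keep the relative form in its own bookkeeping.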


> 5. I'm not clear as to what normalisation is pre-supposed on the contents of
> the various header field values. For our purposes it would be useful to have
> those values in a normal form, where possible. Equally it would be useful,
> for audit purposes, to have a literal representation of the unprocessed
> headers.

Can you reformulate the question? I'm not sure I quite understand it.


> 6. It would be useful for those header field values that have structure to
> be represented so that their components are exposed in a way that allows
> easy access via XPATH expressions.

Do you mean pre-parsing the literal values and expressing them in RDF 
vocabulary? For example, currently the content-type header contains a 
literal value like "application/xhtml+xml". This is the string sent by 
the server as-is. Do you mean having explicit RDF terms for certain 
values such as "application/xhtml+xml"?


> 7. It's a little inconvenient to have two different representations for
> Headers. Is it an error to use an additionalHeader object where a specific
> object could have been used?

Yes, we should elaborate on this (possibly in the conformance section):
additionalHeader is only intended to be used for expressing headers
that are not listed in the schema.


> 8. You provide a linkage between a request and its response - it might be
> useful to provide also a linkage between a response and a request (either
> the one that it relates to, or more interestingly, perhaps, a request that
> was triggered by a redirect or by following a link within the body).

The first case should be covered by RDF: you can query the data to find
all requests that are related to a response.

As to the latter case, this would require an interpretation of the
content and is currently out of scope. Specifically, different user
agents may react differently to a response, for example by loading or
ignoring linked CSS pages, RSS feeds, etc. It would be quite tricky to
relate such requests and responses to each other, so currently we only
record the mere interaction.
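
For the first case, the reverse lookup can be sketched in a few lines
of Python over plain (subject, predicate, object) tuples. The property
name "http:response" and the node identifiers are assumptions made for
this example, not terms taken from the draft; a real application would
use an RDF library and query language instead.

```python
# Triples as (subject, predicate, object). "http:response" is a
# hypothetical name for the property linking a request to its response.
TRIPLES = [
    ("_:req1", "http:response", "_:resp1"),
    ("_:req2", "http:response", "_:resp2"),
]


def requests_for(response, triples, link="http:response"):
    """Follow the request-to-response link backwards."""
    return [s for (s, p, o) in triples if p == link and o == response]


print(requests_for("_:resp2", TRIPLES))
```

No inverse property is needed in the data, because the same triple
answers the question in both directions.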


> 9. On the representation of the Body
> a. when you say XML? In the flow chart, do you mean that the content-type
> indicates XML or do you mean that the content is well-formed XML? 

Well-formed XML.


> b. If the content is XML delivered with the content type text/html then is
> this considered XML? 

Yes, if it is well-formed (and doesn't break the RDF document it is
embedded in).


> c. Isn't there the possibility that a malformed document would break this
> document when included. 

It needs to be well-formed, and an additional check for impact on the
RDF environment is also required (this is not yet elaborated in the
description of the algorithm; we will fix this).


> d. Is there an issue with the use of a CDATA section? What happens if the
> data itself contain a CDATA section?

Yup, we missed a "does it break the RDF? -> then record as a byte
sequence" step in the algorithm.


> e. Is there an issue with transparency of data - in that if the body itself
> contains the literal string </http:body> does this cause a problem?

See above.
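
The decision discussed under point 9 could be sketched roughly as
follows in Python. This is only an illustration under assumptions: the
function name is invented, and the "does it break the RDF?" check is
approximated here by rejecting bodies that carry an XML declaration or a
DOCTYPE, neither of which can appear in the middle of an enclosing XML
document. The actual check in the algorithm may well differ.

```python
import xml.etree.ElementTree as ET


def choose_body_representation(body: bytes) -> str:
    """Return "xml" if the body can be embedded, otherwise "bytes"."""
    try:
        ET.fromstring(body)  # well-formedness check only, no validation
    except ET.ParseError:
        return "bytes"
    # Assumed stand-in for the "does it break the RDF?" step: be
    # conservative and fall back to a byte sequence when the body starts
    # with an XML declaration (or similar) or contains a DOCTYPE.
    if body.lstrip().startswith(b"<?xml") or b"<!DOCTYPE" in body:
        return "bytes"
    return "xml"


print(choose_body_representation(b"<a><b/></a>"))
print(choose_body_representation(b"plain text with </http:body> inside"))
```

A body that is not well-formed, including one containing a stray
</http:body>, never reaches the embedding branch in this sketch.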


> 10. HTTP Response code - what should one do if the response code is not one
> of those enumerated?

Good point, we have a closed enumeration right now. This is because it 
is the enumeration in the respective RFC but I agree that it should be 
extensible for other purposes.
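
One possible shape for an extensible enumeration, sketched in Python:
codes from the known list map to a named term, and anything else falls
back to a generic entry that keeps the numeric code. The table below is
a short excerpt and the "kind"/"name" keys are invented for this
example; the draft does not define them.

```python
# Excerpt of well-known status codes; the full list would come from
# the relevant RFC.
KNOWN_CODES = {200: "OK", 301: "Moved Permanently", 404: "Not Found"}


def status_term(code: int) -> dict:
    """Describe a status code, falling back to a generic entry."""
    if code in KNOWN_CODES:
        return {"kind": "enumerated", "code": code, "name": KNOWN_CODES[code]}
    # Hypothetical fallback: record the code without a named term.
    return {"kind": "other", "code": code, "name": None}


print(status_term(404))
print(status_term(299))
```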


> 11. It would be useful to record the size of the headers and the body.

You keep trying to push the scope, eh? ;) Similar to the timestamps in
#2, this seems like scope creep, though it would be quite easy to do.


> 12. I just realised that the connection structure is intended for modelling
> the requests on a keep-alive connection ... the order of the requests is
> significant, I suppose?

Yes, I think there should be a sequence list somewhere (currently the 
order is not captured by default).
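
One way to capture the order would be an RDF collection (rdf:first /
rdf:rest / rdf:nil). The Python sketch below builds the triples for such
a list from an ordered sequence of request identifiers. The blank node
labels and request identifiers are made up, and how the list would be
attached to a connection is left open here.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rdf_list_triples(items):
    """Return (head, triples) encoding `items` as an RDF collection."""
    if not items:
        return RDF + "nil", []
    nodes = [f"_:n{i}" for i in range(len(items))]  # one list cell per item
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    return nodes[0], triples


head, triples = rdf_list_triples(["_:req1", "_:req2"])
for triple in triples:
    print(triple)
```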


Regards,
   Shadi


-- 
Shadi Abou-Zahra     Web Accessibility Specialist for Europe |
Chair & Staff Contact for the Evaluation and Repair Tools WG |
World Wide Web Consortium (W3C)           http://www.w3.org/ |
Web Accessibility Initiative (WAI),   http://www.w3.org/WAI/ |
WAI-TIES Project,                http://www.w3.org/WAI/TIES/ |
Evaluation and Repair Tools WG,    http://www.w3.org/WAI/ER/ |
2004, Route des Lucioles - 06560,  Sophia-Antipolis - France |
Voice: +33(0)4 92 38 50 64          Fax: +33(0)4 92 38 78 22 |

Received on Thursday, 15 March 2007 09:43:33 UTC