RE: Representing HTTP Requests Responses and Headers in moki

Hi Shadi,

That's interesting. A motivation for using the value= representation is
to show that the content _has not_ been parsed. 

This seems to be particularly relevant for cookies, which seem to allow
either "," or ";" as a separator for their name=value pairs. So any
attempt to parse them using the 'standard' algorithm will end up with
unpredictable results. Best to indicate that it has not been attempted.
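
To make the cookie point concrete, here is a rough sketch (not moki
code). The helper split_pairs and the cookie value are made up for
illustration; the only point is that the same raw value decomposes
differently depending on which separator the parser assumes.

```python
def split_pairs(raw, sep):
    """Split a raw header value into (name, value) pairs on a single separator."""
    pairs = []
    for chunk in raw.split(sep):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Everything before the first "=" is the name, the rest is the value.
        name, _, value = chunk.partition("=")
        pairs.append((name.strip(), value.strip()))
    return pairs


# Hypothetical cookie value mixing both separators (an assumption, not real data).
raw_cookie = "a=1, b=2; c=3"

by_comma = split_pairs(raw_cookie, ",")
by_semicolon = split_pairs(raw_cookie, ";")

# The two readings disagree about both the names and the values.
print(by_comma)
print(by_semicolon)
```

Neither reading is obviously the right one, which is why recording the
raw string and flagging it as unparsed seems the safer choice.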

The same is true, for example, of the User-Agent string, which does have
structure, but not of the header-element kind, so again we should show
that the decomposition has not been attempted. We don't need it for our
purposes, whereas we do need decomposition of the cache-control headers.

Another way of looking at it is to say that the element name should not
contain commas or semicolons, because commas and semicolons are the
delimiters of the parsing algorithm that produces the element in the
first place.

I think, on the whole, that using a consistent representation for
name=value, across the header, element or parameter levels has benefits,
and preserves an important semantic distinction between values that have
been decomposed/parsed and those that haven't.
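
As a rough sketch of what I mean (illustrative only, not the moki
implementation): the function header_to_xml and its arguments are
hypothetical names. An unparsed header carries a value attribute and no
children, while a decomposed header carries element children and no
value attribute, so a consumer can tell the two cases apart.

```python
import xml.etree.ElementTree as ET


def header_to_xml(name, raw=None, elements=None):
    """Build a <header> node; 'elements' is None when the value was not parsed."""
    header = ET.Element("header", {"name": name})
    if elements is None:
        # Not decomposed: keep the raw string in a value attribute.
        header.set("value", raw)
        return header
    # Decomposed: one <element> child per header element, with optional parameters.
    for el_name, el_value, params in elements:
        el = ET.SubElement(header, "element", {"name": el_name})
        if el_value is not None:
            el.set("value", el_value)
        for p_name, p_value in params:
            ET.SubElement(el, "parameter", {"name": p_name, "value": p_value})
    return header


date = header_to_xml("date", raw="Tue, 15 Nov 1994 08:12:31 GMT")
cache = header_to_xml(
    "cache-control",
    elements=[("no-store", None, []), ("post-check", "0", [])],
)

print(ET.tostring(date, encoding="unicode"))
print(ET.tostring(cache, encoding="unicode"))
```

The same value attribute is used at the header, element and parameter
levels, so the test "is there a value attribute?" works the same way
wherever it is applied.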

I don't think that this will limit the flexibility to transform the
resulting XML into HTTP-in-RDF, though I daresay it makes it harder to
do the reverse.

I think, in any case, that differences in processing rules may
complicate the transformation. For example, if moki does not decompose
cookies but an implementation of HTTP-in-RDF does, then the results will
be hard to compare, irrespective of the notational convention.

Jo


> -----Original Message-----
> From: Shadi Abou-Zahra [mailto:shadi@w3.org]
> Sent: 24 May 2007 11:08
> To: Jo Rabin
> Cc: Johannes Koch
> Subject: Re: Representing HTTP Requests Responses and Headers in moki
> 
> Hi Jo,
> 
> I've been following this thread with interest. Just for background, I
> believe that we took the parsing rules for HTTP-in-RDF from Apache. In
> ERT WG, we decided (also based on feedback on previous drafts) that a
> consistent approach may be more valuable than a per-header approach.
> 
> So for what it's worth, I don't really see the added benefit of having
> an exception for the date header just to compact one child element. I
> think there may also be some value in trying to keep moki as consistent
> as possible with HTTP-in-RDF so that a simple XSLT or other
> transformation can switch the format from one to the other. (I'm
> assuming that moki was supposed to be an XML representation of
> HTTP-in-RDF plus some checker-specific stuff; let me know if that is no
> longer the case.)
> 
> Regards,
>    Shadi
> 
> 
> Jo Rabin wrote:
> > Following on from yesterday's post [1] here are some further thoughts
> > and questions as to how we represent HTTP headers in XML.
> >
> > 1. The moki prototype [2] document example models a retrieval via
> > HTTP as:
> >
> > <retrieval>
> > 	<retrievedURI>...</retrievedURI>
> > 	<HTTPRequest/>
> > 	<HTTPResponse/>
> > 	<!-- and then a request/response pair for every redirection -->
> > 	...
> > </retrieval>
> >
> > Q1: Is there any benefit to enclosing each request response pair in
> > an element to bind them more closely together?
> >
> > Proposed Answer: Probably not as it adds no information.
> >
> > 2. At present the proposal for modelling each header follows the
> > style of HTTP-in-RDF - i.e. it assumes that the structure of HTTP
> > headers is:
> >
> > message-header  = field-name ":" [field-value]
> > field-value     = [header-element] *( "," [header-element] )
> > header-element  = element-name [ "=" [element-value ] ] *( ";" [param] )
> > param           = param-name [ "=" [param-value] ]
> > param-value     = (token | quoted-string)
> >
> > i.e. that the header value field is composed of a sequence of
> > "header elements".
> >
> > As noted in [1], this is actually not an accurate representation. It
> > does no particular harm, as long as you remember not to split e.g.
> > dates on the element separator, a comma. So for things like dates you
> > end up with:
> >
> > <header name="date">
> > 	<element name="2007-04-20T11:30:30Z"/>
> > </header>
> >
> > And as long as you know that the value of the date is to be found as
> > the name attribute of an element child, that's ok. But it would
> > probably be clearer if instead of doing this we recognised that some
> > headers are not parsed in that way and instead we said:
> >
> > <header name="date">
> > 	<value>Tue, 15 Nov 1994 08:12:31 GMT</value>
> > </header>
> >
> > Or
> >
> > <header name="date">Tue, 15 Nov 1994 08:12:31 GMT</header>
> >
> > Or
> >
> > <header name="date" value="Tue, 15 Nov 1994 08:12:31 GMT" />
> >
> > This would have the effect that the XSLT would need different ways of
> > accessing header data. Assuming you knew which header you were going
> > for this would not be a problem. If you wanted to iterate over all the
> > headers it would make the code a bit more complicated. On the up-side
> > it is a simpler, more direct representation, more directly accessible
> > when you need the value. On the down-side it means that if you don't
> > know the structure of a particular header you need to test for the
> > presence of a value attribute before accessing the data.
> >
> > Q2: Which does the team think is the best approach?
> >
> > Proposed answer: use the value= representation because then the
> > representation of unstructured values of headers, elements and
> > parameters is the same, it's the most compact way of doing it and it
> > adds to readability.
> >
> > e.g.
> >
> > <header name="date" value="Tue, 15 Nov 1994 08:12:31 GMT" />
> >
> > <header name="cache-control">
> > 	<element name="no-store"/>
> > 	<element name="no-cache"/>
> > 	<element name="must-revalidate"/>
> > 	<element name="post-check" value="0"/>
> > 	<element name="pre-check" value="0"/>
> > </header>
> >
> > <header name="accept">
> > 	<element name="application/xhtml+xml"/>
> > 	<element name="image/gif"/>
> > 	<element name="image/jpeg"/>
> > 	<element name="text/css"/>
> > 	<element name="text/html">
> > 		<parameter name="q" value="0.1" />
> > 	</element>
> > 	<element name="application/vnd.wap.xhtml+xml">
> > 		<parameter name="level" value="1" />
> > 		<parameter name="q" value="0.1" />
> > 	</element>
> > </header>
> >
> > 3. Do we want to parse non RFC 2616 headers?
> >
> > The only one I can think of (aside from authentication which is
> > referenced from RFC 2616) is cookies. But since we merely test for
> > the presence of cookies and do not analyse them, is it worth the
> > bother?
> >
> > Q3: Parse cookies or anything else?
> >
> > Proposed Answer: No. Render their values as value = "..."
> >
> > Jo
> >
> > [1]
> > http://lists.w3.org/Archives/Public/public-mobileok-checker/2007May/0102.html
> >
> > [2]
> > http://lists.w3.org/Archives/Public/public-mobileok-checker/2007Apr/att-0047/moki-example-20070419.xml
> >
> >
> >
> 
> --
> Shadi Abou-Zahra     Web Accessibility Specialist for Europe |
> Chair & Staff Contact for the Evaluation and Repair Tools WG |
> World Wide Web Consortium (W3C)           http://www.w3.org/ |
> Web Accessibility Initiative (WAI),   http://www.w3.org/WAI/ |
> WAI-TIES Project,                http://www.w3.org/WAI/TIES/ |
> Evaluation and Repair Tools WG,    http://www.w3.org/WAI/ER/ |
> 2004, Route des Lucioles - 06560,  Sophia-Antipolis - France |
> Voice: +33(0)4 92 38 50 64          Fax: +33(0)4 92 38 78 22 |

Received on Thursday, 24 May 2007 10:47:56 UTC