
Re: Something about the caching feature in 0.9.x

From: Nikita The Spider The Spider <nikitathespider@gmail.com>
Date: Fri, 16 Nov 2007 15:55:23 -0500
Message-ID: <35e76ac10711161255tdf1de60l5a51aef0fe8b5391@mail.gmail.com>
To: "Karim A." <directeur@gmail.com>
Cc: "W3C Validator Community" <www-validator@w3.org>

On Nov 16, 2007 1:04 PM, Karim A. <directeur@gmail.com> wrote:

> Again thank you for your explanations!

You're welcome; I'm glad you've found them helpful.

> I can't agree more with everything you've said.
> So if I had to make a simple conclusion of
> the recommendations you've made:
>
> "ACCEPT WHAT THE SERVER GIVES."

Yes. For the most part it isn't the client's job to decide whether or
not something is stale or fresh. But don't take my word for it --
there's really no substitute for reading chapter 13 of RFC 2616. I
recently wrote code to implement that and I realized that there are
more subtle points to it than I expected.
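
As an illustration of "accept what the server gives", here is a minimal
sketch in Python, not a complete RFC 2616 implementation. The names
CachedEntry, conditional_headers and apply_response are hypothetical, and
response headers are assumed to arrive as a plain dict with canonical
capitalisation. The client echoes the server's validators back and, on a
304, keeps its cached copy without second-guessing it.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedEntry:
    """Hypothetical record of what the client stored for one URL."""
    body: bytes
    etag: Optional[str] = None           # value of the ETag response header
    last_modified: Optional[str] = None  # value of the Last-Modified header


def conditional_headers(entry: Optional[CachedEntry]) -> Dict[str, str]:
    """Build request headers that echo back the server's own validators."""
    headers: Dict[str, str] = {}
    if entry is None:
        return headers  # nothing cached, so make an unconditional request
    if entry.etag:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified:
        headers["If-Modified-Since"] = entry.last_modified
    return headers


def apply_response(entry: Optional[CachedEntry], status: int,
                   response_headers: Dict[str, str],
                   body: bytes) -> CachedEntry:
    """Return the entry to keep after the server has answered."""
    if status == 304 and entry is not None:
        # The server says the cached copy is still good; trust it.
        return entry
    # Otherwise store the new body with whatever validators were sent.
    return CachedEntry(
        body=body,
        etag=response_headers.get("ETag"),
        last_modified=response_headers.get("Last-Modified"),
    )
```

The sketch deliberately leaves out expiration (Expires, Cache-Control) and
weak versus strong validator comparison; those are among the subtle points
chapter 13 covers.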

> I just wonder how the W3C validators will
> handle this caching in the 0.9.x series.
>
> Any idea?

I'm not on the W3 validator development team so I don't know. Perhaps
someone on the team can answer your question?


Cheers

> On 11/15/07, Nikita The Spider The Spider <nikitathespider@gmail.com> wrote:
> > On Nov 15, 2007 1:49 PM, Karim A. <directeur@gmail.com> wrote:
> >
> > > The question which seriously troubles me is:
> > > Is it possible to have a server that returns
> > > the same etag/last-modified data even if the
> > > content has changed?
> >
> > Yes, and that makes sense in some contexts. Consider the case where a
> > page is very large and very popular so it's important to the server
> > that clients cache it. And suppose the page changes, but only to add
> > an HTML comment. The MD5 sum of the page would change, but the server
> > is telling the clients that there's no significant difference in the
> > page. The etag serves two purposes here. First, it relieves clients of
> > the duty of comparing each page to cached representations to see if
> > anything has changed. Second, it allows the server -- which is
> > presumably controlled by the page author -- to judge whether or not a
> > page has changed. More accurately, it allows the server to tell
> > clients whether or not they need to consider their cached
> > representations stale.
> >
> > Read about strong and weak validators in the RFC. "One can think of a
> > strong validator as one that changes whenever the bits of an entity
> > changes, while a weak value changes whenever the meaning of an entity
> > changes."
> > http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.3.3
> >
> >
> > > How should we handle that if it happens?
> >
> > Well, you only notice this situation if you explicitly bytewise
> > compare the page delivered and its cached representation. Unless your
> > application has a specific reason to care about changes in the bits
> > versus changes in the meaning, then you can simply trust the server
> > headers.
> >
> >
> > > On 11/15/07, Nikita The Spider The Spider <nikitathespider@gmail.com> wrote:
> > > > On Nov 15, 2007 9:47 AM, Karim A. <directeur@gmail.com> wrote:
> > > > >
> > > > > I read here: http://validator.w3.org/todo.html
> > > > > that in the 0.9.x series you'll start using Last-Modified
> > > > > to cache validation results and request again only
> > > > > if-modified-since.
> > > > >
> > > > > I'm very interested in that since the release of
> > > > > our humble project http://xhtml-css.com
> > > > > and I struggle with some, say, caching "standards".
> > > > > Not all servers provide "Last-Modified", some provide
> > > > > "etag", other servers nothing and some others both![1]
> > > >
> > > > Hi Karim,
> > > > Providing both is valid. According to the HTTP 1.1 spec, "[T]he
> > > > preferred behavior for an HTTP/1.1 origin server is to send both a
> > > > strong entity tag and a Last-Modified value."
> > > > http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.3.4
> > > >
> > > > A server may also legitimately not send any cache information if it
> > > > doesn't want its responses to be cached. When no cache information is
> > > > sent, a strictly-complying user agent must assume that what it has in
> > > > its cache is stale. In this case the UA can still use what is in its
> > > > cache ("A client MAY also specify that it will accept stale
> > > > responses...") and some browsers do so very aggressively to improve
> > > > performance.
> > > > http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.1.6
> > > >
> > > >
> > > > > The best, of course, will be to use Last-Modified
> > > > > and etag, but I'm not sure how reliable they are.
> > > >
> > > > Good news -- it isn't your job to decide how reliable they are. If the
> > > > server sends out this information, your user agent must respect it or
> > > > be in violation of RFC 2616. If the information looks odd (e.g. a
> > > > Last-Modified date of 1 second ago along with an ETag that's the same
> > > > as the one you saw one month ago), that's the business of the server
> > > > admin. Caching is complicated enough without trying to second-guess
> > > > what the server sends out. =)
> > > >
> > > > Good luck
> > > >
> > > > --
> > > > Philip
> > > > http://NikitaTheSpider.com/
> > > > Whole-site HTML validation, link checking and more
> > > >
> > >
> >
> >
> >
> >
> > --
> > Philip
> > http://NikitaTheSpider.com/
> > Whole-site HTML validation, link checking and more
> >
>



-- 
Philip
http://NikitaTheSpider.com/
Whole-site HTML validation, link checking and more
Received on Friday, 16 November 2007 20:55:36 GMT
