- From: Sebastian Kamp <kamp@ti.informatik.uni-kiel.de>
- Date: Thu, 26 Apr 2001 16:20:21 +0200
- To: <www-p3p-dev@w3.org>
- Cc: <www-p3p-policy@w3.org>
On Wednesday 25 April 2001 15:23, you wrote:

> > ...
> > Theoretically the host company could, but usually it does not know the
> > structure of the subtree (or the policies covering different parts of this
> > subtree, respectively) a foreign company is responsible for. All they can
> > do is therefore explicitly say that everything below the root of this
> > subtree is out of their responsibility. Otherwise the host company would
> > have to adjust *its* policy reference file every time the *foreign company*
> > changes the structure of its subtree/system of policies, which is most
> > certainly not what we want.
>
> Certainly true. However, we expect that for the most part hosted content
> will be things like image files, that in most cases will all have the same
> policy applying to them. If a company is going to have their entire
> site hosted elsewhere, they are likely to use virtual hosting, which
> would give them their own domain name and avoid this problem. Of
> course there are exceptions.

I did not think about virtual hosting, but I agree that the well-known
location mechanism would be sufficient then. The first case, I think,
corresponds to the last example in 2.2.1: "one case where using well-known
locations is particularly useful ... a site which has divided its content
[images, Web-based applications] across several hosts". So for these
scenarios we do not need anything other than a well-known location (and in
the case of a company that has its entire site hosted elsewhere without
using virtual hosting, maybe my suggestion suffices).

> > My suggestion was that the host company just excludes the subtree from
> > its policy reference file (avoiding the 1000 entries problem, see below)
> > and the foreign company puts its policy reference file in the root of its
> > subtree.
>
> We had considered this -- in fact, this is essentially what the PICS
> spec allows. We decided not to go down this route because of
> the added complexity (first you look in /w3c/, if no PRF is there you
> look in /foo/w3c/, if no PRF is there you look in /foo/bar/w3c/, etc. ...
> how far do you go before you give up? Or maybe we say that
> you can put the PRF in either the root /w3c/ directory or in a subdirectory
> where the content is, but nowhere else -- so for /foo/bar/content.html
> you would look in /foo/bar/w3c/ if the PRF in /w3c/ doesn't apply),

As for (P3P user agent) software, I am still sure that a
well-known-location-only solution (plus something like my suggestion, or
what you describe in parentheses here) would reduce complexity by far
(better performance, no need for a safe zone).
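To make this concrete, here is a rough sketch of what I have in mind,
assuming the hosted content lives under /foo/ on the host company's
server. The element names follow the policy reference file syntax of the
current draft, but the namespace declaration is left out and the policy
names are made up, so please read it as an illustration rather than as
valid P3P.

The host company's PRF at the well-known location /w3c/p3p.xml simply
excludes the subtree:

  <META>
    <POLICY-REFERENCES>
      <POLICY-REF about="/w3c/host-policy.xml#host">
        <INCLUDE>/*</INCLUDE>
        <!-- everything below /foo/ is out of the host's responsibility -->
        <EXCLUDE>/foo/*</EXCLUDE>
      </POLICY-REF>
    </POLICY-REFERENCES>
  </META>

The foreign company puts its own PRF at the root of its subtree, e.g.
/foo/w3c/p3p.xml, which the user agent would try next:

  <META>
    <POLICY-REFERENCES>
      <POLICY-REF about="/foo/w3c/foo-policy.xml#foo">
        <INCLUDE>/foo/*</INCLUDE>
      </POLICY-REF>
    </POLICY-REFERENCES>
  </META>

(Whether the patterns in the second file are interpreted relative to the
site root or to the root of the subtree is exactly the kind of detail such
a scheme would still have to pin down.)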
> and because it does not appear that it is a very common case that would
> be needed for real web sites. If someone demonstrates that in fact
> there are a lot of real web sites that would find this useful, the working
> group would look at this again.

Well, for user agents at least it would be useful. From our discussion so
far I get the impression that web sites could get along with just
well-known locations (plus the small modification we discussed above)
without much - if any - extra effort.

> > > Also, we have been told by some of the content distribution networks
> > > that their file system is not actually hierarchical, so it is not as
> > > simple as identifying each client with a directory.
> >
> > I'm not quite sure what you mean by "not actually hierarchical". But I
> > think even if a content distribution network has a file system that is
> > physically organised in some non-hierarchical way, there must be a mapping
> > to a logical hierarchy, since URLs are interpreted hierarchically.
>
> I had assumed that if cdn.com hosts content for foo.com and bar.com, that
> there would be some directory structure such as cdn.com/foo/ and
> cdn.com/bar/ where all the files from foo and bar are located. But we've
> heard from at least one CDN that in fact they use some hashing algorithm,
> and so what you really get are things like
> cdn.com/15390u/3048038_foo_39483048.html as file names. There might be some
> string that is common to all the file names belonging to company foo, but
> they aren't all going to be put in a common directory.

But doesn't the PRF at the CDN still exclude the content for, say, foo.com
in terms of URLs, regardless of how the CDN internally refers to a file
"under" cdn.com/foo/?
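If the "*" wildcard of the current draft may appear anywhere in an INCLUDE
or EXCLUDE pattern (I am assuming it may; the policy names below are again
made up), the CDN could express this by the string common to foo.com's file
names instead of by a directory, roughly like this in its PRF at the
well-known location:

  <META>
    <POLICY-REFERENCES>
      <!-- the CDN's own policy covers everything except foo.com's files -->
      <POLICY-REF about="/w3c/cdn-policy.xml#cdn">
        <INCLUDE>/*</INCLUDE>
        <EXCLUDE>/*_foo_*</EXCLUDE>
      </POLICY-REF>
      <!-- foo.com's policy covers the files carrying the common string -->
      <POLICY-REF about="/w3c/foo-policy.xml#foo">
        <INCLUDE>/*_foo_*</INCLUDE>
      </POLICY-REF>
    </POLICY-REFERENCES>
  </META>

So the non-hierarchical internal storage would not matter, as long as the
URLs expose something to match on.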
>
> Lorrie

Regards,
Sebastian

Received on Thursday, 26 April 2001 10:20:29 UTC