Re: Self-censorship using URLs

Ronald E. Daniel (rdaniel@acl.lanl.gov)
Tue, 5 Sep 1995 14:29:02 -0600


From: "Ronald E. Daniel" <rdaniel@acl.lanl.gov>
Message-Id: <9509051429.ZM7678@idaknow.acl.lanl.gov>
Date: Tue, 5 Sep 1995 14:29:02 -0600
In-Reply-To: Larry Masinter <masinter@parc.xerox.com>
To: Larry Masinter <masinter@parc.xerox.com>
Subject: Re: Self-censorship using URLs
Cc: clarkem@unpsun1.cc.unp.ac.za, uri@bunyip.com

On Sep 5, 12:09pm, Larry Masinter wrote:

> > The URI-WG is defunct, so I suggest you monitor "ratings@junction.net".
> 
> The working group is "closed", not "defunct". The interpretation of
> "closed" is that there are no more meetings scheduled, but the mailing
> list remains open for discussion of URI related-issues that are not
> being addressed by other WGs.

Thanks for clarifying this, Larry. I stand corrected.

> While ratings/censorship etc. was discussed at the IETF BOF, I think
> there's a URI issue that lurks here: many users would like to be able
> to supply some kinds of metadata along with the rest of their
> reference. 
> 
> The question is really just how we might accommodate that. For example,
> you might say that metadata like 'rating' belongs in a URC. But the
> current URC syntaxes being proposed seem too clumsy to stick inside a
> HREF.

This was not a need that had been identified until now, so the current
URC proposals have not made any attempt at being graceful when put
into HREFs. More on this in a minute.

> You should note that in this example, the *publisher* didn't supply
> the rating. Rather, it was the person making the reference.

Hmmm, I am not sure that was his intent. Matthew, would you please
clarify this point?

If Larry's interpretation is right, I like this even less than
before. It would be trivial to circumvent:

Scenario:

Little Billy is browsing at school and comes across a National
Geographic article whose illustrations are links of the form:
    <A HREF="http:bare-breasts:nat-geo.org/whatever">
Billy's school board has decided that such pictures are not acceptable
by the standards of their community, so Billy's browser has rejected
Billy's request to view those pictures. Billy writes a quick HTML
page of the form:
   <html>
   <A HREF="http:safe:nat-geo.org/whatever">here</A>
   </html>
thus circumventing the restrictions. To prevent this, the browser
can't show URLs before they are selected, must encrypt bookmark files,
can't offer "view source", and can't report the URLs that caused errors.
Nor can there be a telnet application on dear little Billy's machine.
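
A minimal sketch of the circumvention, assuming the label is the second
colon-separated field of the reference, as in the two HREFs above. The
function names and the blocked-label set are hypothetical and exist only
for illustration; no real browser is being described.

```python
# Hypothetical sketch: a browser-side filter that trusts the label
# embedded in the reference, in the "http:<label>:<host>/<path>" form
# used in the scenario above.

BLOCKED_LABELS = {"bare-breasts"}  # assumed school-board policy


def split_labeled_url(url):
    """Split 'scheme:label:rest' into its three parts."""
    scheme, label, rest = url.split(":", 2)
    return scheme, label, rest


def is_allowed(url):
    """Allow the request unless the embedded label is blocked."""
    _, label, _ = split_labeled_url(url)
    return label not in BLOCKED_LABELS


def relabel(url, new_label):
    """What Billy does: keep the resource, swap the label."""
    scheme, _, rest = split_labeled_url(url)
    return f"{scheme}:{new_label}:{rest}"


original = "http:bare-breasts:nat-geo.org/whatever"
rewritten = relabel(original, "safe")

print(is_allowed(original))   # False
print(is_allowed(rewritten))  # True, yet it names the same resource
```

The filter only ever sees what the person making the reference chose to
write, so anyone who can type a URL can pick the label they want.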


Furthermore, while this scheme does not mandate self-labeling, it also
does not provide the freedom to choose a labeling organization that
is fundamental to all the other third-party schemes I am familiar
with. Instead I am to trust the rating supplied by some random
document author? I don't think so.

> <UL>
>  <LI> <A URC="Rating: V4; URL=http://sample.gif"> A very violent picture. </A>
>  <LI> <A URC="Rating: S5; URL=http://sexy.gif"> A very sexy picture. </A>
> </UL>

Trying to put some of this information into HTML documents could be
done, but there are problems. URCs are required to be able to
represent anything. That cannot be achieved with HTML in the
fashion you outline above. Since HTML is defined to be an SGML application,
we cannot have an arbitrary list of attributes for the <A> element.
Certainly we could talk to the HTML-WG about adding the attributes
author, title, subject, etc. to the <A> element. However, no matter
what the set proposed, I can come up with an example that needs "just
one more attribute". General-purpose elements, like 
<A meta="Author: Smith, John
         Title: Big bears in the bayous
         ...">link text</A>
won't work either because that material cannot be validated with
an SGML parser.

(Actually, defining author, title, and a few other things as attributes
of the <A> element is not so bad as long as everyone realizes that
these are 80/20 choices. Of course, not everyone will realize that).


> > As for your particular proposal, it seems very similar to the "KidCode"
> > proposal from Borenstein, et al. (Look for it in the Internet Drafts
> > repositories).  For a variety of reasons I believe that both proposals
> > are undesirable. First, I have strong reservations
> > about the idea of encoding rating info into the URL.
> 
> Right, don't encode it _in_ the URL, put it in material that goes
> _with_ the URL.

Which would make it impossible to check at the client OR the server.


> > The whole
> > reason people are looking at URNs is because URLs already confound
> > identity and location. Adding resource description info and implicit
> > access control info is just going to aggravate the scaling problems
> > the web is already experiencing.
> 
> Putting rating information with URLs doesn't aggravate scaling
> problems.

I could be wrong about this, but I believe that such descriptions
are subject to change on timescales that are shorter than the
lifetime of the resource itself.

Scenario:

A publisher puts up an article on freedom of speech and rates
it for all audiences. A few days later the publisher gets email
from the proxy gateway in Singapore saying "please change rating to
'P 3' (Politics, counter to established policies of legitimate
government) or don't ever try to come into our fair country."
Lots of other countries send similar messages. Somewhat later,
requests to change the rating to "R 4" (Religion, heresy) are
received from other sites.

Mapping from old names to new names in order to fetch a resource
is not going to reduce the problems of the web any, and just might
make things a bit worse.
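
A rough sketch of that mapping cost, assuming the rating is carried in
the name itself in the "http:<label>:<host>/<path>" style discussed
earlier. The host example.org, the labels, and the function names are
all hypothetical.

```python
# Hypothetical sketch: the rating is part of the name, so every
# re-rating creates a new name and the old one must be forwarded.

resources = {"http:all-ages:example.org/speech": "article text"}
forwarding = {}  # old name -> new name


def rerate(old_url, new_label):
    """Rename a resource to carry a new rating label."""
    scheme, _, rest = old_url.split(":", 2)
    new_url = f"{scheme}:{new_label}:{rest}"
    resources[new_url] = resources.pop(old_url)
    forwarding.pop(new_url, None)  # a name that is live again is not forwarded
    forwarding[old_url] = new_url
    return new_url


def fetch(url):
    """Follow forwarding entries until a live name is reached."""
    hops = 0
    while url in forwarding:
        url = forwarding[url]
        hops += 1
    return resources[url], hops


first = "http:all-ages:example.org/speech"
second = rerate(first, "P3")   # after the first round of complaints
third = rerate(second, "R4")   # after the second

print(fetch(first))  # ('article text', 2)
print(fetch(third))  # ('article text', 0)
```

Every reference published before a re-rating pays one extra lookup per
rename, and the forwarding table has to be kept for as long as any old
reference might still be followed.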



-- 
Ron Daniel Jr.                email: rdaniel@acl.lanl.gov
Advanced Computing Lab        voice: (505) 665-0597
MS B-287  TA-3  Bldg. 2011      fax: (505) 665-4939
Los Alamos National Lab        http://www.acl.lanl.gov/~rdaniel/
Los Alamos, NM,  87545    tautology: "Conformity is very popular"