Comments on Architectural Principles of the World Wide Web from Joseph Reagle on 2002-09-04 (www-tag@w3.org from September 2002)

From: Joseph Reagle <reagle@w3.org>
Date: Wed, 4 Sep 2002 12:54:21 -0400
To: Ian Jacobs <ij@w3.org>
Cc: www-tag@w3.org
Message-Id: <200209041254.21502.reagle@w3.org>
http://www.w3.org/TR/2002/WD-webarch-20020830/

    2. Formats. A nonexclusive set of data format specifications designed
       for interchange between agents in the system. This includes
       several data formats used in isolation or in combination (e.g.,
       XHTML, CSS, PNG, XLink, RDF, SMIL animation), as well as
       technologies for designing new data formats (XML, XML Namespaces).

Is RDF rightfully considered a format? I think of it as a data model that 
can be expressed in the XML format. (Using "data format" to describe an 
application is probably correct, but it runs counter to my intuition.)

   Editor's note: While people agree that URIs identify resources (per
   [RFC2396]), there is not yet consensus that absolute URI references
   with fragment identifies may be used to identify resources. Some
   people contend that an absolute URI reference with a fragment
   identifier identifies a portion of a representation.

I don't know the exact answer to the latter part of the paragraph, but 
xmldsig/xenc certainly use fragment identifiers as part of the identifier.
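
To make the two readings concrete, here is a minimal sketch (not taken from 
any of the specs) that splits an absolute URI reference into its URI and its 
fragment identifier. The example.org URI is a made-up placeholder.

```python
from urllib.parse import urldefrag

# Hypothetical absolute URI reference with a fragment identifier.
reference = "http://www.example.org/doc.xml#chapter1"

# urldefrag separates the URI proper from the fragment identifier.
uri, fragment = urldefrag(reference)

print(uri)       # the part everyone agrees identifies a resource
print(fragment)  # the part whose status the editor's note leaves open
```

Under one reading the whole reference identifies a resource; under the other, 
the fragment only selects a portion of whatever representation the URI yields.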

  2.1. Resources, URIs, and the shared information space
   Use absolute URI references: All important resources SHOULD be
   identified by an absolute URI reference.1

What exceptions are permitted? Is this to accommodate the use of QNames as 
identifiers? (Can one have an identifier without a "resource"?)

    2.2.1. Comparison of identifiers
   Issue: URIEquivalence-15: When are two URI variants considered
   equivalent?

I raised this issue with the TAG a while back based on a question from OASIS 
folks. FYI: I thought SAML did a very specific character-for-character 
equivalence (to match an accessed resource against the security policy 
corresponding to it), but I can no longer find the relevant text, and the 
XACML work appears to permit a rather open interpretation. (However, I had 
a difficult time finding and reading through the OASIS documents, as they're 
in PDF, Word, and zip...)

[[ http://www.oasis-open.org/committees/rights/#documents
9.11.Resource Matching
A common example of this is a Web Server.  Commercial http responders permit 
a variety of syntaxes to be treated equivalently.  The “%” can be used to 
represent characters by hex value.  In the URL path “/../” provides 
multiple ways of specifying the same value.  Multiple character sets may be 
permitted and in some cases, the same printed character can be represented 
by different binary values.  If the policy target matching algorithm 
considers two resource strings to be different, and the underlying Web 
server considers them to be the same, this may allow unintended access.
The usual solution to this problem is to put the request in a canonical form 
before matching. There may be practical difficulties with this strategy if 
the transformations are not completely documented or subject to change 
without notice from one version to the next.  It is important to be aware 
of this issue and perform careful checking of marginal cases
]]
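
The "canonical form before matching" strategy from the quoted text could be 
sketched roughly as follows. This is only an illustration under my own 
assumptions: the function names are hypothetical, it lowercases the whole 
authority (including any userinfo), and it covers only the "%" escapes and 
"/../" cases mentioned above, not character sets.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Characters that never need escaping, so "%7E" and "~" mean the same thing.
UNRESERVED = ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
              "0123456789-._~")


def _normalize_escapes(text):
    """Decode escapes of unreserved characters; uppercase the hex of the rest."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)


def _remove_dot_segments(path):
    """Collapse "." and ".." segments without climbing above the root."""
    output = []
    for segment in path.split("/"):
        if segment == ".":
            continue
        if segment == "..":
            if len(output) > 1:
                output.pop()
            continue
        output.append(segment)
    # A trailing "." or ".." names a directory, so keep the final slash.
    if path.endswith(("/.", "/..")):
        output.append("")
    return "/".join(output)


def canonicalize(uri):
    """Hypothetical canonical form used only for comparing two URIs."""
    parts = urlsplit(uri)
    # Escapes are decoded first so that "%2e%2e" is seen as "..".
    path = _remove_dot_segments(_normalize_escapes(parts.path))
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       path, parts.query, parts.fragment))


def same_resource(a, b):
    return canonicalize(a) == canonicalize(b)
```

For example, same_resource("http://example.com/docs/%2e%2e/private", 
"http://example.com/private") is true here, whereas a character-for-character 
comparison would treat the two strings as different.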

    2.2.4. Absolute URI references and context-sensitivity

   Each absolute URI reference unambiguously identifies one resource, but
   the resource itself may be defined in a context-sensitive manner. 

Is each language variant (e.g., one selected via content negotiation) 
considered a different representation of the same resource?
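
A toy model of the reading I have in mind, where one URI identifies one 
resource and the language only selects among its representations. Everything 
here (the URI, the table, the function name) is hypothetical, and it ignores 
q-values and the rest of real HTTP negotiation.

```python
# One resource (one URI), several language-specific representations.
REPRESENTATIONS = {
    "http://www.example.org/welcome": {
        "en": "Welcome",
        "fr": "Bienvenue",
    }
}


def select_representation(uri, accepted_languages, default="en"):
    """Return (language, body) for the first acceptable language available."""
    variants = REPRESENTATIONS[uri]
    for language in accepted_languages:
        if language in variants:
            return language, variants[language]
    # Nothing acceptable is available: fall back to the default variant.
    return default, variants[default]
```

In this model a request preferring ["de", "fr"] gets the French body, yet the 
identifier, and so presumably the resource, stays the same.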

  2.4. URI Schemes

   Do not use unregistered URI schemes: Unregistered URI schemes MUST NOT
   be used on the public Internet.

I'd move this box below the following paragraph ("The IANA ...") because, at 
this point, what it means to be a registered or unregistered URI scheme has 
not yet been defined.
Received on Wednesday, 4 September 2002 12:54:28 UTC