
Re: "tdb" and "duri" URI schemes...

From: Nathan <nathan@webr3.org>
Date: Thu, 04 Nov 2010 01:20:17 +0000
Message-ID: <4CD20A51.9030105@webr3.org>
To: Larry Masinter <masinter@adobe.com>
CC: Jonathan Rees <jar@creativecommons.org>, "www-tag@w3.org" <www-tag@w3.org>

Larry Masinter wrote:
> This idea has been bouncing around for such a long time,
> but I updated the document 
> 
> http://tools.ietf.org/html/draft-masinter-dated-uri-07

Hi Larry,

some proper feedback for you this time, comments out-dented:


    This document defines two URI schemes.  The first, 'duri' (standing
    for "dated URI"), allows indicating a URI as of a particular date
    (and time)...

As Jonathan commented, suggest [[[
  allows indicating a resource as of a particular date (and time).
]]]

    The second scheme, 'tdb' ( standing for "Thing Described By"),
    provides a way of using a way of minting URIs for anything that can
    be described, with the ability to fix the description to a given date
    or time.  The 'tdb' URI scheme may reduce the need to define define
    new URN namespaces merely for the purpose of creating stable
    identifiers for concepts or abstractions: it provides a ready means
    for identifying "non-information resources" by semantic indirection
    -- a way of creating a URI for anything.

Having difficulty making sense of "provides a way of using a way of 
minting URIs for", and similarly with defining what "anything that can 
be described" is (as opposed to "anything"). Also unsure whether the 
"reduce the need to define new URN namespaces" sentence is needed here.

Suggest [[[
  The second scheme, 'tdb' (standing for "Thing Described By"),
  allows indicating the thing described by a resource as of a
  particular date (and time). It also provides a ready means for
  identifying "non-information resources" by semantic indirection
  -- a way of creating a URI for anything.
]]]

      2.1.  'duri' Syntax  . . . . . . . . . . . . . . . . . . . . . .  6
      2.2.  tdb Syntax . . . . . . . . . . . . . . . . . . . . . . . .  6

should be [ 'tdb' Syntax ] with single quotes.

      7.1.  'duri' Scheme Template . . . . . . . . . . . . . . . . . . 13
      7.2.  tdb Scheme Template  . . . . . . . . . . . . . . . . . . . 13

same again missing the single quotes, [ 'tdb' Scheme Template ]

    In some cases, the guarantee of persistence comes through a promise
    of good management practice, such as is encouraged in "Cool URLs
    don't change" [COOL].  However, relying on promise of good management

Is the reference not to [ "Cool URIs don't change" ] ? (URIs rather than 
URLs)

    guarantee stability over time.  Despite best efforts and intentions,
    identifying information can change in unpredictable ways: domain
    names can disappear or be reassigned, name assigning organizations
    can change structure, responsibility, disappear, merge, or change in
    unpredictable ways.

Having trouble with "identifying information can change in unpredictable 
ways: " suggest [[[
  Despite best efforts and intentions, identified information can change
  unpredictably over time:
]]]

    There is a significant dependence in the interpretation of many URNs
    with the concept of "naming authority".  The authority is presumably
    some individual or organization both to insure uniqueness of
    assignment and also to help with understanding the meaning of the
    link between the name and the named.

Suggest [[[
  There is a significant dependence in the interpretation of many URNs
  on the concept of a "naming authority".  The authority is usually
  some individual or organization, both to ensure uniqueness of
  assignment, and also to help with understanding the meaning of the
  link between the name and the thing named.
]]]

    However, authorities, whether individuals or organizations, have a
    lifetime, and must be consulted at some point to understand the
    bindings.  The functioning of names as unique identifiers and holders
    of meaning depends on having a reliable infrastructure of consulting
    the authority or the authorities records to determine the thing
    referenced.

Suggest swapping the final words "thing referenced" for [ thing named ].


    One might use a URI such as "mailto:" email address to identify a
    person, or a "http:" URI to identify an abstract comment.  However,
    this leaves the question of how one might identify, within the same
    context, both the system mailbox and the person to which it is
    assigned, or the web page at a http URI and the concept it describes.
    The 'tdb' URI scheme allows ready assignment of URIs for abstractions
    that are distinguished from the media content that describes them.

mailto identifies a mailbox, not a person, and doesn't describe it. 
There is also a minor omission of a word, and double quotes are used 
rather than single quotes around mailto. "a http" or "an http"? The 
single quotes and colon are also missing on http. Bit of a worry about 
the indication of a 1-1 mapping between a description and what's 
described, as indicated by "thing described by" and the text "the 
concept it describes" - it could easily describe several "things". The 
final sentence may benefit from a re-word.

Suggest [[[
  One might use an 'http:' URI to identify a web page about a certain
  thing. However, this leaves the question of how one might identify,
  within the same context, both the web page and the thing it describes.

  The 'tdb' URI scheme addresses this issue by allowing ready assignment
  of identifiers for things which are described, distinguished from the
  identifier of the media content which describes it.
]]]

    The goal, then, of the 'tdb' URI scheme is to provide a mechanism
    which is, at the same time:

consider removing ", then," to read more assertively [ The goal of the ]

...
       explicitly bound: The mechanism by which the identified resource
       can be determined is explicitly included in the URI.

is this true?

       useful for non-networked items: Allows identification of resources
       outside the network: people, organizations, abstract concepts.

suggest [[[
   useful for non-networked items: Allows identification of resources
   outside the network: people, organizations, anything.
]]]

       no administration: The mechanism does not depend on reliable
       administrative processes of authorities for either assignment or
       interpretation.

suggest [[[
   no administration: The mechanism does not depend on administrative
   processes or authorities for either assignment or interpretation.
]]]



    A 'duri' URI takes the form:
         duri:<timestamp>:<encoded-URI>

    where <timestamp> is s sequence of digits representing a date and
    time (Section 2.4) and <encoded-URI> is an absolute URI-reference
    [RFC3986] in which any reserved character other than "/" have been
    percent-encoded (Section 2.3).  Note that the URI which has been
    encoded MAY include a fragment identifier.


I've reviewed this section and Section 2.3 thoroughly, several times 
over, and would first suggest that an absolute URI-reference with an 
optional fragment is just a "URI" as per the ABNF of RFC 3986 (not 
URI-reference).

Secondly, I'd suggest that the encoded-URI should drop the encoding part 
altogether. Section 2.3 only states that "#" and "%" must be encoded, 
and further states that this is because the fragment of a URI within a 
'tdb' may include the description. However, the description could be 
anywhere within the URI; indeed, even the examples promote this usage: 
"tdb:2001:data:,The%20US%20president". And the "%" encoding looks like 
nothing but problems ahead.

I would suggest that, unless there is a strong technical 
interoperability reason not to, encoded-URI be respecified to say that 
all URIs MUST be percent-encoding normalised as per RFC 3986 sections 
2.3 (http://tools.ietf.org/html/rfc3986#section-2.3) and 6.2.2.2 
(http://tools.ietf.org/html/rfc3986#section-6.2.2.2) before embedding 
within a duri/tdb.

suggested text
[[[
  A 'duri' URI takes the form:
    duri:<timestamp>:<normalized-URI>

  where <timestamp> is a sequence of digits representing a date and
  time (Section 2.4), and <normalized-URI> is a URI as defined
  in [RFC3986], of the form:

    scheme ":" hier-part [ "?" query ] [ "#" fragment ]

  Percent-encoding normalization of URIs MUST be performed before
  embedding (Section 2.3).
]]]
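
To illustrate the form suggested above, here is a minimal sketch in Python. It assumes the timestamp is an opaque string (its grammar, Section 2.4, is not reproduced here) and only checks that the embedded value has the overall shape of the RFC 3986 "URI" production. The names URI_RE and make_duri are hypothetical, chosen for illustration; nothing here comes from the draft itself.

```python
import re

# Shape of the RFC 3986 "URI" production:
#   scheme ":" hier-part [ "?" query ] [ "#" fragment ]
# This only splits the components; it does not validate each one in detail.
URI_RE = re.compile(
    r"^(?P<scheme>[A-Za-z][A-Za-z0-9+.\-]*):"
    r"(?P<hier_part>[^?#]*)"
    r"(?:\?(?P<query>[^#]*))?"
    r"(?:#(?P<fragment>.*))?$"
)


def make_duri(timestamp: str, uri: str) -> str:
    """Build 'duri:<timestamp>:<uri>' (hypothetical helper).

    Assumption: the timestamp is passed through untouched; checking it
    against the draft's timestamp grammar is out of scope for this sketch.
    """
    if URI_RE.match(uri) is None:
        # No scheme, so this is a relative reference rather than a URI.
        raise ValueError(f"not a URI with a scheme: {uri!r}")
    return f"duri:{timestamp}:{uri}"
```

For example, make_duri("2001", "http://example.org/page#frag") gives "duri:2001:http://example.org/page#frag", while a relative reference such as "/relative/path" is rejected.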

and then change 2.3 to something like:

[[[
2.3.  normalized-URI encoding

  Percent-Encoding Normalization as described in
  http://tools.ietf.org/html/rfc3986#section-2.3 MUST be applied to all
  URIs before being embedded within a 'duri' or 'tdb'. (reasons..)
]]]
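
As a rough illustration of what that normalization step involves, here is a sketch in Python. It decodes percent-encoded triplets that stand for unreserved characters (RFC 3986, Sections 2.3 and 6.2.2.2) and uppercases the hex digits of the triplets that remain (Section 6.2.2.1). The function name normalize_percent_encoding is hypothetical, and the sketch assumes its input is already a syntactically valid URI.

```python
import re

# Unreserved characters per RFC 3986, Section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)

# A percent-encoded triplet: "%" followed by two hex digits.
PCT_TRIPLET = re.compile(r"%([0-9A-Fa-f]{2})")


def _normalize_triplet(match):
    """Decode the triplet if it encodes an unreserved character, else uppercase it."""
    char = chr(int(match.group(1), 16))
    if char in UNRESERVED:
        return char
    return "%" + match.group(1).upper()


def normalize_percent_encoding(uri: str) -> str:
    """Apply percent-encoding normalization in a single pass (hypothetical helper).

    A single pass matters: "%2541" must stay "%2541" and not be decoded twice.
    """
    return PCT_TRIPLET.sub(_normalize_triplet, uri)
```

For example, normalize_percent_encoding("http://example.org/%7euser/a%2fb") returns "http://example.org/~user/a%2Fb", and the draft's own example "data:,The%20US%20president" is returned unchanged, since a space is not an unreserved character.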



    where non-terminals "date-fullyear", "date-month", "date-mday",
    "time-hour", "time-minute", "time-second", "time-secfrac" are taken
    from [RFC3339].  The goal was to minimize the amount of precision
    needed, while retaining the possibility of generating timestamps that
    are exactly compatible with [RFC3339] "date-time" non-terminal.

repetition of "The goal was to minimize"; suggest [ This is to minimize ] 
instead


Will cover section 3 onwards tomorrow with fresh eyes and pass on,

Best,

Nathan
Received on Thursday, 4 November 2010 01:21:24 GMT
