RE: dwbp-ISSUE-46 (PIDs): How should we handle the issue of persistent URI design? [Use Cases & Requirements Document] from Makx Dekkers on 2014-10-01 (public-dwbp-wg@w3.org from October 2014)

From: Makx Dekkers <mail@makxdekkers.com>
Date: Wed, 1 Oct 2014 19:03:13 +0200
To: <Manuel.CARRASCO-BENITEZ@ec.europa.eu>, <phila@w3.org>, <public-dwbp-wg@w3.org>
Message-ID: <000601cfdd99$97d13a00$c773ae00$@makxdekkers.com>
Can I suggest we drop this discussion in the group? I'd love to do some free-style wrestling (conceptually, not physically) sometime, off-list, over these issues that are close to my heart (me being squarely in the 'forever' camp), but I don't think we can get any further than the text Phil suggested earlier:

R-PersistentIdentification

An identifier for a particular resource should be resolvable on the Web 
and associated for the foreseeable future with a single resource or with 
information about why the resource is no longer available.
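
For illustration only, here is a minimal sketch of how a resolver might behave under that wording. The `Record` type, the `resolve` function and the choice of status codes (303, 410, 404) are assumptions made for this example and are not part of the requirement text.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical registry entry for one persistent identifier."""
    location: Optional[str]        # current location, None if withdrawn
    reason: Optional[str] = None   # why the resource is no longer available


def resolve(identifier: str, registry: Dict[str, Record]) -> Tuple[int, str]:
    """Return an (HTTP status, payload) pair for an identifier.

    Assumption: a live resource is answered with a 303 redirect to its
    current location, a withdrawn one with 410 plus an explanation.
    """
    record = registry.get(identifier)
    if record is None:
        # The identifier was never issued by this registry.
        return 404, "unknown identifier"
    if record.location is not None:
        # Still associated with a single resource.
        return 303, record.location
    # Withdrawn: keep the identifier, explain why the resource is gone.
    return 410, record.reason or "no longer available"


registry = {
    "http://example.com/foo": Record(location="http://example.com/foo.zip"),
    "http://example.com/bar": Record(location=None, reason="withdrawn by publisher"),
}
```

The point of the sketch is that a withdrawn identifier is never reassigned and never silently dropped: it keeps resolving, but to an explanation rather than to the data.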

Makx.


> -----Original Message-----
> From: Manuel.CARRASCO-BENITEZ@ec.europa.eu [mailto:Manuel.CARRASCO-
> BENITEZ@ec.europa.eu]
> Sent: Wednesday, October 01, 2014 5:02 PM
> To: phila@w3.org; public-dwbp-wg@w3.org
> Subject: RE: dwbp-ISSUE-46 (PIDs): How should we handle the issue of
> persistent URI design? [Use Cases & Requirements Document]
> 
> Phil,
> 
> > "URI persistence is a matter of policy ..."  -
> http://www.w3.org/TR/webarch/#URI-persistence
> >
> > Having restated this, data should be identifiable *forever* - not
> for foreseeable future.
> 
> True, but no one can make promises forever, only for the foreseeable
> future ;-)
> 
> # Tomas
> The intention must be *forever*, though it will eventually disappear:
> it is a matter of policy
> ##
> 
>   URI syntax is a different matter: one can put up with almost any
> syntax as long as it can identify the data.
> 
> And there's a can of worms. The identifier may identify the data, or it
> may identify a landing page about it or something else (and some
> communities don't understand the difference and glaze over when you try
> and say it's important).
> 
> # Tomas
> Agree. This is the reason why in COMURI:
>  -  "The approach is syntactic and it does not specifies the semantics
> of the URI ..."
>  - Direct identification of variants
>  - Direct identification of metadata
> 
> For example:
> http://example.com/foo           # landing page
> http://example.com/foo.zip     # direct identification of data
> http://example.com/foo?         # metadata
> ##
> 
> >
> > One has to assume that "web-based" means accessed with HTTP(S), so
> this implies that the data is always accessible with HTTP(S)  and in
> the *same* environment: this is not the case. For example, data
> accessible with:
> >
> >   - HTTP(S): data can be archived without the original environment -
> dynamic data will not be accessible
> 
> Huh?
> 
> http://example.com?service=weather&date=today
> 
> dynamic data can certainly be returned from a URI (which takes us back
> to a discussion we had ages ago about URIs being APIs).
> 
> # Tomas
> I did not express myself clearly enough. Though a URI should be forever,
> suppose this wonderful URI weather service disappears and some kind
> people archive it into:
>  http://archive.org/example.com
> 
> The original data was produced dynamically by the "foo-weather" system
> behind the server and (for whatever reason) running "foo-weather" on
> the archival server is not possible. Hence, it would be hard to get
> the data.
> 
> Archiving data is challenging, but it is child's play compared to
> keeping legacy programs running; this happens even to the experts :-)
> 
>   -  Third World Wide Web Conference  1995 - 19 years ago : where is
> the data?
>   -  http://info.cern.ch - about 23 years ago: try to run the original
> server
> ##
> 
> >   - FILE: no server side processing - dynamic data will not be
> accessible
> 
> More true.
> 
> >
> > The real world is far nastier.
> 
> Very true.
> 
> >
> > In a nutshell:
> >
> >   - Long-term.- Think in at least 25 to 50 years: data must be readable,
> and hence also identifiable
> 
> If we can justify those figures (or any other), I'd be happy to include
> them. The UK National Archives reckons it can't promise beyond the next
> 5 years although it plans for its URIs to be as persistent as the
> original Magna Carta that it houses.
> 
> # Tomas
> Good example: we need "Magna Carta URIs" :-)
> Can we justify not aiming for forever?
> The URI is a component of long-term data preservation. It might be useful
> to look at
>    http://www.ietf.org/rfc/rfc4810.txt
> ##
> 
> >   - Simple.- Keep it very simple - minimal processing (this includes
> URI redirections) to get the data
> 
> Ideally yes. But URIs that are not URLs will need to return something
> and that might be a 303 redirect (and PLEASE let's not open up
> HttpRange-14 today... or any other day)
> 
> # Tomas
> True: we talk most of the time about URI but in fact one should be
> referring to URL.
> ##
> 
> >   - Full life-cycle.- original site, archiving into archival sites,
> and offline data - http://dragoman.org/comuri.html#ultrapersistent-uri
> 
> Bear in mind my issue here is about phrasing the requirements that the
> WG needs to meet (whether by COMURI, the BP doc, the vocabs or anything
> else).
> 
> # Tomas
> True. What is in scope?
> Data preservation (online and offline archiving) was taken into
> account in COMURI because of the email exchange a few months ago.
> COMURI, URI, URL, is probably the smallest part.
> ##
> 
> Phil
> 
> Regards
> Tomas
> 
> 
> > -----Original Message-----
> > From: Data on the Web Best Practices Working Group Issue Tracker
> [mailto:sysbot+tracker@w3.org]
> > Sent: Wednesday, October 01, 2014 9:47 AM
> > To: public-dwbp-wg@w3.org
> > Subject: dwbp-ISSUE-46 (PIDs): How should we handle the issue of
> persistent URI design? [Use Cases & Requirements Document]
> >
> > dwbp-ISSUE-46 (PIDs): How should we handle the issue of persistent
> URI design? [Use Cases & Requirements Document]
> >
> > http://www.w3.org/2013/dwbp/track/issues/46
> >
> > Raised by: Phil Archer
> > On product: Use Cases & Requirements Document
> >
> > As of 2014-10-01, the UCR does not explicitly call for advice on URI
> design/design for persistence. It is, however, implied in R-
> PersistentIdentification which says "Data should be persistently
> identifiable."
> >
> > Do we need to add any detail to this? Or an additional requirement?
> Or do we think we've covered it?
> >
> > Context is all. In W3C space, persistent identifier means persistent
> URI. For some communities, that doesn't match the culture (scientific
> publishing for example).
> >
> >
> >
> 
> --
> 
> 
> Phil Archer
> W3C Data Activity Lead
> http://www.w3.org/2013/data/
> 
> http://philarcher.org
> +44 (0)7887 767755
> @philarcher1
Received on Wednesday, 1 October 2014 17:03:45 UTC