RE: dwbp-ISSUE-46 (PIDs): How should we handle the issue of persistent URI design? [Use Cases & Requirements Document]

Hi,

Great discussion :)



So, the proposal is to change the description of requirement R-PersistentIdentification from:

                Data should be persistently identifiable

To

                An identifier for a particular resource should be resolvable on the Web and associated for the foreseeable future with a single resource or with information about why the resource is no longer available.



My opinion is that the second description, while richer and more informative, is straying into Best Practice territory rather than stating a requirement. A requirement should define the end goal, not how to achieve it, shouldn't it?



Cheers,

Deirdre





-----Original Message-----
From: Makx Dekkers [mailto:mail@makxdekkers.com]
Sent: 01 October 2014 18:03
To: Manuel.CARRASCO-BENITEZ@ec.europa.eu; phila@w3.org; public-dwbp-wg@w3.org
Subject: RE: dwbp-ISSUE-46 (PIDs): How should we handle the issue of persistent URI design? [Use Cases & Requirements Document]



Can I suggest we drop this discussion in the group? I'd love to do some free-style wrestling (conceptually, not physically) sometime, off-list, over these issues that are close to my heart (me being squarely in the 'forever' camp), but I don't think we can get any further than the text Phil suggested earlier:



R-PersistentIdentification



An identifier for a particular resource should be resolvable on the Web and associated for the foreseeable future with a single resource or with information about why the resource is no longer available.
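
Purely as an illustration of what that wording could mean in practice, here is a minimal Python sketch. It is not part of Phil's text or of any WG decision. The `Record` structure, the `resolve` function, the sample registry and the choice of status codes 200, 410 and 404 are all assumptions made for the example: an identifier either still leads to its single resource, or leads to a statement of why the resource is no longer available.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Record:
    """State kept for one minted identifier (hypothetical structure)."""
    location: Optional[str]                 # where the resource lives now, if anywhere
    withdrawn_reason: Optional[str] = None  # why it is gone, if it is


def resolve(registry: Dict[str, Record], identifier: str) -> Tuple[int, str]:
    """Return an (HTTP status code, body) pair for an identifier."""
    record = registry.get(identifier)
    if record is None:
        # Never minted: the requirement text does not cover this case.
        return 404, "Unknown identifier"
    if record.location is not None:
        # Still associated with its single resource.
        return 200, record.location
    # Minted earlier, resource gone: say why instead of failing silently.
    return 410, record.withdrawn_reason or "No longer available"


# Hypothetical registry contents, for illustration only.
registry = {
    "http://example.com/foo": Record(location="/data/foo.csv"),
    "http://example.com/bar": Record(location=None,
                                     withdrawn_reason="Withdrawn by publisher"),
}
```

Whether the "why it is gone" case should be a 410, a redirect to an archive, or something else is exactly the kind of detail that would belong in a best practice rather than in the requirement itself.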



Makx.





> -----Original Message-----

> From: Manuel.CARRASCO-BENITEZ@ec.europa.eu [mailto:Manuel.CARRASCO-BENITEZ@ec.europa.eu]

> Sent: Wednesday, October 01, 2014 5:02 PM

> To: phila@w3.org; public-dwbp-wg@w3.org

> Subject: RE: dwbp-ISSUE-46 (PIDs): How should we handle the issue of

> persistent URI design? [Use Cases & Requirements Document]

>

> Phil,

>

> > "URI persistence is a matter of policy ..."  -

> http://www.w3.org/TR/webarch/#URI-persistence


> >

> > Having restated this, data should be identifiable *forever* - not

> > for the foreseeable future.

>

> True, but no one can make promises forever, only for the foreseeable

> future ;-)

>

> # Tomas

> The intention must be *forever*, though it will eventually disappear:

> it is a matter of policy

> ##

>

>   URI syntax is a different matter: one can put up with almost any

> syntax as long as it can identify the data.

>

> And there's a can of worms. The identifier may identify the data, or

> it may identify a landing page about it or something else (and some

> communities don't understand the difference and glaze over when you

> try and say it's important).

>

> # Tomas

> Agree. This is the reason why in COMURI:

>  -  "The approach is syntactic and it does not specifies the semantics

> of the URI ..."

>  - Direct identification of variants

>  - Direct identification of metadata

>

> For example:

> http://example.com/foo           # landing page

> http://example.com/foo.zip     # direct identification of data

> http://example.com/foo?         # metadata

> ##

>

> >

> > One has to assume that "web-based" means accessed with HTTP(S), so

> > this implies that the data is always accessible with HTTP(S) and in

> > the *same* environment: this is not the case. For example, data

> > accessible with:

> >

> >   - HTTP(S): data can be archived without the original environment -

> >     dynamic data will not be accessible

>

> Huh?

>

> http://example.com?service=weather&date=today


>

> dynamic data can certainly be returned from a URI (which takes us back

> to a discussion we had ages ago about URIs being APIs).

>

> # Tomas

> I did not express myself clearly enough. Though a URI should be forever,

> suppose this wonderful weather service disappears and some kind people

> archive it at:

>  http://archive.org/example.com


>

> The original data was produced dynamically by the "foo-weather" system

> behind the server and (for whatever reason) running "foo-weather" on

> the archival server is not possible. Hence, it would be hard to get

> the data.

>

> Archiving data is challenging, but it is child's play in comparison to

> keeping legacy programs running; this happens even to the experts :-)

>

>   -  Third World Wide Web Conference  1995 - 19 years ago : where is

> the data?

>   -  http://info.cern.ch - about 23 years ago: try to run the original

> server

> ##

>

> >   - FILE: no server side processing - dynamic data will not be

> >     accessible

>

> More true.

>

> >

> > The real world is far nastier.

>

> Very true.

>

> >

> > In a nutshell:

> >

> >   - Long-term.- Think in terms of at least 25 to 50 years: data must be

> >     readable, and hence also identifiable

>

> If we can justify those figures (or any other), I'd be happy to

> include them. The UK National Archives reckons it can't promise beyond

> the next 5 years, although it plans for its URIs to be as persistent as

> the original Magna Carta that it houses.

>

> # Tomas

> Good example: we need "Magna Carta URIs" :-) Can we justify not aiming

> for forever?

> The URI is a component of long-term data preservation. It might be useful

> to look at

>    http://www.ietf.org/rfc/rfc4810.txt


> ##

>

> >   - Simple.- Keep it very simple - minimal processing (this includes

> >     URI redirections) to get the data

>

> Ideally yes. But URIs that are not URLs will need to return something

> and that might be a 303 redirect (and PLEASE let's not open up

> HttpRange-14 today... or any other day)

>

> # Tomas

> True: we talk most of the time about URIs but in fact one should be

> referring to URLs.

> ##

>

> >   - Full life-cycle.- original site, archiving into archival sites,

> and offline data - http://dragoman.org/comuri.html#ultrapersistent-uri


>

> Bear in mind my issue here is about phrasing the requirements that the

> WG needs to meet (whether by COMURI, the BP doc, the vocabs or

> anything else).

>

> # Tomas

> True. What is in scope?

> Data preservation (online and offline archiving) was taken into

> account in COMURI because of the email exchange a few months ago.

> COMURI, URI, URL, is probably the smallest part.

> ##

>

> Phil

>

> Regards

> Tomas

>

>

> > -----Original Message-----

> > From: Data on the Web Best Practices Working Group Issue Tracker

> [mailto:sysbot+tracker@w3.org]

> > Sent: Wednesday, October 01, 2014 9:47 AM

> > To: public-dwbp-wg@w3.org

> > Subject: dwbp-ISSUE-46 (PIDs): How should we handle the issue of

> persistent URI design? [Use Cases & Requirements Document]

> >

> > dwbp-ISSUE-46 (PIDs): How should we handle the issue of persistent

> URI design? [Use Cases & Requirements Document]

> >

> > http://www.w3.org/2013/dwbp/track/issues/46


> >

> > Raised by: Phil Archer

> > On product: Use Cases & Requirements Document

> >

> > As of 2014-10-01, the UCR does not explicitly call for advice on URI

> > design/design for persistence. It is, however, implied in

> > R-PersistentIdentification which says "Data should be persistently

> > identifiable."

> >

> > Do we need to add any detail to this? Or an additional requirement?

> Or do we think we've covered it?

> >

> > Context is all. In W3C space, persistent identifier means persistent

> URI. For some communities, that doesn't match the culture (scientific

> publishing for example).

> >

> >

> >

>

> --

>

>

> Phil Archer

> W3C Data Activity Lead

> http://www.w3.org/2013/data/


>

> http://philarcher.org


> +44 (0)7887 767755

> @philarcher1

Received on Thursday, 2 October 2014 12:22:30 UTC