
File:// URIs (was: Datatype property for used)

From: Graham Klyne <Graham.Klyne@zoo.ox.ac.uk>
Date: Thu, 26 Apr 2012 15:34:26 +0100
Message-ID: <4F995CF2.6040406@zoo.ox.ac.uk>
To: Timothy Lebo <lebot@rpi.edu>
CC: "public-prov-wg@w3.org" <public-prov-wg@w3.org>

(I'm changing the subject heading as I think this may be veering a little OT.)

On 26/04/2012 13:30, Timothy Lebo wrote:
> Graham,
>
> On Apr 26, 2012, at 5:53 AM, Graham Klyne wrote:
>
>> On 17/04/2012 18:48, Jim McCusker wrote:
>>> For filenames, I would figure that one should use a file:// URI. Are there
>>> any reasons not to?
>>
>> Maybe, but I'm not aware of the full context, so ignore me if this doesn't make sense.
>>
>> file:// URIs are interpreted with respect to a specific host environment (commonly localhost, but the host can be named).  While it's possible to separately specify a denotation, I think it could prove tricky to use this in a global reasoning environment.
>>
>> So, if the example is strictly local-use then file://... might be OK, but if the idea is to create provenance that can be shipped across the web I'd suggest avoiding file:// URIs.
>
> When one asserts a file path using the file scheme, I would interpret that attribute as part of the entity's characterization. It _is_ THE file at THAT path (on some assumed machine). You're right: it is more universally useful if one does NOT use absolute file paths, but in that situation we've walked up the specializationOf dimension (it's _all_ files that are found at _all_ of those relative paths on any machine).
>
> I think it's perfectly reasonable for someone to model with local file paths -- especially if that's what they intend to model.
> If one wants some "machine independence", then they should use relative file paths:
>
> :a prov:wasDerivedFrom <That_File_Over_There.txt> ;
>
> which will evaluate to the appropriate full path when the provenance is parsed.

I've been doing something very similar in my own work with composite research 
objects.  But it's not without problems.

At heart, as I'm sure you know, a relative path is *not* a URI.  It's a 
URI-reference.  As you say, it gets resolved to a full URI when parsed, possibly 
using local context.

This is all very well when you are consciously dealing with a *copy* of the 
content, but it doesn't play so well with linked data, as the resolved URI isn't 
one that can be used in the global web environment to "follow your nose", etc.
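
To make the resolution step concrete, here is a minimal Python sketch. The
two base URIs are invented for illustration (a local copy and a hypothetical
web location); the point is only that the same relative reference yields a
different identifier depending on where the document is read from.

```python
from urllib.parse import urljoin


def resolve(base: str, ref: str) -> str:
    """Resolve a relative URI-reference against a base URI."""
    return urljoin(base, ref)


# The relative reference from the example above.
REF = "That_File_Over_There.txt"

# Hypothetical base URIs: the same provenance document read from a local
# copy and from a web location (both paths are assumptions).
local = resolve("file:///home/alice/ro/prov.ttl", REF)
web = resolve("http://example.org/ro/prov.ttl", REF)

print(local)
print(web)
```

The file: result only means something on the host that holds the copy, so it
can't be dereferenced by anyone else; that is the tension with linked data.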

So while I like (a lot) the ability to copy stuff about and have it still work, 
there's a tension with global identification and linked data.  I'm still not 
sure how (or if) that tension resolves in a fully general fashion.

>>
>> (This reminds me that there's a proposal for a ni: URI scheme that identifies by way of cryptographic hash, which might be useful for some aspects of provenance ... http://www.ietf.org/proceedings/81/slides/decade-3.pdf, http://tools.ietf.org/html/draft-farrell-decade-ni-04)
>
> (and this reminds me of Jim's and my IEEE IS article that combines crypto hashes, FRBR, and PROV to handle conversions between tables and many forms of RDF [1]).
>
> [1] https://github.com/timrdf/csv2rdf4lod-automation/wiki/frbr:mccusker2012parallel

Noted.
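
Returning to the ni: idea quoted above: a hash-based name depends only on
content, not on location. A rough sketch, assuming a
ni:///sha-256;<digest> layout with an unpadded base64url digest (the exact
syntax in the cited draft may differ, and ni_uri is a name made up here):

```python
import base64
import hashlib


def ni_uri(content: bytes) -> str:
    """Build a hash-based name for some content.

    Assumed layout: ni:///sha-256;<base64url digest, padding stripped>.
    """
    digest = hashlib.sha256(content).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni:///sha-256;{encoded}"


# The same bytes give the same name wherever the file happens to live.
print(ni_uri(b"Hello World!"))
```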

#g
--

>>> On Tue, Apr 17, 2012 at 1:44 PM, Paul Groth <p.t.groth@vu.nl> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I've placed my example at
>>>> http://www.w3.org/2011/prov/wiki/Eg-27-small-command-line
>>>>
>>>> Essentially, we have the name of a zip file that is associated with an
>>>> Entity. The question is what is the recommended way to associate that
>>>> value with the entity.
>>>>
>>>> Luc's suggestion is prov:value.
>>>>
>>>> I'm inclined to suggest that this apply to activities as well....
>>>>
>>>> Cheers
>>>> Paul
>>>>
Received on Thursday, 26 April 2012 17:23:03 GMT