Re: Data Identification section (was Re: reviewing the BP doc) from Phil Archer on 2015-08-19 (public-dwbp-wg@w3.org from August 2015)

From: Phil Archer <phila@w3.org>
Date: Wed, 19 Aug 2015 18:26:58 +0100
To: Manuel.CARRASCO-BENITEZ@ec.europa.eu, amgreiner@lbl.gov
Cc: public-dwbp-wg@w3.org
Message-ID: <55D4BC62.9070201@w3.org>
If http://philarcher.org/foaf.rdf#me were a URL you'd have me personally 
popping out of your screen every time you dereferenced it.

It is a URI; it is not a URL.

I am not redefining anything, I am using the definitions as written in 
the specs. We have both quoted the same text from the same source and 
come to different conclusions.

Sorry, I don't like to be adamant about things, I'm always ready to 
learn new things and be corrected. I am often wrong, but on this I am 
confident of being correct.

The use of HTTP does not make a URI a URL. The fact that a URI 
identifies a resource that has a location on the network is what makes 
it a URL, whatever the scheme. So, to correct a mistake I made earlier, 
ftp://example.foo is a URL if dereferencing it returns the resource it 
identifies.

Dereferencing http://philarcher.org/foaf.rdf#me returns 
http://philarcher.org/foaf.rdf, which includes information about 
http://philarcher.org/foaf.rdf#me.

It's a nuance, but it is what is at the heart of the difference between 
the two terms.
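
The dereferencing behaviour described above can be sketched in a few 
lines of Python. This is only an illustration, assuming the standard 
library's urllib.parse module; the helper name split_for_dereference 
is hypothetical and not from any spec. It shows that the fragment is 
kept by the client, so only the document URI is used to make the 
request.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_for_dereference(uri: str) -> Tuple[str, str]:
    """Split a URI into the part a client requests and the fragment it keeps.

    Hypothetical helper for illustration: HTTP clients strip the fragment
    before sending a request, so the server only sees the document URI.
    """
    document, fragment = urldefrag(uri)
    return document, fragment


document, fragment = split_for_dereference("http://philarcher.org/foaf.rdf#me")
print(document)  # http://philarcher.org/foaf.rdf
print(fragment)  # me
```

The first value is what is actually located and retrieved; the second 
only has meaning once the returned document is interpreted.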

Phil

On 19/08/2015 17:25, Manuel.CARRASCO-BENITEZ@ec.europa.eu wrote:
> Dear all,
>
>
> * Definitions according to RFC-3986
>
> - URI
> A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource.
>
> - URL
> The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location").
>
>
> * Definition in RFC-3987
> Internationalized Resource Identifier (IRI) by extending the syntax of URIs to a much wider repertoire of characters.
>
>
> * Interpretation
> What places a URI in the URL subset is that it provides a means of locating the resource, *not* the nature of the resource.
>
> IRI is just an extension of the repertoire of characters. I am also quite familiar with RFC-3987: look at the acknowledgements.
>
>
> * Phil's example
> http://philarcher.org/foaf.rdf#me
>
> This URI is a URL because the HTTP scheme provides a means to locate the resource. Whether the resource is abstract or physical plays no role in making a URI a URL.
>
>
> * Verification
> This *must* be verified, perhaps by contacting the maintainer(s) of RFC-3986. TBL is one of the authors; I know Larry Masinter, another author. We should not need clarifications from RFC-3987; I know Martin Duerst.
>
>
> * More
> We must follow the existing specifications: we cannot *redefine* anything in there, though we can *refine* as long as we do not break anything. If one wants to express it as a hierarchy, it has to be properly defined. The same goes for the concept of "HTTP URI" as this is just a subset of URL.
>
>
> Regards
> Tomas
>
>
> -----Original Message-----
> From: Phil Archer [mailto:phila@w3.org]
> Sent: Wednesday, August 19, 2015 4:30 PM
> To: Annette Greiner; CARRASCO BENITEZ Manuel (DGT)
> Cc: public-dwbp-wg@w3.org
> Subject: Re: Data Identification section (was Re: reviewing the BP doc)
>
> Sorry Annette, on this rare occasion I must disagree with you.
>
> http://philarcher.org/foaf.rdf#me is a URI. It is not a URL as it
> identifies a resource, me, that, like any other physical object, or
> concept, cannot be obtained over the internet. I do not have a network
> location.
>
> http://philarcher.org/foaf.rdf is a URL, it identifies a resource that
> does have a network location, i.e. it can be obtained directly over the
> internet.
>
> So there's a hierarchy here of URIs, HTTP URIs and URLs.
>
> As evidence, let me quote RFC 3986 (the definition of URIs,
> https://www.ietf.org/rfc/rfc3986.txt), section 1.1.3:
>
>
> 1.1.3. URI, URL, and URN
>
> A URI can be further classified as a locator, a name, or both. The
> term "Uniform Resource Locator" (URL) refers to the subset of URIs
> that, in addition to identifying a resource, provide a means of
> locating the resource by describing its primary access mechanism
> (e.g., its network "location").
>
> RFC 3987 introduces the even more general IRI which allows Unicode
> characters outside the limited ASCII set.
>
> The WG has made it clear that it wants to avoid providing any discussion
> of the issue. That seems fine to me as it avoids unnecessary confusion,
> BUT, if we're not going to say something along the lines of "we know all
> these things are different but for simplicity we'll just use the one
> term" then we must use the correct term in the correct place.
>
> Last week we ended up voting on a proposed resolution:
>
> PROPOSED: In general URI should be used in the BP doc, but depending on
> the context, URL may also be used.
>
> This didn't meet with consensus - some people were unsure, Tomas was
> opposed.
>
> Looking at other W3C specs btw, we use IRI pretty much everywhere. See,
> for example, http://www.w3.org/TR/tabular-metadata/.
>
> So the hierarchy is:
>
> IRI
> URI
> HTTP URI
> URL
>
> Therefore, IMO, the correct course of action in this, a technical
> specification document, is to use the term IRI except where context
> dictates that another term be used.
>
> Phil.
>
> On 13/08/2015 19:54, Annette Greiner wrote:
>> For our document, URIs and URLs are the same thing, since we are not concerned with entities that don’t have a location on the web. The document uses URI currently. I’m fine with keeping that or using URL instead. Either way, my point is that we don’t need to launch into a discussion of the differences. I’m fine with a footnote referencing RFC 3986 if people feel it’s necessary.
>> -Annette
>> --
>> Annette Greiner
>> NERSC Data and Analytics Services
>> Lawrence Berkeley National Laboratory
>> 510-495-2935
>>
>> On Aug 13, 2015, at 2:02 AM, Manuel.CARRASCO-BENITEZ@ec.europa.eu wrote:
>>
>>> Annette,
>>>
>>> We should just use URL, the subset of URI with a network location mechanism. We *cannot* redefine terms such as URL and we must just point to the source specifications: we cannot break the existing specifications.
>>>
>>> I agree that the document is getting too long, hence the proposal to separate out the identification section: it is easier to produce and consume.
>>>
>>> Regards
>>> Tomas
>>>
>>>
>>> From: Annette Greiner [amgreiner@lbl.gov]
>>> Sent: 12 August 2015 20:11
>>> To: Phil Archer
>>> Cc: CARRASCO BENITEZ Manuel (DGT); public-dwbp-wg@w3.org
>>> Subject: Re: Data Identification section (was Re: reviewing the BP doc)
>>>
>>> On Aug 12, 2015, at 7:56 AM, Phil Archer <phila@w3.org> wrote:
>>>
>>> * ?R?
>>>
>>> URI, URL, URN, IRI. Just use URI everywhere and add something like:
>>>
>>>    "In this specification, the term URI is used for the identification schemes: URI, URL, URN and IRI ..."
>>>
>>> This is in line with the recommendation in RFC3986
>>> https://tools.ietf.org/html/rfc3986#section-1.1.3
>>>
>>>    " ... Future specifications and related documentation should use the general term "URI" rather than the more restrictive terms "URL" and "URN" ..."
>>>
>>> But we *want* to be restrictive. We're only talking about HTTP URIs, we're not talking about URNs, or even URLs. Hence I think we need to say something, no?
>>>
>>> Funny, I take the fact that we want to be restricted to discussing URIs as a reason *not* to add a discussion about them vs. URNs or URLs. The fact that we use a term in our document doesn’t mean that we have to define it. It is defined elsewhere in W3C space plenty. Our document is already annoyingly long; let’s help readers get to what is helpful information and leave out discussion that is not unique to publishing data on the web.
>>>
>>> --
>>> Annette Greiner
>>> NERSC Data and Analytics Services
>>> Lawrence Berkeley National Laboratory
>>> 510-495-2935
>>>
>>
>>
>

-- 


Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
Received on Wednesday, 19 August 2015 17:27:07 UTC