Re: Data Identification section (was Re: reviewing the BP doc)

All,
I'd like to focus the discussion here on the content of the BP document. I see no reason to argue about a string of letters identifying Phil if that is not in the document. I never meant to maintain that URIs and URLs are the same thing, only that in a document where the only instances of either term do apply to URLs, the issue is moot. I've made the effort to look through the latest published draft and search for instances of UR to identify what we have. I found no instances where a URx term was used to describe something that is not a URL. If I've missed something, then let's continue the discussion with that instance in mind.
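
A search of this kind can be sketched in a few lines of Python. This is only an illustration, not necessarily how the draft was actually searched: the names URX, count_terms and contexts are hypothetical, and SAMPLE is made-up text standing in for the draft.

```python
import re
from collections import Counter

# Whole-word match for the "URx" terms, with an optional plural "s".
# Restricting the pattern to URI, URL and URN is an assumption.
URX = re.compile(r"\b(URI|URL|URN)s?\b")


def count_terms(text):
    """Count how many times each URx term occurs in the text."""
    return Counter(m.group(1) for m in URX.finditer(text))


def contexts(text, width=30):
    """Return (term, snippet) pairs so each use can be checked by hand."""
    found = []
    for m in URX.finditer(text):
        start = max(0, m.start() - width)
        end = min(len(text), m.end() + width)
        # Collapse whitespace so snippets spanning line breaks stay readable.
        found.append((m.group(1), " ".join(text[start:end].split())))
    return found


# Made-up stand-in for the draft text (hypothetical).
SAMPLE = (
    "Datasets should be identified by persistent URIs. "
    "A URI that resolves lets users fetch the data. "
    "Avoid putting version details in the URL path."
)

print(count_terms(SAMPLE))
for term, snippet in contexts(SAMPLE):
    print(term, "|", snippet)
```

The counts only show which terms appear; whether a given use refers to something that is not a URL still has to be judged from the snippets.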
-Annette

--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
510-495-2935

On Aug 19, 2015, at 1:40 PM, Makx Dekkers <mail@makxdekkers.com> wrote:

> João Paulo,
> 
> Your example is interesting.
> 
> Like Phil's person URI, which returns a 200 with an RDF file, your URI
> http://nemo.inf.ufes.br/jpalmeida/pet/fido *looks* very much like a URL.
> Clicking on it produces a '404' Page not found (it tells me: "You have typed
> the web address incorrectly, or the page you were looking for may have been
> moved, updated or deleted")
> 
> I am not arguing against the fact that you and Phil want to use a string
> that looks like a URL as an identifier for you as persons. The problem that
> I see is that an unsuspecting user or application will not know. 
> 
> Both http://philarcher.org/foaf.rdf#me and
> http://nemo.inf.ufes.br/jpalmeida/pet/fido look like URLs and behave like
> URLs (returning a 200 or 404 status code) -- if it walks like a duck and
> quacks like a duck...
> 
> Of course, if you confine yourself to the Semantic Web/Linked Data world,
> people and applications know what to expect and then everything is fine.
> You'll stick your 'person URI' in a dct:creator statement where an
> application expects a reference to a real-world entity, so everybody's happy
> (well not so much with your URI as you don't return information but that's a
> detail).
> 
> Once you get out of the Linked Data context, it's hard to maintain that URLs
> that look like URLs and behave like URLs are, in fact, not URLs.
> 
> All of this to say that I think that bringing up these issues in the Data on
> the Web Best Practices may confuse rather than help.
> 
> Makx.
> 
> 
>> -----Original Message-----
>> From: João Paulo Almeida [mailto:jpandradealmeida@gmail.com] On Behalf
>> Of João Paulo Almeida
>> Sent: 19 August 2015 19:42
>> To: Phil Archer <phila@w3.org>; Manuel.CARRASCO-BENITEZ@ec.europa.eu;
>> amgreiner@lbl.gov
>> Cc: public-dwbp-wg@w3.org
>> Subject: Re: Data Identification section (was Re: reviewing the BP doc)
>> 
>> Hi Phil,
>> 
>> Our messages crossed mid air.
>> 
>> I agree with you. In my e-mail I tried to give a less nuanced example, and
>> also to quote the same RFC-3986.
>> 
>> Regards,
>> João Paulo
>> 
>> 
>> On 19/8/15, 2:26 PM, "Phil Archer" <phila@w3.org> wrote:
>> 
>>> If http://philarcher.org/foaf.rdf#me were a URL you'd have me
>>> personally popping out of your screen every time you dereferenced it.
>>> 
>>> It is a URI, it is not a URL.
>>> 
>>> I am not redefining anything, I am using the definitions as written in
>>> the specs. We have both quoted the same text from the same source and
>>> come to different conclusions.
>>> 
>>> Sorry, I don't like to be adamant about things, I'm always ready to
>>> learn new things and be corrected. I am often wrong, but on this I am
>>> confident of being correct.
>>> 
>>> The use of HTTP does not make a URI a URL. The fact that a URI
>>> identifies a resource that has a location on the network is what makes
>>> it a URL, whatever the scheme. So, to correct a mistake I made earlier,
>>> ftp://example.foo is a URL if it returns whatever is identified by that.
>>> 
>>> Dereferencing http://philarcher.org/foaf.rdf#me returns
>>> http://philarcher.org/foaf.rdf that includes information about
>>> http://philarcher.org/foaf.rdf#me.
>>> 
>>> It's a nuance, but it is what is at the heart of the difference between
>>> the two terms.
>>> 
>>> Phil
>>> 
>>> On 19/08/2015 17:25, Manuel.CARRASCO-BENITEZ@ec.europa.eu wrote:
>>>> Dear all,
>>>> 
>>>> 
>>>> * Definitions according to RFC-3986
>>>> 
>>>> - URI
>>>> A Uniform Resource Identifier (URI) is a compact sequence of
>>>> characters that identifies an abstract or physical resource.
>>>> 
>>>> - URL
>>>> The term "Uniform Resource Locator" (URL) refers to the subset of
>>>> URIs that, in addition to identifying a resource, provide a means of
>>>> locating the resource by describing its primary access mechanism
>>>> (e.g., its network "location").
>>>> 
>>>> 
>>>> * Definition in RFC-3987
>>>> Internationalized Resource Identifier (IRI) by extending the syntax
>>>> of URIs to a much wider repertoire of characters.
>>>> 
>>>> 
>>>> * Interpretation
>>>> What places a URI in the URL subset is that it provides a means to
>>>> locate the resource, *not* the nature of the resource.
>>>> 
>>>> IRI is just an extension of the repertoire of characters. I am also
>>>> quite familiar with RFC-3987: look at the acknowledgements.
>>>> 
>>>> 
>>>> * Phil's example
>>>> http://philarcher.org/foaf.rdf#me
>>>> 
>>>> This URI is a URL because the scheme HTTP provides a means to locate
>>>> the resource. That the resource is abstract or physical does not play
>>>> a role in making a URI a URL.
>>>> 
>>>> 
>>>> * Verification
>>>> This *must* be verified, perhaps by contacting the maintainer(s) of
>>>> RFC-3986. TBL is one of the authors; I know Larry Masinter, another
>>>> author. We should not need clarifications from RFC-3987; I know Martin
>>>> Duerst.
>>>> 
>>>> 
>>>> * More
>>>> We must follow the existing specifications: we cannot *redefine*
>>>> anything in there, though we can *refine* as long as we do not break
>>>> anything. If one wants to express it as a hierarchy, it has to be
>>>> properly defined. The same goes for the concept of "HTTP URI" as this
>>>> is just a subset of URL.
>>>> 
>>>> 
>>>> Regards
>>>> Tomas
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: Phil Archer [mailto:phila@w3.org]
>>>> Sent: Wednesday, August 19, 2015 4:30 PM
>>>> To: Annette Greiner; CARRASCO BENITEZ Manuel (DGT)
>>>> Cc: public-dwbp-wg@w3.org
>>>> Subject: Re: Data Identification section (was Re: reviewing the BP
>>>> doc)
>>>> 
>>>> Sorry Annette, on this rare occasion I must disagree with you.
>>>> 
>>>> http://philarcher.org/foaf.rdf#me is a URI. It is not a URL as it
>>>> identifies a resource, me, that, like any other physical object, or
>>>> concept, cannot be obtained over the internet. I do not have a
>>>> network location.
>>>> 
>>>> http://philarcher.org/foaf.rdf is a URL, it identifies a resource
>>>> that does have a network location, i.e. it can be obtained directly
>>>> over the internet.
>>>> 
>>>> So there's a hierarchy here of URIs, HTTP URIs and URLs.
>>>> 
>>>> As evidence, let me quote RFC 3986 (the definition of URIs,
>>>> https://www.ietf.org/rfc/rfc3986.txt), section 1.1.3:
>>>> 
>>>> 
>>>> 1.1.3. URI, URL, and URN
>>>> 
>>>> A URI can be further classified as a locator, a name, or both. The
>>>> term "Uniform Resource Locator" (URL) refers to the subset of URIs
>>>> that, in addition to identifying a resource, provide a means of
>>>> locating the resource by describing its primary access mechanism
>>>> (e.g., its network "location").
>>>> 
>>>> RFC 3987 introduces the even more general IRI which allows Unicode
>>>> characters outside the limited ASCII set.
>>>> 
>>>> The WG has made it clear that it wants to avoid providing any
>>>> discussion of the issue. That seems fine to me as it avoids
>>>> unnecessary confusion, BUT, if we're not going to say something along
>>>> the lines of "we know all these things are different but for
>>>> simplicity we'll just use the one term" then we must use the correct
>>>> term in the correct place.
>>>> 
>>>> Last week we ended up voting on a proposed resolution:
>>>> 
>>>> PROPOSED: In general URI should be used in the BP doc, but depending
>>>> on the context, URL may also be used.
>>>> 
>>>> This didn't meet with consensus - some people were unsure, Tomas was
>>>> opposed.
>>>> 
>>>> Looking at other W3C specs btw, we use IRI pretty much everywhere.
>>>> See, for example, http://www.w3.org/TR/tabular-metadata/.
>>>> 
>>>> So the hierarchy is:
>>>> 
>>>> IRI
>>>> URI
>>>> HTTP URI
>>>> URL
>>>> 
>>>> Therefore, IMO, the correct course of action in this, a technical
>>>> specification document, is to use the term IRI except where context
>>>> dictates that another term be used.
>>>> 
>>>> Phil.
>>>> 
>>>> On 13/08/2015 19:54, Annette Greiner wrote:
>>>>> For our document, URIs and URLs are the same thing, since we are not
>>>>> concerned with entities that don't have a location on the web. The
>>>>> document uses URI currently. I'm fine with keeping that or using URL
>>>>> instead. Either way, my point is that we don't need to launch into a
>>>>> discussion of the differences. I'm fine with a footnote referencing
>>>>> RFC 3986 if people feel it's necessary.
>>>>> -Annette
>>>>> --
>>>>> Annette Greiner
>>>>> NERSC Data and Analytics Services
>>>>> Lawrence Berkeley National Laboratory
>>>>> 510-495-2935
>>>>> 
>>>>> On Aug 13, 2015, at 2:02 AM, Manuel.CARRASCO-BENITEZ@ec.europa.eu
>>>>> wrote:
>>>>> 
>>>>>> Annette,
>>>>>> 
>>>>>> We should just use URL, the subset of URI with a network location
>>>>>> mechanism. We *cannot* redefine terms such as URL and we must just
>>>>>> point to the source specifications: we cannot break the existing
>>>>>> specifications.
>>>>>> 
>>>>>> I agree that the document is getting too long and hence the
>>>>>> proposition to separate the identification: it is easier to produce
>>>>>> and consume.
>>>>>> 
>>>>>> Regards
>>>>>> Tomas
>>>>>> 
>>>>>> 
>>>>>> From: Annette Greiner [amgreiner@lbl.gov]
>>>>>> Sent: 12 August 2015 20:11
>>>>>> To: Phil Archer
>>>>>> Cc: CARRASCO BENITEZ Manuel (DGT); public-dwbp-wg@w3.org
>>>>>> Subject: Re: Data Identification section (was Re: reviewing the BP
>>>>>> doc)
>>>>>> 
>>>>>> On Aug 12, 2015, at 7:56 AM, Phil Archer <phila@w3.org> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> * ?R?
>>>>>> 
>>>>>> URI, URL, URN, IRI. Just use URI everywhere and add something like:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>   "In this specification, the term URI is used for the
>>>>>> identification schemes: URI, URL, URN and IRI ..."
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> This is in line with the recommendation in RFC3986
>>>>>> 
>>>>>> https://tools.ietf.org/html/rfc3986#section-1.1.3
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>   " ... Future specifications and related documentation should use
>>>>>> the general term "URI" rather than the more restrictive terms "URL"
>>>>>> and "URN" ..."
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> But
>>>>>> we *want* to be restrictive. We're only talking about HTTP URIs,
>>>>>> we're not talking about URNs, or even URLs. Hence I think we need to
>>>>>> say something, no?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Funny, I take the fact that we want to be restricted to discussing
>>>>>> URIs as a reason *not* to add a discussion about them vs. URNs or
>>>>>> URLs. The fact that we use a term in our document doesn't mean that
>>>>>> we have to define it. It is defined elsewhere in W3C space plenty.
>>>>>> Our document is already annoyingly long; let's help readers get to
>>>>>> what is helpful information and leave out discussion that is not
>>>>>> unique to publishing data on the web.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> Annette Greiner
>>>>>> 
>>>>>> NERSC Data and Analytics Services
>>>>>> 
>>>>>> Lawrence Berkeley National Laboratory
>>>>>> 
>>>>>> 510-495-2935
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> --
>>> 
>>> 
>>> Phil Archer
>>> W3C Data Activity Lead
>>> http://www.w3.org/2013/data/
>>> 
>>> http://philarcher.org
>>> +44 (0)7887 767755
>>> @philarcher1
>>> 
>> 
> 
> 
> 

Received on Wednesday, 19 August 2015 20:59:54 UTC