Re: Naming Conventions for URIs from David Booth on 2015-08-21 (semantic-web@w3.org from August 2015)

From: David Booth <david@dbooth.org>
Date: Fri, 21 Aug 2015 15:58:08 -0400
To: Hans Teijgeler <hans.teijgeler@quicknet.nl>, 'Paul Houle' <ontology2@gmail.com>, semantic-web@w3.org, "'Discussion list for the Wikidata project.'" <wikidata-l@lists.wikimedia.org>
Message-ID: <55D782D0.2040500@dbooth.org>
Hi Hans,

You can (with high probability) generate unique URIs that way, but part 
of the reason for generating them in a predictable way is to ensure that if 
the same or only slightly modified data is regenerated later, the URI 
for Alice's address will still be the same.  This is helpful because you 
might already have other data that uses that URI, and you might merge 
new and old data.  When merging, it is helpful if the node URIs match. 
Otherwise you wind up with (effectively) duplicate triples. 
Incidentally, this is also one of the reasons that blank nodes cause 
problems: they also result in (effectively) duplicate triples.
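
To make the merging point concrete, here is a minimal Python sketch. It 
models triples as plain tuples and a merge as a set union; the helper 
names (address_triples, random_node, predictable_node) and the 
http://example/mine/address/alice naming scheme are assumptions made up 
for illustration, not part of any tool discussed in this thread.

```python
import uuid

ALICE = "http://example/other/alice"
MINE = "http://example/mine/address/"


def address_triples(node):
    """Return the triples describing Alice's address, hung off `node`."""
    return {
        (ALICE, "v:address", node),
        (node, "v:street", "Park St"),
        (node, "v:city", "Shadyville"),
    }


def random_node():
    # A fresh identifier on every run, like a T+GUID or a blank node label.
    return MINE + "T" + uuid.uuid4().hex.upper()


def predictable_node():
    # The same input always yields the same URI (hypothetical scheme).
    return MINE + "alice"


# Generate the same data twice, then merge the two runs (set union).
merged_random = address_triples(random_node()) | address_triples(random_node())
merged_stable = address_triples(predictable_node()) | address_triples(predictable_node())

print(len(merged_random))  # 6: every triple is effectively duplicated
print(len(merged_stable))  # 3: the second run adds nothing new
```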

In short, I think there are multiple motivations for deriving URIs from 
natural keys and properties:

  - They can be mnemonic and hackable, which makes them friendlier for 
data browsing and debugging.

  - They are predictable, which is better for data merging.

  - They are SPARQL friendly (as compared with blank nodes).
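
One possible way to get such predictable URIs, sketched in Python under 
stated assumptions: the function name mint_uri and the 
<namespace>/<kind>/<escaped key> layout are hypothetical, chosen only to 
illustrate the idea. The natural key (here, Alice's URI) is 
percent-encoded and appended to a namespace I control, so I am not 
squatting on anyone else's URI space.

```python
from urllib.parse import quote

MINE = "http://example/mine/"  # my own namespace, as in the quoted message


def mint_uri(kind, key):
    """Derive a URI in my own namespace from a natural key or foreign URI.

    `kind` is a short alphabetic label such as "address". `key` is the
    natural key. With safe="" every reserved character, including "/"
    and ":", is percent-encoded, so the key cannot alter the URI's
    structure.
    """
    return MINE + kind + "/" + quote(key, safe="")


alice = "http://example/other/alice"
print(mint_uri("address", alice))
# http://example/mine/address/http%3A%2F%2Fexample%2Fother%2Falice
```

Because the result depends only on its inputs, regenerating the data 
later yields the same URI for Alice's address.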

David Booth

On 08/21/2015 03:18 PM, Hans Teijgeler wrote:
> David,
>
> What about using Use Case 3 from "Defining N-ary Relations on the
> Semantic Web" <http://www.w3.org/TR/swbp-n-aryRelations/>?
>
> @prefix : <http://example/mine/address/> .
> @prefix ex: <http://example/other/> .
> @prefix v: <https://tools.ietf.org/html/rfc6350#section-10.2.6> .
>
> :T78F7E2EB48C044CCB5474C072F5573DB rdf:type :SnailMailAddress,
> owl:Thing; # T+GUID
>      v:SOURCE ex:alice ;
>      v:FN "Alice Doe" ;
>      v:ADR "45 Park St, Shadyville, USA" .
>
> Something like that.
>
> Regards, Hans
>
> Hans Teijgeler,
> Laanweg 28,
> 1871 BJ Schoorl,
> Netherlands
> 15926.org <http://15926.org>
>
> -----Original Message-----
> From: David Booth [mailto:david@dbooth.org]
> Sent: vrijdag 21 augustus 2015 17:31
> To: Paul Houle; semantic-web@w3.org; Discussion list for the Wikidata
> project.
> Subject: Re: Naming Conventions for URIs
>
> On 08/20/2015 11:36 AM, Paul Houle wrote:
> [ . . . ]
>  > The production for a QName  cannot begin with a number so it is not
>  > correct to write something like
>  >
>  > dbpedia:100
>  >
>  > or expect to have the full URI squashed to that.  This kind of gotcha
>  > will drive newbies nuts,  and the realization of RDF as a universal
>  > solvent requires squashing many of them.
>
> I agree.  And although (as Andy pointed out) this particular issue has
> been fixed in SPARQL 1.1 and Turtle 1.1, last I checked not all tools
> had been upgraded to those specs, so it still remains a problem in
> practice.  Personally, as a workaround when converting a natural key to
> a URI, I always prefix the key with an alpha string that suggests what
> it is.  For example instead of the above I might write:
>
> dbpedia:dbp_100
>
> Another place that URI creation and management causes unnecessary
> headache is when you want to mint a new URI that is relative to an
> existing URI that you don't control.  For example, suppose someone else
> gives me a URI for Alice:
>
>     <http://example/other/alice> a foaf:Person .
>
> My software wants to add some triples about Alice's address, such as:
>
>     <http://example/other/alice> v:address _:b .
>     _:b v:street "Park St" .
>     _:b v:city "Shadyville" .
>
> For SPARQL and other reasons it would be better to use a URI like
> <http://example/other/alice/address> instead of a blank node _:b for
> Alice's address, such as:
>
>     <http://example/other/alice>
>         v:address <http://example/other/alice/address> .
>     <http://example/other/alice/address> v:street "Park St" .
>     <http://example/other/alice/address> v:city "Shadyville" .
>
> But since I do not control Alice's URI, I cannot safely mint a URI that
> is relative to Alice's URI without the consent of the original URI's
> owner.  (Otherwise I would be URI squatting, which is bad practice and
> risky.)
>
> Intuitively I would like the new URI to be somehow derived from Alice's
> URI.  I could mint a URI from my own URI space <http://example/mine/>,
> by concatenating Alice's URI (after escaping) onto mine, such as:
>
>     <http://example/mine/address#http://example/other/alice>
>
> but now I don't have an easy PREFIXed way to write that URI in Turtle or
> SPARQL.  :(  Furthermore, I have to be careful to properly escape the
> original URI before concatenating it, because it might contain
> characters that are not allowed in a fragment identifier.
>
> In my experience URI allocation and management is one of the most
> annoying practical aspects of working with RDF, as compared with JSON,
> for example, where one can just put blinders on and ignore the need for
> global identification.  It would be great if some PhD student or other
> creative person could figure out a good solution to reduce this pain and
> make RDF easier for a wider audience to use.  (New conventions?  A new
> RDF serialization?  Extend RDF?)  JSON-LD is a good step, but still
> doesn't solve the problem.
>
> David Booth
>
Received on Friday, 21 August 2015 19:58:38 UTC