Re: Minting URIs: how to deal with unknown data structures

Hello,

Thank you all for your responses and the wealth of advice and 
information. Lots of interesting reading material, and introductions to 
problems I was not yet aware of :-).

I am more at ease about the minting problem now. I think I was looking 
for some kind of well-defined method of minting all present and future 
URIs. But then I came to realize that it is not the purpose of a URI 
to convey information; it is just a pointer to information. 
Specifically, it was this sentence from the article about REST and 
Linked Data (http://ws-rest.org/2011/proc/a5-page.pdf) that enlightened me:

“A common misapplication of both approaches is to assume semantics (or 
abuse implied semantics) encoded in a URI, when both REST and Linked 
Data explicitly expect clients to regard URIs as opaque strings when 
used for identification.”

So if my future URIs look a bit different from my present URIs because 
they are produced by another method, that should not be a problem. This 
means I can now focus on getting the URIs right for the data that I want 
to publish now, and that I don't need to plan ahead for the future. That 
is a relief.
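
To make this concrete, here is a minimal Python sketch of how opaque 
identifiers could be minted. It is only an illustration under stated 
assumptions, not a recommended scheme: the base URI is the example one 
from my original message, and the function names and the 
"employee-42" key are hypothetical. It shows two options, a random 
UUID (no input values needed) and a name-based UUID (a hash of some 
internal key, so the same key always yields the same URI).

```python
import uuid

# Assumption: the example base URI from the original question.
BASE = "http://lod.mycompany.com/resource/"

# Assumption: a private namespace derived from the example authority,
# used only as the seed for name-based (hashed) identifiers.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "lod.mycompany.com")


def mint_random_uri() -> str:
    """Mint an opaque URI from a random (version 4) UUID."""
    return BASE + str(uuid.uuid4())


def mint_stable_uri(source_key: str) -> str:
    """Mint a repeatable opaque URI by hashing a hypothetical internal key.

    uuid5 is a SHA-1 based, name-based UUID: the same key always
    produces the same identifier.
    """
    return BASE + str(uuid.uuid5(NAMESPACE, source_key))


if __name__ == "__main__":
    print(mint_random_uri())               # different on every run
    print(mint_stable_uri("employee-42"))  # same on every run
```

Either way, the resulting URI gives no hint of data type or hierarchy, 
which fits the view that clients should treat it as an opaque string.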

Regards,
Frans



On 2011-04-15 14:48, Frans Knibbe wrote:
> Hello,
>
> Some newbie questions here...
>
> I have recently come in contact with the concept of Linked Data and I 
> have become enthusiastic. I would like to promote the idea within my 
> company (we specialize in geographical data) and within my country. I 
> have read the excellent Linked Data book (“Linked Data: Evolving the 
> Web into a Global Data Space”) and I think I am almost ready to start 
> publishing Linked Data. I understand that it is important to get the 
> URIs right, and not have to change them later. That is what my 
> questions are about.
>
> I have acquired the first part (authority) of my URIs, let's say it is 
> lod.mycompany.com. Now I am faced with the question: How do I come up 
> with a URI scheme that will stand the test of time? I think I will 
> start with publishing some FOAF data of myself and co-workers. And 
> then hopefully more and more data will follow. At this moment I 
> cannot possibly imagine which types of data we will publish. They are 
> likely to have some kind of geographical component, but that is true 
> for a lot of data. I believe it is not possible to come up with any 
> hierarchical structure that will accommodate all types of data that 
> might ever be published.
>
> So I think it is best to leave out any indication of data organization 
> in the path element of the URI (i.e. http://lod.mycompany.com/people 
> is a bad idea). In my understanding, I could use base URIs like 
> http://lod.mycompany.com/resource, http://lod.mycompany.com/page and 
> http://lod.mycompany.com/data, and then use unique identifiers for all 
> the things I want to publish something about. If I understand 
> correctly, I don't need the URI to describe the hierarchy of my data 
> because all Linked Data are self-describing. Nice.
>
> But then I am faced with the problem: What method do I use to mint my 
> identifiers? Those identifiers need to be unique. Should I use a 
> number sequence, or a hash function? In those cases the URIs would be 
> uniform and give no indication of the type of data. But a number 
> sequence seems unsafe, and in the case of a hash function I would 
> still need to make some kind of structured choice of input values.
>
> I would welcome any advice on this topic from people who have had some 
> more experience with publishing Linked Data.
>
> Regards,
> Frans Knibbe
>

Received on Monday, 18 April 2011 15:29:26 UTC