- From: Alexander Johannesen <alexander.johannesen@gmail.com>
- Date: Thu, 9 Jun 2011 09:15:58 +1000
- To: Alexander Johannesen <alexander.johannesen@gmail.com>, semantic-web@w3.org, adrien.dimascio@logilab.fr
- Cc: adim@logilab.fr
Hiya,

Ok, I think I understand what you're trying to do. Here's my .2 AUD;

Nicolas Chauvat <nicolas.chauvat@logilab.fr> wrote:
> http://data.thelibrary.com/1234/victor_hugo is the url of a document
> that describes the person, describes its work, and links to other
> documents that provide detailed information about the two.

The first thing that strikes me is that you've created an id scheme that will rather easily break, and the added human notion of a readable form does not, in fact, do what you want. A few other examples will point this out;

  http://data.thelibrary.com/1234/victor_hugo
  http://data.thelibrary.com/1234/hugo
  http://data.thelibrary.com/1234/frank_herbert
  http://data.thelibrary.com/2345/victor_hugo
  http://data.thelibrary.com/1234/victor_marie_hugo

Which one is the correct one? And more importantly, why? Not "because we say so", but technically. :) What will happen at each instance? How will you deal with the discrepancies? And notice that each of these is its own identifier, its own URI, and there's nothing in the description of "more readable" that denotes this kind of use.

In other words, there are canonical URIs that resemble other URIs, but it's not clear which are canonical. For me, the biggest problem is that you're already using an identifier <id> which seems unambiguous, and then you slap an added identifier on top which may break the first, or create ambiguity.

> http://data.thelibrary.com/1234#foaf:Person is the url of the person
> itself.

Ok, I'm allergic to identifiers using anchors (fragment identifiers), and the most important reason is that the anchor is only part of the URI for the client, not the server. For you on the server side, the following URIs are the *same*;

  http://data.thelibrary.com/1234
  http://data.thelibrary.com/1234#foaf:Person

... unless technology ignores standards, and that's not a practice I can recommend.
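
A minimal sketch of the point about fragments, using Python's standard urllib.parse (the language and the helper name request_target are only my choices for illustration). A client drops the fragment before it sends the request, so both URIs arrive at the server as the same request target:

```python
from urllib.parse import urldefrag, urlsplit


def request_target(uri: str) -> str:
    """Return the path (plus query) a client would put on the HTTP request line.

    The fragment is removed first, because clients do not send it to the server.
    """
    without_fragment, _fragment = urldefrag(uri)
    parts = urlsplit(without_fragment)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


plain = "http://data.thelibrary.com/1234"
with_fragment = "http://data.thelibrary.com/1234#foaf:Person"

# Both calls yield the same target; "#foaf:Person" never reaches the server.
print(request_target(plain))
print(request_target(with_fragment))
print(request_target(plain) == request_target(with_fragment))
```
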
In a web app that happens in the browser this is probably fine, but we're talking about identifiers here that should be the same on the client as well as the server.

>> Are you doing the full FRBR monty, or just a few select?
>
> A good part of it.

Ok, I'm assuming groups 1 and 2 (but group 3 possibly in the future?). Then I'd suggest a URI scheme closer to;

  http://data.thelibrary.com/person/1234         (canonical /person/ id)
  http://data.thelibrary.com/person/victor_hugo  (alias, with redirect)

Persistent identification shouldn't be reliant on text that could be subjective. Make your identifiers completely unambiguous, and everything else aliases of that. Then imagine that concept for the following URIs as well;

  http://data.thelibrary.com/corporate_body/34vb5234785v
  http://data.thelibrary.com/work/4567
  http://data.thelibrary.com/expression/3467345753
  http://data.thelibrary.com/manifestation/345f908n345n340985n345
  http://data.thelibrary.com/item/345v34v54

Typification is interesting when we build URI schemes. I prefer to denote type, both to split off load and to help disambiguation across the application, and then further retain an internal structure of canonical ids to which all other ids are aliases;

  http://data.thelibrary.com/id/4567
  (canonical form for http://data.thelibrary.com/work/4567)

>> What does readable mean?
>
> A url like <data>/1234 will not help you or me figure out what might
> be the GET-able document about.

You're getting into dangerous waters when you meld semantics and language constructs into an identification scheme, so I would do as explained above; have internal and external identifiers that are unambiguous, and create an alias scheme. (This is important for two reasons; future changes, and human stupidity.)

> I know Victor Hugo is a french author, thus <data>/1234/victor_hugo
> tells me I will GET a document about that person.

Yes, but maybe only you know this; your naming scheme here adds only confusion to the identified item.
And again note that you've already got an identifier in there. As pointed out at the top, what happens with these two;

  <data>/1234/victor_hugo
  <data>/2345/victor_hugo

The disambiguation happens at a part of the identifier that is more important than the part that you have introduced, even if that one seems easier to deal with (to your subjective eyes).

>
>> > <data>/1234 redirects via HTTP 303 to <data>/1234/readable_name
>>
>> Why?
>
> Because cut-n-pasting a url like <data>/1234#foaf:Person into your
> browser will get you a document describing that person.

Sorry, I didn't understand this one.

>> > <data>/1234/readable_name redirects via HTTP 301 to <data>/1234/readable_name/
>>
>> Why?
>
> Because we want a single url for the generic document.
>
> Inspiration came from Apache that does a 301 when serving a directory:
> somedir -> somedir/ -> chains with content negotiation.

Yes, but remember why they do this, and where this came from. If you really want a method of making URIs better denote the uniformity of resources, go the other way instead. A trailing slash is a left-over (IMHO) from browsing structure *based* on the URI (which in my world is a no-no). The slash means something akin to 'children' or 'content' of the same URI without that slash (again, IMHO), but if you mean to identify a thing, you want to point to the thing, not its content or children. I'd prefer the canonical URI to be;

  <data>/1234/readable_name

rather than

  <data>/1234/readable_name/

> You are right the above does not come out of the blue, here is an
> example before you start condemning and berating and wincing :)
>
> http://viaf.org/viaf/9847974/#Hugo,_Victor,_1802-1885
> http://viaf.org/viaf/9847974/rdf.xml

Notice that the anchor here isn't used as an identifier, nor is the name or dates part of it, or the canonical short name form. There's still the problem of the trailing slash as part of the identifier (in the viaf.org system), but I guess people can argue about that one.
:)

> in the latter, you will read uris like
>
> http://libris.kb.se/resource/auth/206651#concept
> http://viaf.org/viaf/sourceID/SELIBR%7C206651#skos:Concept
> http://viaf.org/viaf/9847974/#foaf:Person
> etc.

So are you saying that because others are doing it wrong, that makes it ok for you to do it wrong, too? :)

> There is also a lot to read at http://www.w3.org/2005/Incubator/lld/
> including a mailing-list on which I could post my question, but my
> goal was to get feedback from the "general semantic web public" rather
> than talk to people who have been working on exactly the same topic
> for the past year.

This is a good thing to do; ask around to as many people as you possibly can. I feel especially the library sector could do well with a) listening to outside parties in order to avoid the whole "invented somewhere else" syndrome, and b) sharing their expertise in their field with the rest of us, which we sorely need. I hope what I've written above is taken in the spirit it was given, as from a person who's worked in both camps for many years and seen lots of library infrastructure deliver only half of the potential it could have had. :)

The funny part is that ILS already have good identification schemes (internal; they suck at external schemes, even LoC and OCLC) that should be externalized in a uniform way (starting with proper identifiers for global library institutions and their data sets ... there are some, but they're far from uniform, extensive or canonical), and these things are vital if libraries hope to merge and blend with other semantic data out there. Karen Coyle has done some great work with RDF-ing FRBR (can't remember any URIs) which you probably want to chase up for the backend modeling issues. I'd hope they could in fact take a lead in creating proper identification schemes and become a canonical governing body, but I suspect that boat has sailed, and that aspect is sadly missing from both the FRBR and RDA bodies of work.
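
To make the canonical-id-plus-alias suggestion from earlier a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the ids and alias tables are made up, resolve() is a hypothetical helper, and using 301 for both alias and trailing-slash redirects is my choice here, not something settled in this thread.

```python
from typing import Optional, Tuple

# Hypothetical data: canonical ids per type, plus human-readable aliases.
CANONICAL = {
    ("person", "1234"),
    ("work", "4567"),
}
ALIASES = {
    ("person", "victor_hugo"): "1234",
    ("person", "victor_marie_hugo"): "1234",
    ("person", "hugo"): "1234",
}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Map a request path to (HTTP status, redirect target or None)."""
    segments = [s for s in path.split("/") if s]
    if len(segments) != 2:
        return 404, None
    kind, key = segments
    if (kind, key) in CANONICAL:
        # Assumption: the slash-less form is canonical, so redirect the other.
        if path.endswith("/"):
            return 301, f"/{kind}/{key}"
        return 200, None
    canonical_id = ALIASES.get((kind, key))
    if canonical_id is not None:
        # Every alias points at exactly one canonical URI.
        return 301, f"/{kind}/{canonical_id}"
    return 404, None


for path in ("/person/1234", "/person/victor_hugo",
             "/person/1234/", "/person/frank_herbert"):
    print(path, resolve(path))
```

The point of the design is that an alias can be added, renamed or dropped without touching the canonical identifier, and a path like /1234/victor_hugo, which mixes the two, simply has no place in the scheme.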

Anyway, hope some of what I've said makes any sense. :)

Kind regards,

Alex
--
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ ----------------------------------------------
------------------ http://www.google.com/profiles/alexander.johannesen ---
Received on Wednesday, 8 June 2011 23:16:26 UTC