- From: Calogero Alex Baldacchino <alex.baldacchino@email.it>
- Date: Wed, 04 Feb 2009 04:16:26 +0100
Charles McCathieNevile wrote:
> On Fri, 09 Jan 2009 12:54:08 +1100, Calogero Alex Baldacchino
> <alex.baldacchino at email.it> wrote:
>
>> I admit I'm not very expert in RDF use, so I have a few questions.
>> Specifically, I can guess the advantages when both parties use the
>> same (carefully modelled, well-known) vocabulary or vocabularies; but
>> when two organizations develop their own vocabularies, similar yet
>> different, to model the same kind of information, is merging the data
>> enough? Can a processor give more than a collection of triples, to be
>> interpreted afterwards based on knowledge of the vocabularies used?
>
> RDF consists of several parts. One of the key parts explains how to
> make an RDF vocabulary self-describing in terms of other vocabularies.
>
>> I mean, I assume my tools can extract RDF(a) data from whatever
>> document, but my query interface is based on my own vocabulary: when
>> I merge information from an external vocabulary, do I need to
>> translate one vocabulary into the other (or at least modify the query
>> backend so that certain CURIEs are recognized as representing the
>> same concepts - e.g. to tell my software that 'foaf:name' and
>> 'ex:someone' are equivalent for my purposes)? If so, merging the data
>> might be the minor part of the work compared with handling non-RDF(a)
>> metadata (that is, I'd have tools to extract and merge data anyway,
>> and once I had translated the external metadata into my format, I
>> could use my own tools to merge it), especially if my organization
>> and the external one use the same model (making the translation
>> easier).
>
> If a vocabulary is described, then you can do an automated translation
> from one RDF vocabulary to another, using your original query based on
> your original vocabulary. This is one of the strengths of RDF.

Certainly, this is a strong benefit. However, when comparing different
vocabularies in depth with their basic descriptions (if any), I suspect
there is a fair chance of finding vocabularies which are not described
in terms of each other, or of a third common vocabulary, so that a
translation is needed anyway. This might be true for small-time users
developing a vocabulary for internal use before starting an external
partnership, or regardless of any partnership. Sometimes small-time
users may find it easier and faster to "reinvent the wheel" and then
modify it to address evolving problems; someone might be unable to
afford an extensive investigation to find an existing vocabulary
fulfilling his requirements, or to develop a new one jointly with a
partner whose needs are similar but slightly different, which could mean
a longer process of mediating between the respective needs. In such a
case, I wouldn't expect that person to look for existing, more generic
vocabularies capable of describing the new one so as to ensure the
widest possible interchange of data. That is, until a requirement for
interchange arises, designing the vocabulary with that in mind may be
over-engineering; and once the requirement appears, addressing it with a
translation, or with a description in terms of whichever vocabulary
happens to be involved (each time the problem recurs), might be easier
and faster than engineering a good generic description once and for all.

Anyway, let's assume we're going to deal with well-described
vocabularies. Is the automated translation a task for the
parser/processor creating the graph of triples, or a task for the query
backend?
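To make that question concrete, here is a minimal sketch of the first
option, in Python with rdflib (my choice of tools, purely for
illustration; the ex:someone/foaf:name mapping is the hypothetical
example from my earlier message): the processor materializes the foreign
data into the local vocabulary before any query runs, driven only by a
declared owl:equivalentProperty description.

from rdflib import Graph, Namespace
from rdflib.namespace import FOAF, OWL

EX = Namespace("http://example.org/vocab#")  # hypothetical partner vocabulary

g = Graph()
# Data published with the partner's custom term...
g.parse(data="""
    @prefix ex: <http://example.org/vocab#> .
    <http://example.org/people/1> ex:someone "Alice" .
""", format="turtle")
# ...plus the vocabulary description relating that term to FOAF.
g.add((EX.someone, OWL.equivalentProperty, FOAF["name"]))

# Naive materialization: for every declared equivalence, copy the
# triples using one property over to the other, so queries written
# against FOAF also see the ex: data.
for p1, _, p2 in g.triples((None, OWL.equivalentProperty, None)):
    for s, _, o in list(g.triples((None, p1, None))):
        g.add((s, p2, o))

for row in g.query("""
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name WHERE { ?person foaf:name ?name }
"""):
    print(row.name)  # prints "Alice"

In this version the translation clearly belongs to the processing step,
before the query engine ever sees the data.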
And what are the requirements for a UA, from this perspective? Must it
just parse the triples and create a graph, or also take the vocabulary
description into account? Must it be a complete query backend? Must it
also provide a query interface? How basic or advanced should that
interface be? I think we should answer questions like these, and try to
figure out the problems arising with each answer and their possible
solutions, because the concern here should be what UAs must do with RDF
embedded in a non-RDF (and non-XML) document.

>> Thus, I'm thinking the most valuable benefit of using RDF/RDFa is
>> the certainty that both parties are using the very same data model,
>> despite the possible use of different vocabularies -- it seems to me
>> that the concept of triples consisting of a subject, a predicate and
>> an object is somehow similar to a many-to-many association in a
>> database, whereas one might prefer a one-to-many approach - though
>> the former might be a natural choice to model data which are usually
>> sparse, as in document prose.
>
> I don't see the analogy, but yes, I think the big benefit is being
> able to ensure that you know the data model without knowing the
> vocabulary a priori - since this is sufficient to automate the process
> of merging data into your model.

I understand the benefit with respect to well-known and/or
well-described vocabularies, but I wonder whether an average small-time
user would produce a well-described vocabulary or a very custom one. In
the latter case, a good knowledge of the foreign vocabulary would be
needed before querying it, and I guess the translation can't be
automated: it requires a level of understanding close to the one needed
to translate from a (more or less) different model. In this case, the
benefit of automatically merging data from similar models might be lost
against a non-automated translation which might be as difficult as
translating between different models (given sufficient verbal
documentation - that is, a natural-language description, which should be
easier to produce than a code-level one), since translated data should
be easy to merge anyway.

I'm pushing this point because I think it should be clear which scenario
is more likely to happen, to avoid introducing features perfectly
designed for the very people who can develop a "perfect" vocabulary with
a "perfect" generic description - the same people, I suppose, who can
afford to develop a generic toolkit on their own, or to adjust an
existing one (so basic support and a basic API might please them) - but
not for most small-time users, who might develop a custom vocabulary the
same way they develop a custom model, and thus need more custom tools
(again, basic support and a basic API might satisfy their needs better
than a complete backend which works fine with well-described
vocabularies but not with completely unknown ones, and so requires
custom development anyway).
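For contrast with the earlier sketch, the vocabulary knowledge can also
stay entirely in the query backend: the parser only builds the graph of
triples, and the backend rewrites queries to cover the equivalent terms,
for instance with SPARQL 1.1 property-path alternation. Again Python and
rdflib with the hypothetical ex:someone, as an illustration only.

from rdflib import Graph

g = Graph()
g.parse(data="""
    @prefix ex:   <http://example.org/vocab#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    <http://example.org/people/1> ex:someone "Alice" .
    <http://example.org/people/2> foaf:name  "Bob" .
""", format="turtle")

# No triples are added or translated; the equivalence lives in the
# query itself, which a backend could generate from a declared mapping.
results = g.query("""
    PREFIX ex:   <http://example.org/vocab#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name WHERE { ?person (foaf:name|ex:someone) ?name }
""")
for row in results:
    print(row.name)  # prints "Alice" and "Bob" (in no guaranteed order)

Either way, the open question remains which of these layers a UA should
be required to implement.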
Assuming this is true, there should be evidence that the same people
who'd produce a "bad" vocabulary don't simply prefer a completely custom
model, because if they were the great majority, we would risk investing
resources (on the UA side, if we made this a general requirement) to
help people who may be pleased with the help but not really need it
(because they're not small-time users, perhaps, and can do it on their
own without too much effort). This doesn't mean their requirements are
less significant or less worth taking into account; but in general, UA
developers might not be very happy to invest their resources in
implementing something which is, or appears, over-engineered with
respect to the real needs "in the wild". So we should carefully
establish how strong the need to support RDFa is, and accurately define
the support requirements for UAs.

> cheers
>
> Chaals

WBR, Alex
Received on Tuesday, 3 February 2009 19:16:26 UTC