Re: [ACTION-331] Mappings from existing data models vs defining a new data modelling language from Michael Kifer on 2007-09-26 (public-rif-wg@w3.org from September 2007)

From: Michael Kifer <kifer@cs.sunysb.edu>
Date: Wed, 26 Sep 2007 13:48:41 -0400
To: Christian de Sainte Marie <csma@ilog.fr>
Cc: RIF WG <public-rif-wg@w3.org>
Message-ID: <8111.1190828921@cs.sunysb.edu>

Let me try to give a more elaborate and realistic scenario.

Parties have data to exchange, but that data is stored in some complex
schema and they only want to exchange some part of it and in a transformed
form (i.e., a view, but not necessarily a database view, since recursion
might be involved etc.).

The most straightforward approach is to create the view schema in a RIF
schema language and then map the data onto that schema using RIF rules.

Now, regarding your description of the approach, I find it unclear.

... The question is now: should RIF define a data schema language ...

Then

... The benefit would be to avoid one step in the translation from/to RIF ...

Benefit of what? Of a RIF schema language? I do not understand which
beneficiary you are talking about. Then you talk about schema languages,
like XML. How will RIF work with XML Schema? Another layer of
specification?

Then
... In the case where no shared schema exists, the parties will have to
specify one and, given that RIF will provide a mapping to the main 
schema languages, they will have little motivation to choose to specify 
that schema directly in RIF.

This is quite murky. RIF will provide mappings from which schema languages?
The unknown ones that the parties agree on? Yet another level of RIF
specification?

So, I find your conclusions unsupported, and it seems to me that you are
adding to the workload, not taking away.


	--michael  

> All,
> 
> In completion of my action-331, I will try to clarify the issue of
> mappings from existing data models vs defining a new data modelling
> language.
> 
> I will use the term "data schema" (as in XML schema, relational data 
> base schema etc), in an attempt to avoid the confusion between the RIF 
> data model and application specific instances of data schemas.
> 
> I assume that the data to which the interchanged rules apply are not 
> included in the RIF document, the general cases being that either each 
> party uses the interchanged rules with its own data, a common data 
> source is shared by different means (e.g. Web service access), or a data
> document is interchanged separately (the reason for interchanging the 
> rules and data separately, in the latter case, being to separate concerns).
> 
> The problem is thus to make sure that everybody applies the rules to the
> data in a consistent way, that is, that, given the same dataset, every
> consumer of a RIF document will give the same interpretation to the
> rules and to their parts (terms, literals etc).
> 
> The solution is that the parties in an interchange must agree on a
> common data schema and on how to interpret it (in the sense that a data
> schema defines an application's vocabulary, i.e. terms, relations etc.,
> and the datasets provide the interpretations).
> 
> Based on that agreement, each party knows how to map the data schema
> onto their own data structure (or onto the shared data structure), and
> is able to apply the rules to the data in a consistent way, provided
> there is a fixed mapping between the schema language used to specify the
> agreed-on data schema and the RIF data model.
> 
> The question is now: should RIF define a data schema language, so that 
> parties in an interchange can use RIF to specify the common data schema 
> they agree on?
> 
> The benefit would be to avoid one step in the translation from/to RIF
> to/from one's own rule language, as the data models would map onto each
> other without an intermediary.
> 
> Without entering into a theoretical discussion, or a discussion about
> principles and scope, the main drawback of this approach would be
> practical: in many cases where rules need to be interchanged, if not
> most, the parties must interchange data for other reasons as well, or
> use them in a consistent way for reasons other than the consistent
> interpretation of rules, and thus need a shared data schema
> independently of their use of RIF for interchanging rules.
> 
> Many such shared data schemas already exist, and Web languages have been
> designed specifically for the purpose of specifying them (e.g. XML
> schema). In these cases, the user will be reluctant to redefine the
> shared schema in a different schema language, so that RIF will have to
> provide a mapping from its data model onto the main schema languages
> anyway.
> 
> In the case where no shared schema exists, the parties will have to
> specify one and, given that RIF will provide a mapping to the main 
> schema languages, they will have little motivation to choose to specify 
> that schema directly in RIF.
> 
> As a conclusion, I think that specifying a schema language within RIF is 
> adding an unnecessary burden to our already heavy workload, and that we 
> should focus instead on specifying how the RIF data model maps onto 
> existing and widely deployed data schema languages.
> 
> See you in Hawthorne.
> 
> Christian
> 
> 
>
Received on Wednesday, 26 September 2007 17:48:53 UTC