More comments on use case and req doc from Juan Sequeda on 2010-05-22 (public-rdb2rdf-wg@w3.org from May 2010)

From: Juan Sequeda <juanfederico@gmail.com>
Date: Sat, 22 May 2010 09:42:18 -0500
To: public-rdb2rdf-wg@w3.org
Message-ID: <AANLkTim14zJJIVHxevgo82xxEAsfJ0Rtl8X0h0JSbadV@mail.gmail.com>
1 Introduction

·       the Resource Description Framework (RDF) is used --> drop “the”

·       … use cases and requirements for a relational to RDF mapping --> should
be “relational database to RDF mapping”

1.4 Glossary

I would add the following terms:

·       Local Ontology: an ontology that has been derived from the
relational schema

·       Domain Ontology: an ontology that has been developed by experts in
the domain and accepted by a community (e.g. FOAF, SIOC, Gene Ontology)

2.1 UC1 – Patient Recruitment

·       Paragraph 1, Sentence 2: drop “and an equivalent SQL query”

·       Do we really need 6 tables? Each table isn’t adding anything new. I
would suggest taking out at least 3.

·       Paragraph 2, Sentence 1: the term “data structures” shows up. This
term is used three times in the whole document and each time it has a
different meaning. If I understand correctly, in this section, data
structure means an ontology. If I recall, Eric told me that he extracted the
ontology from an XML schema for HL7/RIM and CDISC SDTM (whatever that is
supposed to mean). I therefore suggest changing the sentence to the
following: “Accompanying each table are two RDF views (represented in
Turtle) corresponding to the HL7/RIM and CDISC SDTM ontologies in RDFS.”

·       Paragraph 2, Sentence 2: administratively --> administer

2.2 UC2 – Web Applications

·       Paragraph 1, Sentence 3: map the relational data structures … --> map
the relational data and schema

2.3 UC3

Rewrite the initial part as follows (this is basically putting
everything together without the headers):

The goal of this use case is to integrate relational databases and expose
them on the web or an intranet through the use of unique identifiers. This
approach consists of integrating and interlinking data about entities in
different databases.

This use case is a pilot project for the Trentino region tax agency.
Trentino is an autonomous region in the north of Italy. The region has a
population of 1 million and more than 200 municipalities with their own
information systems. The goal is to integrate and link tax-related data
about people, organizations, buildings, etc. These data come from different
databases, especially from the region’s many municipalities, each with its
own individual schema. With our methodology we will provide a lightweight
method for aggregating the data. In this way we are providing the user, a
tax agent in our case, with an intelligent tool for navigating through the
data present in the many different databases. The tool aggregates data and
creates a profile for each taxpayer. Each profile shows different types
of information, with links to other entities such as the buildings owned,
payments made, location of residence, etc.

3.1  Approaches

·       Relation structure --> relational schema

3.2 Database to Ontology Mapping

·       The title and content are mixed up with 3.3. The title and content
given for 3.3 are actually the ones for 3.2

4.1.1 DIRECT – Direct Mapping

Comment: This section is still very confusing. Reading this you think that a
relational graph can only have edges from foreign/primary keys. But then you
realize in the second paragraph that attributes can also be part of the
relational graph. Edges can’t be expressed as RDF triples; an edge is the
predicate of a triple. I suggest completely rewriting this section as
follows (however, I leave this open to discussion):

Relational schema and data form a potentially cyclic graph where nodes are
tables or tuples and edges are either foreign/primary key relationships or
table attributes.  This relational graph can be directly mapped to an RDF
graph where the nodes of the relational graph correspond to the subject or
object and the edges of the relational graph correspond to the predicate.
This directly mapped RDF graph represents exactly the information in the
relational database. The relational schema can be directly mapped to an
RDFS/OWL ontology while the relational data is directly mapped to an RDF
graph, which is an instance of the RDFS/OWL ontology. This ontology is
considered the local ontology. This RDFS/OWL ontology can be used when it is
desired to let the database schema determine the effective ontology of the
RDF view. An example of direct mapping is shown in Section 3.1.

A minimal configuration MUST provide a (virtual) RDF graph representing the
attributes and relationships between tuples in the relational database.
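
To make the proposed wording concrete, here is a minimal sketch of such a
direct mapping in Python. It is only an illustration under assumptions that
are not in the draft: the base IRI, the IRI patterns for tables, tuples and
columns, the function name direct_map and the toy teacher row are all
hypothetical. Each tuple becomes a node, each attribute or foreign key
becomes a predicate, and a foreign key value points at another tuple node
instead of a literal.

```python
# Hypothetical base IRI; the draft does not prescribe any IRI scheme.
BASE = "http://example.org/db/"


def direct_map(table, pk, rows, foreign_keys=None):
    """Return a set of (subject, predicate, object) triples for one table.

    foreign_keys maps a column name to the name of the referenced table.
    IRIs are plain strings; literals are strings wrapped in double quotes.
    """
    foreign_keys = foreign_keys or {}
    triples = set()
    for row in rows:
        # Each tuple is a node, identified here by table name and primary key.
        subject = f"{BASE}{table}/{row[pk]}"
        triples.add((subject, "rdf:type", f"{BASE}{table}"))
        for column, value in row.items():
            if value is None:
                continue  # assumption: a NULL value yields no triple
            # Each attribute or key column is an edge, i.e. a predicate.
            predicate = f"{BASE}{table}#{column}"
            if column in foreign_keys:
                # Foreign key: the edge points at another tuple node.
                obj = f"{BASE}{foreign_keys[column]}/{value}"
            else:
                # Plain attribute: the edge points at a literal.
                obj = f'"{value}"'
            triples.add((subject, predicate, obj))
    return triples


# Toy data, invented for this sketch.
rows = [{"id": 1, "name": "Alice", "dept": 10}]
triples = direct_map("teacher", "id", rows, {"dept": "department"})
```

The sketch covers only the data; deriving the local ontology from the
schema would be a separate step.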

Note: I would suggest dropping the two images.

4.1.2 Transform

Eliminate 4.1.2.1 header and leave that content as part of 4.1.2

I suggest rewriting to the following (however, open for discussion):

It is good Semantic Web practice to re-use existing domain ontologies.
Mapping the relational graph or the local ontology to a domain
ontology usually requires graph transformations. An example of this
transform mapping is shown in Section 3.2. The local ontology considers the
teacher classification (Math, Physics, etc.) as literal values while in the
domain ontology the teacher classifications are RDFS/OWL classes.
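
As a rough illustration of the kind of graph transformation meant here, the
following Python sketch rewrites a classification literal from the local
ontology into an rdf:type statement on a domain class. The two namespaces,
the "classification" property name, the "...Teacher" class naming pattern
and the sample triples are assumptions made up for this sketch; they are not
taken from the example in Section 3.2.

```python
# Hypothetical namespaces for the local and the domain ontology.
LOCAL = "http://example.org/db/teacher#"
DOMAIN = "http://example.org/school#"


def transform(triples):
    """Replace classification literals with rdf:type links to domain classes."""
    out = set()
    for s, p, o in triples:
        if p == LOCAL + "classification":
            # Literal "Math" becomes membership in the class MathTeacher.
            out.add((s, "rdf:type", DOMAIN + o.strip('"') + "Teacher"))
        else:
            # Every other triple is carried over unchanged.
            out.add((s, p, o))
    return out


# Toy directly mapped triples, invented for this sketch.
teacher = "http://example.org/db/teacher/1"
local_graph = {
    (teacher, LOCAL + "name", '"Alice"'),
    (teacher, LOCAL + "classification", '"Math"'),
}
domain_graph = transform(local_graph)
```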

4.1.2.2 LABELGEN change to 4.1.3

4.1.2.3 DATATYPES change to 4.1.4


Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com
Received on Saturday, 22 May 2010 14:42:51 UTC