Comments on Use Case for discussion on telcon

I see two options

   1. Have a long list of use cases that show RDB in RDF
   2. Present specific scenarios and identify which use cases belong to each
   scenario.


I believe the choice between the two options depends on the audience of
the use case document. If the audience is going to be mainly the semantic
web community, whose main interest is to publish their RDB data as RDF, then
we should go with the first option (long list of use cases). However, if we
are trying to reach a broader audience (the database community), I suggest
that we go with the second option, because listing use cases that only show
how the RDF from an RDB is going to be exposed doesn't really offer a
motivation for a broader audience (if you believe I am wrong, please let me
know).

Going through last week's minutes [1], I see the following:

mhausenblas: 2 issues: structured usecases

… flat model with all use cases

<juansequeda> +1 mhausenblas We need to choose one

and then

mhausenblas: we go ahead with the flat format

I believe this was a rushed decision and I don't think we actually decided as
a group. Therefore, I would like us to make a formal decision in this meeting
on whether to go with a flat or a structured use case document.

Earlier this week I sent an email about the sub-use cases that we should
have [2]. Angela and Michael replied that they agreed, which is why I am
asking that we revisit the decision between a flat and a structured use case
document.

My proposal for the structured use case is the following (I am copying
some material that I sent out in [2]):

The starting point is the following:

   - Source: RDB (obviously)
   - Destination: a data source in RDF.
      - RDF that comes from a structured source (RDB, XLS, CSV, etc)
      - Existing RDF that is on the Web
      - RDF that comes from unstructured sources (HTML, PDF, etc)


These are the scenarios (stories or use cases):
*Scenario 1.* I want to integrate my RDB with another structured
source (RDB, XLS, CSV, etc), so I'll convert my RDB to RDF and assume
my other structured source can also be in RDF.

    Possible use cases: Patient Recruitment, Integrate Enterprise RDB
    for tax control


*Scenario 2.* I want to integrate my RDB with existing RDF on the web
(linked data), so I'll convert my RDB to RDF and then I'm able to link
and integrate.

    Possible use case: Wordpress, RNA Database, Patient Recruitment



*Scenario 3.* I want to integrate my RDB with unstructured data (HTML,
PDF, etc), so I'll convert my RDB to RDF and assume my other
unstructured source can also be in RDF.

    Possible use case: Patient Recruitment???, ???


*Scenario 4.* I'm not interested in integrating my RDB with other
sources (structured, RDF, unstructured). However, I do want to expose
my RDB as RDF, because I want semantic web search engines that crawl
RDF data to index me. I also want to become a Linked Data hub and let
other people link to me.

    Possible use cases: Wordpress, RNA Database, Patient Recruitment???

Essentially, Scenarios 1-3 are about integrating an RDB with RDF.
Scenario 4 is about just exposing it.


(Does anybody think I am missing a scenario?)


The following are the use cases in the doc [3]; I will place each of
them in one or more scenarios.


*Patient Recruitment:* Talking today with Eric, he said that this use
case can fit in several scenarios, if not all. Looking at the current
text, it states "While there are many motivations for providing a
common interface to administratively distinct databases", which leads
me to put this in Scenario 1. However, if this use case fits in the
other scenarios, there should be text explaining this.


*Web Applications - Wordpress Blog:* I believe this use case is in
Scenario 4. The semantic web needs data, and most data is in RDBs.
Therefore, for the semantic web to become a reality, we need more
RDF. This use case demonstrates this. In addition, it may fit
Scenario 2 if the blog owner would like to link the Wordpress blog
data to other linked data sources. I propose the following text:


*Users of popular web applications backed by relational databases, such
as blogs, wikis, and e-commerce websites, would like to expose their
relational content as RDF on the web so that semantic web search
engines can index blog posts, offerings, etc. For this purpose, the
relational database should be mapped to widely adopted domain
ontologies such as FOAF, SIOC, Dublin Core, and GoodRelations.*
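
To make the mapping idea concrete, here is a minimal sketch of how one row of a blog's posts table could become RDF triples using SIOC and Dublin Core terms. The column names (id, author_id, title), the example.org URI patterns, and the function name are my own assumptions for illustration; they are not taken from the use case document or from any mapping language.

```python
# Hypothetical sketch: the column names and URI patterns are assumptions.
SIOC = "http://rdfs.org/sioc/ns#"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/blog/"  # placeholder base URI


def post_row_to_triples(row):
    """Turn one row of a 'posts' table into (subject, predicate, object) triples."""
    post = f"{BASE}post/{row['id']}"
    author = f"{BASE}user/{row['author_id']}"
    return [
        (post, RDF_TYPE, SIOC + "Post"),          # the row is a sioc:Post
        (post, DCTERMS + "title", row["title"]),  # title column as a literal
        (post, SIOC + "has_creator", author),     # foreign key becomes a link
    ]


triples = post_row_to_triples({"id": 7, "author_id": 3, "title": "Hello RDF"})
for triple in triples:
    print(triple)
```

The point is only that a primary key becomes a subject URI, a plain column becomes a literal, and a foreign key becomes a link to another resource.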


*Integrating Enterprise Relational Databases for tax control:* I think
this use case clearly represents Scenario 1. I propose the following
text:


*Trentino is an autonomous region in the north of Italy with a
population of 1 million and more than 200 municipalities. Each
municipality has data about people, organizations, buildings, etc. in
its own relational database. The goal is to integrate these
heterogeneous relational databases and offer the user, a tax agent, an
intelligent tool for navigating through the data present in the many
different databases. The tool aggregates data and creates a profile
for each taxpayer. Each profile shows different types of
information, with links to other entities such as buildings owned,
payments made, location of residence, etc.*

*Each relational database can be mapped to an ontology that describes
the domain in question. Queries can then be executed on the domain
ontology and translated to specific queries on each relational
database.*
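
As a rough illustration of the translation step, the sketch below rewrites one ontology-level request into one SQL query per database. The ontology terms, database names, table names, and column names are all invented for the example, and a real system would translate full queries rather than simple projections.

```python
# Hypothetical mappings: every ontology term, table, and column is invented.
MAPPINGS = {
    "municipality_a": {"Person": "residents", "name": "full_name", "taxId": "tax_code"},
    "municipality_b": {"Person": "persone", "name": "nome", "taxId": "codice_fiscale"},
}


def translate(cls, properties, mappings=MAPPINGS):
    """Rewrite a request for some properties of a class into SQL for each database."""
    queries = {}
    for db, mapping in mappings.items():
        # Each ontology property is looked up in that database's own mapping.
        columns = ", ".join(f"{mapping[p]} AS {p}" for p in properties)
        queries[db] = f"SELECT {columns} FROM {mapping[cls]}"
    return queries


queries = translate("Person", ["name", "taxId"])
for db, sql in queries.items():
    print(db, "->", sql)
```

The tax agent asks one question in terms of the ontology, and each municipality's database receives a query in terms of its own schema.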


*RNA Database:* I put this use case in Scenario 2. There is existing
linked data about proteins, and we would like to expose our RDB data
as linked data so others can point to us and because we want to point
to others. I propose the following text:


*Rob, from the RNA lab, would like to integrate the data from his RNA
Comparative Analysis Database (rCAD) with other databases such as
GenBank, PDB, etc. These other databases are now exposed as RDF on the
web following the Linked Data principles. Therefore Rob would also
like to expose rCAD as RDF on the web following the Linked Data
principles in order to link the data from rCAD with the data in
GenBank, PDB, etc. Eventually he will be able to execute SPARQL
queries on the web that will return results from different data
sources.*

*In order to expose rCAD as RDF, Rob wants to map rCAD to an existing
domain ontology called the Multiple Alignment Ontology (MAO).*


The other two use cases, Exposing many-to-many join tables as simple
triples and Value based type specification, I honestly do not see
as use cases, but rather as motivation for requirements.
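
For anyone unfamiliar with the first of these, here is a minimal sketch of what "a join table as simple triples" could mean: each row of a two-column join table becomes a single triple linking the two entities, rather than a resource of its own. The student/course example, the example.org URIs, and the enrolledIn predicate are assumptions I made up for illustration.

```python
# Hypothetical sketch: the URIs, the predicate, and the table are invented.
BASE = "http://example.org/"


def join_rows_to_triples(rows, predicate=BASE + "vocab/enrolledIn"):
    """Map each (student_id, course_id) row of a join table to one triple."""
    return [
        (f"{BASE}student/{student_id}", predicate, f"{BASE}course/{course_id}")
        for student_id, course_id in rows
    ]


# Rows of a hypothetical 'enrollment' join table: (student_id, course_id)
triples = join_rows_to_triples([(1, 10), (1, 11), (2, 10)])
for triple in triples:
    print(triple)
```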


OK, so this is my proposal. I honestly do not see how presenting
these scenarios and classifying the use cases in these scenarios will
negatively affect how the audience understands the RDB2RDF issue. I
believe that with a little bit of structure, this can be more
understandable for a wider audience.


[1] http://www.w3.org/2010/04/20-rdb2rdf-minutes.html
[2] http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2010Apr/0058.html
[3] http://www.w3.org/2001/sw/rdb2rdf/use-cases/#uc

Received on Tuesday, 27 April 2010 00:27:39 UTC