Re: Comments on Use Case for discussion on telcon from Harry Halpin on 2010-04-27 (public-rdb2rdf-wg@w3.org from April 2010)

From: Harry Halpin <hhalpin@w3.org>
Date: Tue, 27 Apr 2010 05:22:47 +0100 (BST)
To: "Juan Sequeda" <juanfederico@gmail.com>
Cc: public-rdb2rdf-wg@w3.org
Message-ID: <1a10f1de0bb8e3b14e475874b31f7769.squirrel@webmail-mit.w3.org>
> I see two options
>
>    1. Have a long list of use cases that show RDB in RDF
>    2. Present specific scenarios and identify which scenario each use
>    case belongs to.
>
>
> I believe that the choice between the two options depends on the
> audience of the use case document. If the audience is going to be
> mainly the semantic web community, whose main interest is to publish
> their RDB data as RDF, then we should go with the first option (a long
> list of use cases). However, if we are trying to reach a broader
> audience (the database community), I suggest that we go with the second
> option, because listing use cases that only show how the RDF from an
> RDB is going to be exposed doesn't really offer a motivation for a
> broader audience (if you believe I am wrong, please let me know).
>
> Going through last week's minutes [1], I see the following:
>
> mhausenblas: 2 issues: structured usecases
>
> … flat model with all use cases
>
> <juansequeda> +1 mhausenblas We need to choose one
>
> and then
>
> mhausenblas: we go ahead with the flat format
>
> I believe this was a rushed decision, and I don't think we actually
> decided as a group. Therefore, I would like us to make a formal decision
> in this meeting on whether to go with a flat or a structured use case
> document.
>
> Earlier this week I sent an email about the sub-use cases that we should
> have [2]. Angela and Michael replied saying that they agreed, hence my
> proposal to please revisit this decision of having either a flat or
> structured use case document.
>
> My proposal for the structured use case is the following (I am copying
> and pasting some material that I sent out in [2]):
>
> The starting point is the following:
>
>    - Source: RDB (obviously)
>    - Destination: a data source in RDF.
>       - RDF that comes from structured source (RDB, XLS, CSV, etc)
>       - Existing RDF that is on the Web
>       - RDF that comes from unstructured sources (HTML, PDF, etc)
>
>
> These are the scenarios (stories or use cases):
> *Scenario 1.* I want to integrate my RDB with another structured
> source (RDB, XLS, CSV, etc), so I'll convert my RDB to RDF and assume
> my other structured source can also be in RDF.
>
>     Possible use case: Patient Recruitment, Integrate Enterprise RDB
> for tax control
>
>
> *Scenario 2.* I want to integrate my RDB with existing RDF on the web
> (linked data), so I'll convert my RDB to RDF and then I'm able to link
> and integrate
>
>     Possible use case: Wordpress, RNA Database, Patient Recruitment
>
>
>
> *Scenario 3.* I want to integrate my RDB with unstructured data (HTML,
> PDF, etc), so I'll convert my RDB to RDF and assume my other
> unstructured source can also be in RDF.
>
>     Possible use case: Patient Recruitment???, ???
>
>
> *Scenario 4.* I'm not interested in integrating my RDB with other
> sources (structured, RDF, unstructured). However, I do want to expose
> my RDB as RDF because I want semantic web search engines that crawl
> RDF data to index me and I want to become a Linked Data hub and let
> other people link to me.
>
>     Possible use case: Wordpress, RNA Database, Patient Recruitment???
>
> Essentially points 1-3 are about integrating RDB with RDF. Point 4 is
> about just exposing it.
>
>
> (does anybody think I am missing any other scenario?)
>
>
> The following are the use cases in the doc [3]; I will place each of
> them in one or more scenarios.
>
>
> *Patient Recruitment:* Talking today with Eric, he said that this use
> case can fit in several scenarios, if not all. Looking at the current
> text, it states "While there are many motivations for providing a
> common interface to administratively distinct databases", which leads
> me to put this in Scenario 1. However, if this use case fits in the
> other scenarios, there should be a text explaining this.
>
>
> *Web Applications - Wordpress Blog:* I believe this use case is in
> Scenario 4. The semantic web needs data, and most of the data is in
> RDB. Therefore for the semantic web to become a reality, we need more
> RDF. This use case demonstrates this. In addition, this may fit
> Scenario 2 if the Wordpress blog data would like to be linked to other
> linked data sources. I propose the following text:
>
>
> *Users of popular web applications backed by relational databases such
> as blogs, wikis, e-commerce websites would like to expose their
> relational content as RDF on the web in order that semantic web search
> engines can index blog posts, offerings, etc. For this purpose, widely
> adopted domain ontologies such as FOAF, SIOC, Dublin Core,
> GoodRelations should be mapped to the relational database.*
>
>
> *Integrating Enterprise Relational Databases for tax control:* I think
> this use case clearly represents Scenario 1. I propose the following
> text
>
>
> *Trentino is an autonomous region in the north of Italy with a
> population of 1 million and more than 200 municipalities. Each
> municipality has data about people, organizations, buildings, etc. in
> its individual relational database. The goal is to integrate
> heterogeneous relational databases and offer the user, a tax agent, an
> intelligent tool for navigating through the data present in the many
> different databases. The tool aggregates data and creates a profile
> for each tax payer. Each user profile shows different types of
> information, with links to other entities such as the buildings owner,
> payments made, location of residence, etc.*
>
> *Each relational database can be mapped to an ontology that describes
> the domain in question. Queries can then be executed on the domain
> ontology and then translated to specific queries on each relational
> database.*
>
>
> *RNA Database:* I put this use case in Scenario 2. There is existing
> linked data about proteins, and we would like to expose our RDB data
> as linked data so others can point to us and because we want to point
> to others. I propose the following text:
>
>
> *Rob, from the RNA lab, would like to integrate the data from his RNA
> Comparative Analysis Database (rCAD) with other databases such as
> GenBank, PDB, etc. These other databases are now exposed as RDF on the
> web following the Linked Data principles. Therefore Rob would also
> like to expose rCAD as RDF on the web following the Linked Data
> principles in order to link the data from rCAD with the data in
> GenBank, PDB, etc. Eventually he will be able to execute SPARQL
> queries on the web that will return results from different data
> sources.*
>
> *In order to expose rCAD as RDF, Rob wants to map rCAD to an existing
> domain ontology called the Multiple Alignment Ontology (MAO).*
>
>
> The other two use cases, Exposing many-to-many join tables as simple
> triples and Value based type specification, I honestly do not see as
> use cases, but rather as motivation for requirements.
>
>
> OK, so this is my proposal. I honestly do not see how presenting
> these scenarios and classifying the use cases in these scenarios will
> affect how the audience understands the RDB2RDF issue. I believe
> that with a little bit of structure, this can be more understandable
> for a wider audience.
>

Note there are two *distinct* proposals in this e-mail. One is for a 4-way
meta-structuring of our 6 use-cases, and the other is for adding some text
to the existing cases. We should formally consider these separately
tomorrow.

>
> [1] http://www.w3.org/2010/04/20-rdb2rdf-minutes.html
> [2] http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2010Apr/0058.html
> [3] http://www.w3.org/2001/sw/rdb2rdf/use-cases/#uc
>
Received on Tuesday, 27 April 2010 04:22:48 UTC