- From: Juan Sequeda <juanfederico@gmail.com>
- Date: Mon, 26 Apr 2010 20:27:06 -0400
- To: public-rdb2rdf-wg@w3.org
- Message-ID: <y2nf914914c1004261727x375b95d6qaf98e234f7f855f5@mail.gmail.com>
I see two options:

1. Have a long list of use cases that show RDB in RDF.
2. Present specific scenarios and identify which use case belongs to which scenario.

I believe that the choice between the two options depends on the audience of the use case document. If the audience is going to be mainly the semantic web community, whose main interest is to publish their RDB data as RDF, then we should go with the first option (long list of use cases). However, if we are trying to reach a broader audience (the database community), I suggest that we go with the second option, because listing use cases that only show how the RDF from an RDB is going to be exposed doesn't really offer a motivation for a broader audience. (If you believe I am wrong, please let me know.)

Going through last week's minutes [1], I see the following:

  mhausenblas: 2 issues: structured usecases … flat model with all use cases
  <juansequeda> +1
  mhausenblas We need to choose one

and then

  mhausenblas: we go ahead with the flat format

I believe this was a rushed decision and I don't think we actually decided as a group. Therefore, I would like us to make a formal decision in this meeting on whether we should go with a flat or a structured use case document.

Earlier this week I sent an email about the sub-use cases that we should have [2]. Angela and Michael replied saying that they agreed, hence my proposal to please revisit this decision of having either a flat or a structured use case document.

My proposal for the structured use case document is the following (I am copying and pasting some material that I sent out in [2]).

The starting point is the following:

- Source: RDB (obviously)
- Destination: a data source in RDF.
  - RDF that comes from a structured source (RDB, XLS, CSV, etc.)
  - Existing RDF that is on the Web
  - RDF that comes from unstructured sources (HTML, PDF, etc.)

And these are the scenarios (stories or use cases):

*Scenario 1.* I want to integrate my RDB with another structured source (RDB, XLS, CSV, etc.), so I'll convert my RDB to RDF and assume my other structured source can also be in RDF.
Possible use case: Patient Recruitment, Integrate Enterprise RDB for tax control

*Scenario 2.* I want to integrate my RDB with existing RDF on the web (linked data), so I'll convert my RDB to RDF and then I'm able to link and integrate.
Possible use case: Wordpress, RNA Database, Patient Recruitment

*Scenario 3.* I want to integrate my RDB with unstructured data (HTML, PDF, etc.), so I'll convert my RDB to RDF and assume my other unstructured source can also be in RDF.
Possible use case: Patient Recruitment???, ???

*Scenario 4.* I'm not interested in integrating my RDB with other sources (structured, RDF, unstructured). However, I do want to expose my RDB as RDF because I want semantic web search engines that crawl RDF data to index me, and I want to become a Linked Data hub and let other people link to me.
Possible use case: Wordpress, RNA Database, Patient Recruitment???

Essentially, points 1-3 are about integrating RDB with RDF. Point 4 is about just exposing it. (Does anybody think I am missing any other scenario?)

The following are the use cases in the doc [3]; I will place each of them in one or more scenarios.

*Patient Recruitment:* Talking today with Eric, he said that this use case can fit in several scenarios, if not all. The current text states "While there are many motivations for providing a common interface to administratively distinct databases", which leads me to put this in Scenario 1. However, if this use case fits in the other scenarios, there should be text explaining this.

*Web Applications - Wordpress Blog:* I believe this use case is in Scenario 4. The semantic web needs data, and most of the data is in RDB. Therefore, for the semantic web to become a reality, we need more RDF. This use case demonstrates this. In addition, this may fit Scenario 2 if the Wordpress blog data is to be linked to other linked data sources. I propose the following text:

*Users of popular web applications backed by relational databases, such as blogs, wikis, and e-commerce websites, would like to expose their relational content as RDF on the web so that semantic web search engines can index blog posts, offerings, etc. For this purpose, widely adopted domain ontologies such as FOAF, SIOC, Dublin Core, and GoodRelations should be mapped to the relational database.*

*Integrating Enterprise Relational Databases for tax control:* I think this use case clearly represents Scenario 1. I propose the following text:

*Trentino is an autonomous region in the north of Italy with a population of 1 million and more than 200 municipalities. Each municipality has data about people, organizations, buildings, etc. in its individual relational database. The goal is to integrate heterogeneous relational databases and offer the user, a tax agent, an intelligent tool for navigating through the data present in the many different databases. The tool aggregates data and creates a profile for each tax payer. Each user profile shows different types of information, with links to other entities such as the buildings owner, payments made, location of residence, etc.*

*Each relational database can be mapped to an ontology that describes the domain in question. Queries can then be executed on the domain ontology and then translated to specific queries on each relational database.*

*RNA Database:* I put this use case in Scenario 2.
There is existing linked data about proteins, and we would like to expose our RDB data as linked data so others can point to us and because we want to point to others. I propose the following text:

*Rob, from the RNA lab, would like to integrate the data from his RNA Comparative Analysis Database (rCAD) with other databases such as GenBank, PDB, etc. These other databases are now exposed as RDF on the web following the Linked Data principles. Therefore Rob would also like to expose rCAD as RDF on the web following the Linked Data principles in order to link the data from rCAD with the data in GenBank, PDB, etc. Eventually he will be able to execute SPARQL queries on the web that will return results from different data sources.*

*In order to expose rCAD as RDF, Rob wants to map rCAD to an existing domain ontology called the Multiple Alignment Ontology (MAO).*

As for the other two use cases, "Exposing many-to-many join tables as simple triples" and "Value based type specification", I honestly do not see them as use cases, but rather as motivation for requirements.

OK, so this is my proposal. I honestly do not see how presenting these scenarios and classifying the use cases into them would adversely affect how the audience understands the RDB2RDF issue. I believe that with a little bit of structure, this can be more understandable for a wider audience.

[1] http://www.w3.org/2010/04/20-rdb2rdf-minutes.html
[2] http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2010Apr/0058.html
[3] http://www.w3.org/2001/sw/rdb2rdf/use-cases/#uc

Received on Tuesday, 27 April 2010 00:27:39 UTC