Re: review request for http://www.w3.org/2001/sw/rdb2rdf/use-cases/ from Harry Halpin on 2010-05-11 (public-rdb2rdf-wg@w3.org from May 2010)

From: Harry Halpin <hhalpin@w3.org>
Date: Tue, 11 May 2010 16:51:46 +0100 (BST)
To: public-rdb2rdf-wg@w3.org
Message-ID: <1071ab74a46b93b9bc3578eb9ffe195d.squirrel@webmail-mit.w3.org>
Here's a very long list of minor grammatical changes and some clarifying points
with suggested text. The editors are welcome to take these points or leave
them, but I think taking them on board would lead to a better use-case
document.

The general form is "text string in current doc -> new improved text string"
with occasional clarifying notes preceded by a CAPITALIZED word and then
sometimes minor changes having asterisks around them in "text string". The
point is for the editor to be able to easily s/oldstring/newstring in
their favorite text editor for XMLSpec.
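
As a rough sketch of how a list in this form could be applied mechanically
outside a text editor: the function name apply_edits and the sample sentence
are illustrative assumptions, and nothing here is specific to XMLSpec. It
does plain literal replacement and reports any "old" string it cannot find.

```python
def apply_edits(text, edits):
    """Apply (old, new) literal replacements in order.

    Returns the edited text and the list of old strings that were not found,
    so the editor can check those by hand.
    """
    missing = []
    for old, new in edits:
        if old in text:
            text = text.replace(old, new)
        else:
            missing.append(old)
    return text, missing


# Two edits taken from the list below, applied to a made-up sample sentence.
edits = [
    ("beforementioned use cases", "aforementioned use cases"),
    ("of proposals how to tackle", "of proposals on how to tackle"),
]
sample = ("The beforementioned use cases motivate a number "
          "of proposals how to tackle mapping.")
result, missing = apply_edits(sample, edits)
print(result)
print("not found:", missing)
```

Literal matching will miss strings that are wrapped across lines in the
source, so anything reported as not found still needs a manual look.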

Minor edits:
- Enterprise -> enterprise

- requirements for a relational to RDF mapping with ->  requirements for
mapping relational data to RDF with

- beforementioned use cases -> aforementioned use cases

- The Web of Data is constantly growing due to its compelling potential of
facilitating data integration and retrieval. -> The Web of Data deploys
RDF to expose structured and heterogeneous data on the Web, as RDF has the
compelling potential to facilitate open-ended data integration and
retrieval.
NOTE: We need to clarify the use of the terms "Web of Data" and "Semantic
Web". I'd just say, use "Web of Data" and avoid using the term "Semantic
Web" if possible. We are taking on Google's point that while Linked Data
is growing, it's still fairly small compared to the Web, and also
Microsoft's point that the Web of Data does not necessarily involve RDF,
so we should try to be clear about what we are doing here.

- DELETE "since they follow an Open World Assumption".
NOTE: Not sure if that's true, as it depends on what one means by "powerful",
especially re formal expressivity of SPARQL and SQL :) I'd say "more
useful" as opposed to "powerful" if you want to keep reference to Open
World Assumption.

- Data in the Web should be defined -> RDF data on the Web should be defined

- "for examples in Extract, transform, and load (ETL) processes" -> "for
example in Extract, Transform, and Load (ETL) processes"

- of proposals how to tackle  -> of proposals *on* how to tackle


- For example, imagine that a database administrator is working on
exposing weather data as Linked Data to be consumed by other applications.
At first, this weather data is stored in a light-weight database (such as
MySQL).
-> For example, imagine that a database administrator is exposing weather
data as Linked Data to be consumed by other applications. At first, this
weather data is stored in a light-weight database (such as MySQL).
DELETE "is working on"

-  Another motivation for a standard is that for certain classes of
systems (such as CMS) a 'default' mapping could be defined which can be
deployed no matter what underlying RDB is used
NOTE: addresses Editorial note.


-  Another motivation for a standard is that for certain classes of
systems (such as CMS) a 'default' mapping could be defined which can be
deployed no matter what underlying RDB is used.
ADD As these systems, such as @@X, can sometimes be run on top of
different underlying relational databases, a standardized way of mapping
between relational data and RDF allows the underlying database to be
changed (say from @@Y to @@Z) without disturbing the content management
system.
NOTE: addresses Editorial note, but what exact content management systems
allow this? Drupal? To my knowledge very few CMS systems outside Drupal
use RDF in any substantial way.

-other RDB, XLS, CSV, etc. -> as other relational databases, spreadsheets,
CSV files

- (HTML, PDF, etc) -> (RDF derived automatically or semi-automatically from
the text in HTML, PDF, feeds, etc.)

- (HTML, PDF, etc) -> (HTML, PDF, etc.)
NOTE: Please s/etc/etc. throughout document

-(structured, rdf, unstructured) -> (structured, *RDF*, unstructured)

2.1 UC1-Patient Recruitment

- "While there are many motivations for providing a common interface to
administratively distinct databases (access to patient history, shared
rules for clinical decision support, etc), in this case, SPARQL queries
(following the table description) were used to find candidates for
clinical studies." should be first sentence.

- ADD SENTENCE EXPLAINING WHAT PATIENT RECRUITMENT IS

- each table is are two RDF views  -> each table are two RDF views
DELETE "is"

- data structures -> data structures.
ADD PERIOD

- Why blank middle name in 2.1.1?

- DELETE "The RDF graphs place the relational data into the Semantic Web.
There are many ways to consume RDF data, integration with other data
sources, inference according to OWL or RIF rules, browsing with a linked
data browser like Bubbles or Tabulator." REDUNDANT.

-materialised -> materialized
NOTE: DECIDE ON AMERICAN OR BRITISH ENGLISH

2.2 UC2

- s/Semantic Web/Web of Data

- (e.g. Wikis, Blogs, Fora) -> (wikis, blogs, forums)

- will facilitate broad penetration and enrich the Web with RDF data and
ontologies and facilitate novel -> will  facilitate novel

- DELETE "will facilitate broad penetration and enrich the Web with RDF
data and ontologies and" since verb 'facilitate' is used twice and not
sure enriching the Web by itself with RDF and OWL is really a use-case, as
use-cases should be technology independent.

- Web 2.0 applications the-> Web 2.0 applications, the
ADD COMMA

- To support this usecase scenario, the mapping language -> To support
this use case, the mapping language
DELETE scenario

-REMOVE BOLD a shallow learning curve to foster early adoption by Web
developers.
NOTE: We don't use it anywhere else.

- Wordpress_27_schema.png IS TOO SMALL TO READ

-post, attachment, tag, category, user and comment -> ADD <CODE> TAGS TO
post, attachment, tag, category. An example instance of the post class,->
of the <CODE>post</CODE> class

- Let's use the same background color in UC2 as in UC1, and remove hyperlinks
in the Turtle code.

2.3 UC3 - Integrating Enterprise Relational database

- NOTE: This use-case seems to have a strange structure, let's normalize
it.

- DELETE "Responsible: Angela Fogarolli"

- DELETE "Goal:" and s/PROBLEM:/The re-use of unique identifiers allows:

- Integrating relational databases and exposing them on the web or
intranet based on the final RDB2RDF XG 1.1.3 and 1.1.2 use cases ) through
the use of unique identifiers. This approach consist of integrating  and
interlinking data about entities on different databases. ->

- Integrating relational databases and exposing them on the web or
intranet requires the re-use of unique identifiers in order to integrate 
and interlink data about entities on different databases.

- Join between -> Joins between

- Join structured data (SQL) to structured data, from incompatible schema
-> Join structured relational data  to structured data from incompatible
schema

REMOVE "Requirements:" and the three bullet points below it, those have
already been talked about.

REMOVE "Use Case Description:" heading

- People and -> people and

-their on information systems ->  their own information systems

- buildings etc. ->  buildings, etc.

- REMOVE , and other sources.
NOTE: Not sure what that means

- With our methodology we will provide -> The re-use of unique identifiers
will provide

- In this way we are  providing the user, a tax agent in our case an
intelligent tool for navigating through the data present in the many
different databases. The tool aggregates data and creates a profile for
each tax payer-> In this way we are providing a tax agent with an intelligent
tool for navigating through the data present in the many different
databases. Using unique identifiers, a tool can aggregate data and create
a profile for each tax payer

FIX COMMA SPACING Each user profile shows different type of information ,
with links to other entities such as the buildings owned , payments made ,
location of residence etc.->Each user profile shows different types of
information, with links to other entities such as the buildings owned,
payments made, location of residence, and so on.

- (Anagrafe and Urban_Cadastre)->(<code>Anagrafe</code> and
<code>Urban_Cadastre</code>)

- includs the information-> includes the information

- (person’s residence place) -> (*a* person's residence place)

-other information which are not properties of persons or locations->
other information.
DELETE EXCESSIVE which..

-Put DDL in a normal file, not picture, and then link offline.->Using
RDB2RDF translation methods without unique identifiers, the RDF
representation for the two example tables coming from two different
databases is shown below:

-Please convert the RDF/XML code to N-Triples

-In SQL, there is no way to create a query which joins data of two tables
coming from different databases. For solving the identity problem which is
required by the use case creating a RDF representation is not enough. The
use case demands the use of unique identifier to refer to entities in
order to join descriptions about the same entity coming from different
datasources. ->If we wanted to query these two tables, we would have to
create a unique identifier - such as
<code>http://www.example.org/Paolo_Bouquet</code>  for Paolo Bouquet - to
refer to entities in order to join descriptions about the same entity
coming from different data sources.
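
To make the point about shared identifiers concrete, here is a minimal
sketch under stated assumptions: the column names, the sample values
("Trento", "B42"), and the helper names mint_uri and merge_by_uri are all
hypothetical, and only the table names and the example.org URI come from
the use case. Minting the URI from a person's name is deliberately naive;
it stands in for whatever entity-matching step a real deployment would use.

```python
from urllib.parse import quote

BASE = "http://www.example.org/"  # base of the example URI suggested above


def mint_uri(full_name):
    """Mint an identifier from a name (an assumed, simplistic scheme)."""
    return BASE + quote(full_name.strip().replace(" ", "_"))


# Hypothetical rows standing in for the two tables, which live in
# different databases; the column names are illustrative only.
anagrafe = [{"name": "Paolo Bouquet", "residence": "Trento"}]
urban_cadastre = [{"owner": "Paolo Bouquet", "building": "B42"}]


def merge_by_uri(anagrafe_rows, cadastre_rows):
    """Group descriptions from both sources under one shared identifier."""
    profiles = {}
    for row in anagrafe_rows:
        uri = mint_uri(row["name"])
        profiles.setdefault(uri, {})["residence"] = row["residence"]
    for row in cadastre_rows:
        uri = mint_uri(row["owner"])
        profiles.setdefault(uri, {}).setdefault("buildings", []).append(
            row["building"])
    return profiles


profiles = merge_by_uri(anagrafe, urban_cadastre)
for uri, description in profiles.items():
    print(uri, description)
```

Because both rows resolve to the same identifier, their descriptions end up
in a single profile, which is the join the use case is asking for.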

2.4 UC4

-DELETE hyperlinked "rCAD - RNA Comparative Analysis using SQLServer:"

-Put hyperlink here "implemented the RNA Comparative Analysis Database -rCAD"

-DELETE " The rCAD consists of different schema: Sequence Metadata,
Evolutionary Relationships, Structural Relationships and Sequence
Alignment."
NOTE: Sentence seems odd and out of place.

Rest of UC4 and document minor edits to be sent after telecon...
Received on Tuesday, 11 May 2010 15:51:48 UTC