- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Tue, 8 May 2012 21:56:08 +0100
- To: Souripriya Das <souripriya.das@oracle.com>
- Cc: public-rdb2rdf-wg@w3.org
Hi Souri,

I'd rather not require the DB to actually *return* the specific row number in the generated RDF. This prevents many optimizations. For example, it means that you always have to do a full table scan even if a usable index is available, because otherwise the returned row number would change depending on the query. (Maybe some DBs can optimize this, I'm not sure.)

I've written a different proposal in another email. It requires generating a fresh blank node for each logical table row. It only works for blank nodes, which is all we need to support the DM. It's up to the implementation to decide on the blank node label.

Your observation that this feature isn't necessary over R2RML views to support the full DM is a good one. I adapted the proposal accordingly.

Best,
Richard

On 8 May 2012, at 21:22, Souripriya Das wrote:

> Let us take the following Wonderland table or view:
> [Alice, 10]
> [Alice, 10]
> [Alice, 20]
> [Bob, 20]
>
> User specifies the following R2RML mapping utilizing a pseudocolumn "rr:rownum":
>
> <Tmap1>
> rr:logicalTable [ rr:tableName "Wonderland" ]
> rr:subjectMap [ rr:template "http://Wonderland/my_rownum={rr:rownum}" ]
>
> In a ROWNUM capable DB, the mapping processor implicitly converts it to the following R2RML mapping
> (the actual implementation may vary from DB to DB based upon how the equivalent of ROWNUM can be implemented)
>
> <Tmap1>
> rr:logicalTable [ rr:sqlQuery """ Select ROWNUM AS "rr:rownum", t.* from Wonderland t order by "rr:rownum" """ ]
> rr:subjectMap [ rr:template "http://Wonderland/my_rownum={\"rr:rownum\"}" ]
>
> We can also say that rr:rownum cannot be used when logical table is a SQL query. This case is not relevant for "DM-capability-set MINUS R2RML-capability-set" issue. (Also, user should extend the query himself/herself, not delegate it to R2RML processor.)
>
> Addition of the rr:rownum pseudocolumn would allow any DM mapping to be expressible as an R2RML mapping.
>
> Thanks,
> - Souri.
>
> On 5/8/2012 3:02 PM, ashok malhotra wrote:
>> Hi David:
>> Here is the issue as I understand it.
>> If you start with the Relational table
>> [Alice, 10]
>> [Alice, 10]
>>
>> And use naive R2RML on it, you would get
>> _:1 <IOU#BORROWER> "Alice".
>> _:1 <IOU#AMOUNT> 10.
>> _:2 <IOU#BORROWER> "Alice".
>> _:2 <IOU#AMOUNT> 10.
>>
>> Since Blank node identifiers are generated from column values, there is no way to generate
>> different identifiers for _:1 and _:2. Generating the same identifier for them would lose cardinality
>> which can be a problem in some cases.
>>
>> I don't see how to use views to solve this problem. The solutions I see are
>> 1. If you are using Oracle or Postgres use RowID or RowNum to generate blank node identifiers.
>> 2. Use a join to add a column to the table that contains integers in ascending order.
>> (Were you thinking of something like this for your view solution?)
>> 3. Use some other mechanism to generate bnode identifiers, perhaps a function.
>> 4. Add a facility to R2RML that would add a column like RowID
>>
>> It is this last solution that folks are considering.
>> All the best, Ashok
>>
>> On 5/8/2012 11:14 AM, David McNeil wrote:
>>> One of the key issues from today's working group discussion was whether the Direct Mapping of a table with duplicate rows could be represented in R2RML. The observation was made that it cannot be represented in R2RML. Is this still the case if R2RML Views are considered? It seems to me that in many cases a Direct Mapping of a table with duplicate rows could be represented in R2RML with an R2RML View. I think that is a reasonable way to handle the issue in the case of a custom mapping, perhaps it could be used for the Direct Mapping as well? I suppose this might not be acceptable to some if it relied on vendor specific SQL.
>>>
>>> -David
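
To make the cardinality issue in the thread concrete, here is a minimal Python sketch. It is not R2RML and not any processor's actual behaviour. The function names and the blank node labelling schemes (`_:Alice_10`, `_:b1`) are made up for illustration, and the row data simply reuses the four-row table quoted above with the BORROWER/AMOUNT predicates from Ashok's example. It compares deriving the blank node label from column values with minting a fresh blank node per logical table row.

```python
import itertools

# Example rows from the thread; the first two rows are exact duplicates.
rows = [("Alice", 10), ("Alice", 10), ("Alice", 20), ("Bob", 20)]


def triples_value_based(rows):
    """Label each blank node from the row's column values (hypothetical scheme)."""
    graph = set()  # an RDF graph is a set, so identical triples collapse
    for borrower, amount in rows:
        node = f"_:{borrower}_{amount}"
        graph.add((node, "<IOU#BORROWER>", f'"{borrower}"'))
        graph.add((node, "<IOU#AMOUNT>", str(amount)))
    return graph


def triples_fresh_bnode(rows):
    """Mint a fresh blank node per row; the label choice is up to the implementation."""
    counter = itertools.count(1)
    graph = set()
    for borrower, amount in rows:
        node = f"_:b{next(counter)}"
        graph.add((node, "<IOU#BORROWER>", f'"{borrower}"'))
        graph.add((node, "<IOU#AMOUNT>", str(amount)))
    return graph


if __name__ == "__main__":
    for name, fn in [("value-based", triples_value_based),
                     ("fresh bnode", triples_fresh_bnode)]:
        graph = fn(rows)
        subjects = {s for s, _, _ in graph}
        print(f"{name}: {len(subjects)} subjects, {len(graph)} triples")
```

With value-based labels the two [Alice, 10] rows end up as one subject, so the duplicate is lost. With a fresh blank node per row each of the four rows keeps its own subject, and no row number ever has to appear in the output.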
Received on Tuesday, 8 May 2012 20:56:51 UTC