- From: Harry Halpin <hhalpin@w3.org>
- Date: Sun, 12 Dec 2010 22:05:53 -0000 (GMT)
- To: public-rdb2rdf-wg@w3.org
While I was at it, I also reviewed the Direct Mapping Document. Again, comments are mine only, not W3C's, as Ivan/Eric are now staff contacts.

Comments, roughly in order of appearance in the document:

1) Abstract. I'd mention R2RML as one way to do "refinements". s/more intricate/custom/. Remove "and a formal", as we don't have that agreed upon yet, but add it back in when we do.

2) Why do URIs end in "#_"? There is also a problem with how this works in the IRI construction algorithm, as you can get IRIs like "baseIRI/table_name#column_name#_". I'm pretty sure that is not a legal URI: it's a fragid of a fragid. See "Single-column IRI" in 2.2.

3) If we choose to keep the ending with # (thus making it an OK URI for RDF by W3C TAG and in common RDF style), why is there an extra "_" underscore added? While these URIs could resolve to text/html, that's unlikely at best. The extra '_' only makes things more confusing, as that's not common practice in RDF.

4) Issue (hash-vs-slash): While I hope we can preserve ending with "#" and thus put something else between table_names and column_names (maybe put the '_' there instead?), let's keep it "example.org/ex#" rather than "example.org/ex#_".

5) Issue primary-is-candidate-key: Given that we should keep this algorithm simple and direct, and the proposed changes try to guess some things about the structure of the intended RDF, I think we should just ignore the fact that a primary key is also a candidate key and run the algorithm over the tables as normal. Then, if people want to do more complex modelling, they can use RIF or whatever on top of the resulting RDF.

6) Issue hier-table-at-risk. See 5).

7) Issue fk-pk-order. Not sure how this should be handled, but my temptation would be to say see 5).

8) Issue many-to-many-as-repeated properties. Again, see 5).

9) Issue formalism-model: This has been quite the debate, and I think we should de-link the semantics from both Direct Mapping and R2RML, and put them in a separate document.
Second, I think a stringent requirement on any formalism should be that it is capable of handling both Direct Mapping and R2RML, as otherwise we have the situation where possibly incompatible semantics model two different docs. I haven't seen a candidate that does scale to both R2RML and Direct Mapping.

More on motherhood and apple-pie, but formal semantics in general has to involve an interpretation function from some formal definition of syntax (usually done with a BNF) to a mathematical structure in Tarski-style semantics (and so constrains inference), or it directly specifies allowed inferences in proof-theoretic (Gentzen) style semantics. On a high level, it appears both Section 3 and Section 5 are doing the same thing, and until I see test-cases that show otherwise, I think they're basically compatible... whether one likes the functional way z=f(x,y) or one prefers the rule way f(x,y,z) doesn't really matter.

What it appears that we have in Section 3 is a well-defined BNF (i.e. how people usually define syntax precisely) where the production rules involve variables whose values are derived from the table. While it definitely completely specifies the problem by virtue of directly specifying what one should code via a set-theoretic take on some working Scala code, we should not dictate to implementers at that level of detail.

However, in Section 4 we have some rules in what people would think is first-order logic (a variant thereof, Datalog). However, without direct reference to the R2RML semantics, it's not a formal semantics [1]. Therefore, these should be in the same document. Second, functions like generateColumnIRI and whatnot should be specified at a lower level, i.e. give a direct rule for "baseIRI+blah blah+#" in the form of a string concat construction. So, something that involves more precise instructions than Section 4 but not as precise as Section 3 is what is needed.
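As a rough illustration of what such a string-concat rule might look like, here is a minimal Python sketch. The function names (table_iri, column_iri, has_single_fragment), the "_" separator between table and column names, and the percent-encoding of names are all assumptions made here for illustration, not what the Direct Mapping draft specifies. The only point is that each IRI is built by plain concatenation and ends up with at most one "#".

```python
from urllib.parse import quote


def table_iri(base_iri: str, table_name: str) -> str:
    """Hypothetical rule: baseIRI + percent-encoded table name + '#'."""
    return base_iri + quote(table_name, safe="") + "#"


def column_iri(base_iri: str, table_name: str, column_name: str,
               separator: str = "_") -> str:
    """Hypothetical rule: baseIRI + table + separator + column + '#'.

    The separator is an assumption; it replaces the inner '#' so the
    result never has a fragment inside a fragment.
    """
    return (base_iri
            + quote(table_name, safe="")
            + separator
            + quote(column_name, safe="")
            + "#")


def has_single_fragment(iri: str) -> bool:
    """True if the IRI contains at most one '#' delimiter."""
    return iri.count("#") <= 1


if __name__ == "__main__":
    base = "http://example.org/"
    print(table_iri(base, "People"))
    print(column_iri(base, "People", "name"))
    print(has_single_fragment("http://example.org/People#name#_"))
```

A finite list of rules of this shape is something implementers could check off one by one, whatever language or paradigm they code in.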
I think that the best reason for a formal semantics is to help implementers, and giving the implementers a finite list of functions or rules they have to check they implement is a good way to do it. These descriptions should be precise when necessary, but also allow variation in coding and be usable across different programming paradigms.

10) We should provide some more guidance on what to do with the Direct Mapping once you've produced a bunch of RDF from some tables directly. The general story people were telling was to use RIF/some RDF rule language/SPARQL to transform them. I think we should tell people about this option in some way, but leave the specifics to another document (perhaps a Working Group note with some examples).

[1] http://www.w3.org/2001/sw/rdb2rdf/wiki/Semantics_of_R2RML
Received on Sunday, 12 December 2010 22:05:55 UTC