- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Mon, 19 Apr 2010 18:02:41 -0400
- To: Daniel Miranker <miranker@cs.utexas.edu>
- Cc: Juan Sequeda <juanfederico@gmail.com>, Michael Hausenblas <michael.hausenblas@deri.org>, RDB2RDF WG <public-rdb2rdf-wg@w3.org>
* Daniel Miranker <miranker@cs.utexas.edu> [2010-04-19 15:21-0500]
>
> On Apr 19, 2010, at 12:21 PM, Eric Prud'hommeaux wrote:
>
> >* Juan Sequeda <juanfederico@gmail.com> [2010-04-19 10:48-0500]
> >>Michael
> >>
> >>I just updated [1] again. I moved the section Expressivity,
> >>written by Eric to [2] . In the spreadsheet, we have the
> >>following taxonomy per Expressivity:
> >>
> >>   1. Expressivity
> >>      - Node Label Generation: Graph node names are synthesized
> >>        from a function of database attributes
> >>      - Datatype expressions
> >>         - Simple
> >>         - Relational data (cells) are mapped to rdf datatypes
> >>           per SQL XSD mapping.
> >>         - Microparsing: Relational data are parsed and mapped
> >>           to rdf graphs.
> >>
> >>But now that we dig into it I think it is redundant with what is
> >>already in [1] . In particular,
> >>
> >>“b Node Label Generation” appears to be the same as the role of
> >>the ontology being putative,
> >
> >I don't see the connection. If I am exporting a human resources
> >database in say FOAF and vcal, I am likely to generate resources based
> >on some function of attribute names:
> >  http://myco.example/Employee?id=218 foaf:givenName "Bob" .
> >Likewise, if I accept sort of a default ontology from the database
> >structure, I may want to do the same:
> >  http://myco.example/Employee?id=218 Employee:fname "Bob" .
> >
> >The mapping language could be much simpler if it did not handle graph
> >transformations (simple mapping of attributes to predicates), but I
> >know that several use cases are not met by that.
>
> I just looked up the (or at least a) FOAF Spec.
>
> http://xmlns.com/foaf/spec/
> "This document is created by combining the RDFS/OWL machine-readable
> FOAF ontology"
>
> So if you are exporting data to FOAF then per the taxonomy you are
> mapping relational data wrt an existing domain ontology.
>
> I won't be surprised if there is confusion/ambiguity about this.
> Recall when I first emerged with the taxonomy I qualified that I was
> more interested in making sure we mean the same thing when we are
> talking with each other than any proprietary interest I have in the
> taxonomy.

I believe I provided a counterexample to the assertion that functional
label generation was captured by the "putative ontology". FOAF is a
pre-existing ontology, but I still may want to label the nodes in the
generated graph.

> >>“c Data type expression” and its three cases appear to be the same as
> >>classifying the treatment of relational data sources at the
> >>start of the document
> >
> >Sorry, I'm not following this. Could you give examples of the
> >redundancy?
>
> I have written
>
> i. Structured
>
> Consider only highly structured database content. String and other
> text fields are not considered valuable.
>
> ii. Structured + Semistructured
>
> Text fields are considered valued but are treated simply as unparsed
> strings.
>
> iii. Structured + Microparsed Tagged Text
>
> Text fields in the database are parsed into an RDF graph per an
> existing ontology.
>
> You have written
>
> >Datatype expressions
> >   - Simple
> >   - Relational data (cells) are mapped to rdf datatypes per
> >     SQL XSD mapping.
> >   - Microparsing: Relational data are parsed and mapped to rdf
> >     graphs.
> >
>
> We may need to hash out/refine the subcategories and their titles,
> but, at the level of detail we are at this time I'm thinking
>
> Simple == Structured
> Relational Data... per SQL XSD Mapping == Structured + Semistructured
> Microparsing == Structured + Microparsed Tagged Data

Given a row in a protein database with a primary key attribute "ID"
and another unique attribute "uniProt":
  | ID | uniProt | name    | seqLength |
  | 18 | 68250   | "YYHAB" | "246 AA"  |
I would like to ask the world which of the following subject mappings
they need:
  <http://mydb.example/prots/ID=18> db:name "YYHAB" .
  <http://mydb.example/prots18/more/path> db:name "YYHAB" .
  <http://www.uniprot.org/uniprot/P68250> db:name "YYHAB" .
The first uses a potentially hard-coded formula, the second uses a
user-supplied function of the primary key, and the third uses a
function of a different attribute to produce a common proteomic node
label.

The SQL string "YYHAB" is, in the above examples, expressed directly
as an RDF Plain Literal (perhaps the SQL draft suggests
"YYHAB"^^xsd:string, I don't recall). Expressing the seqLength would
be an opportunity for micro-parsing, as it encodes both the length (an
integer) and the unit.

I haven't found good examples of need for user-defined datatypes.
Perhaps there are some oddball types in SQL that don't have a defined
XSD representation. I guess any blob that could be micro-parsed could
instead be given a special type.

> Possibly a difference in our thinking is that you may be looking at
> row content as a row in a CSV file, divorced from
> the column names and SQL data types; thus the entire content of the
> row depends on parsing.
>
> This is why, in part, my category names are Structured + something.
>
> I'm also wondering if your designations of expressivity belong in
> the requirement section on the language.
> Note I've broken up the requirements into two parts, 1) those
> mechanical requirements on the language, e.g. its connections to RIF,
> and requirements on syntactic convention, In other words
> requirements of languages that come from the Semantic Web community
> 2) The requirements that come from the applications/end users.
>
> >
> >>• We have updated that section to include Eric’s requirement that
> >>microparsing produce an RDF graph.
> >
> >Do we have use cases supporting that? I merely meant to point out that
> >it was an option, but I don't think anyone has asked for it yet.
>
> No we don't have any use cases. However, an entire half of my
> application facing life
> is with systematic biologists.
> They have so many databases of
> tables of 4 or 5 columns of structured
> data, with another 3 or 4 columns of text field, it is painful.
> Even something like the geographic
> location where a specimen was collected will usually be in a text
> field that could contain anything
> from a lat/long to "50 feet in front of Tom Miller Dam in Austin",
> and everything in between.
>
> >
> >>• Similarly, we have penciled in the rdf datatype mapping in
> >>Section 2.
> >>
> >>This is just to let everybody know what happened to this part.
> >>
> >>We can discuss this tomorrow.
> >>
> >>In conclusion, [1] is ready (even though it still needs to be
> >>expanded)
> >>
> >>[1] http://www.w3.org/2001/sw/rdb2rdf/wiki/Use_Cases_and_Requirements
> >>[2] http://www.w3.org/2001/sw/rdb2rdf/wiki/Draft_of_Use_Cases
> >><http://www.w3.org/2001/sw/rdb2rdf/wiki/Use_Cases_and_Requirements>
> >>Juan Sequeda
> >>+1-575-SEQ-UEDA
> >>www.juansequeda.com
> >>
> >>
> >>On Mon, Apr 19, 2010 at 10:32 AM, Michael Hausenblas <
> >>michael.hausenblas@deri.org> wrote:
> >>
> >>>
> >>>Great work, Juan!
> >>>
> >>>We (Eric and I) take over for now (consider the Wiki stable
> >>>for the moment)
> >>>in order to compile a version for tomorrow's meeting at [1].
> >>>
> >>>
> >>>Cheers,
> >>>      Michael
> >>>
> >>>[1] http://www.w3.org/2001/sw/rdb2rdf/use-cases/
> >>>
> >>>--
> >>>Dr. Michael Hausenblas
> >>>LiDRC - Linked Data Research Centre
> >>>DERI - Digital Enterprise Research Institute
> >>>NUIG - National University of Ireland, Galway
> >>>Ireland, Europe
> >>>Tel. +353 91 495730
> >>>http://linkeddata.deri.ie/
> >>>http://sw-app.org/about.html
> >>>
> >>>
> >>>
> >>>>From: Juan Sequeda <juanfederico@gmail.com>
> >>>>Date: Mon, 19 Apr 2010 10:29:17 -0500
> >>>>To: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
> >>>>Subject: Draft of Use Case document on Wiki (as promised)
> >>>>Resent-From: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
> >>>>Resent-Date: Mon, 19 Apr 2010 15:29:52 +0000
> >>>>
> >>>>Hi Everybody
> >>>>
> >>>>You can find an updated version of the Use Case document
> >>>>here [1]. This is the original page that we have been adding
> >>>>all the use cases. I created a Draft of Use Cases page here [2].
> >>>>
> >>>>Therefore, we should be focusing on [1]. I still need to add
> >>>>some of the UML, DDL, etc.
> >>>>
> >>>>I'm following the example that Michael once gave out from
> >>>>the RDFa Use Case document [3] where they originally show HTML
> >>>>(before) and then show HTML+RDFa (after). I think this is
> >>>>something that we should do in this document. However, I guess
> >>>>this may be up for discussion.
> >>>>
> >>>>Looking forward to your comments
> >>>>
> >>>>[1] http://www.w3.org/2001/sw/rdb2rdf/wiki/Use_Cases_and_Requirements
> >>>>[2] http://www.w3.org/2001/sw/rdb2rdf/wiki/Draft_of_Use_Cases
> >>>><http://www.w3.org/2001/sw/rdb2rdf/wiki/Use_Cases_and_Requirements>
> >>>>[3] http://www.w3.org/TR/xhtml-rdfa-scenarios/#use-case-1
> >>>>
> >>>>Juan Sequeda
> >>>>+1-575-SEQ-UEDA
> >>>>www.juansequeda.com
> >>>
> >>>
> >
> >--
> >-ericP
>

--
-ericP
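
As a footnote to the protein-row example in this message, the three subject mappings and the seqLength micro-parse can be sketched in Python. This is only an illustration under stated assumptions: the function names, the template syntax, and the "P" prefix rule for the UniProt label are hypothetical and inferred from the single example row. They are not part of any RDB2RDF draft.

```python
import re

# The example row from the message, keyed by column name.
row = {"ID": 18, "uniProt": 68250, "name": "YYHAB", "seqLength": "246 AA"}


def default_subject(row):
    """Hard-coded formula: table path plus primary key (hypothetical)."""
    return f"http://mydb.example/prots/ID={row['ID']}"


def templated_subject(row, template="http://mydb.example/prots{ID}/more/path"):
    """User-supplied function of the primary key, given here as a template."""
    return template.format(**row)


def uniprot_subject(row):
    """Function of a different unique attribute, yielding a shared label.

    Assumption: the label is 'P' followed by the uniProt column value,
    which is what the single example in the message suggests.
    """
    return f"http://www.uniprot.org/uniprot/P{row['uniProt']}"


def microparse_seq_length(cell):
    """Split a cell such as '246 AA' into an integer length and its unit."""
    match = re.fullmatch(r"\s*(\d+)\s+(\w+)\s*", cell)
    if match is None:
        raise ValueError(f"cannot micro-parse {cell!r}")
    return int(match.group(1)), match.group(2)


for make_subject in (default_subject, templated_subject, uniprot_subject):
    # Emit the db:name triple once per subject-mapping strategy.
    print(f'<{make_subject(row)}> db:name "{row["name"]}" .')

length, unit = microparse_seq_length(row["seqLength"])
print(length, unit)
```

The three functions differ only in which attribute feeds the node label and who supplies the formula, which is the distinction the question to "the world" turns on.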
Received on Monday, 19 April 2010 22:03:17 UTC