- From: Joshua Shinavier <joshsh@uber.com>
- Date: Sat, 14 Dec 2019 11:57:10 -0800
- To: Dave Raggett <dsr@w3.org>
- Cc: Juan Sequeda <juanfederico@gmail.com>, Semantic Web <semantic-web@w3.org>
- Message-ID: <CAPc0OutgtAbkxOdS=WaCMZKmR27fydHLo5wfGWtW7gPbNvHxzQ@mail.gmail.com>
Hi Dave,

Thanks for the comments. Chiming in inline.

On Sat, Dec 14, 2019 at 7:27 AM Dave Raggett <dsr@w3.org> wrote:

> [...]
>
> > • How do we create mappings between different data models?
> > • Or should we create a dragon data model that rules them all, such that
> > all data models can be mapped to the dragon data model? If so, what are all
> > the abstract features that a data model should support?
>
> This corresponds to the pros and cons of using upper ontologies vs peer to
> peer mappings. The answer which is best depends on the context and which
> approach proves to be cheaper, more robust etc.

True, and one more step removed, it corresponds to the pros and cons of "hub" datasets like DBpedia, "star" tables in data warehouses, etc. The analogies go on and on. For data models, a star pattern can be valuable for facilitating composability. If you can compose data models in an associative fashion, you can more easily decouple data, queries, and processes from any one model, and carry them across models.

In the graph community, we have been developing pairwise mappings between data models for a long time. Unfortunately, it is usually the case that even if you have a mapping between model A and B, and another mapping between B and C, you don't thereby have a mapping between A and C, because the mappings formalize B in different ways. Property graphs are a frequent "B" because there has been no single agreed-upon property graph formalism.

> > • What is the formalism to represent mappings? Logic? Algebra? Category
> > Theory?
>
> Do we need really need such formalisms? An alternative is to see this as
> figuring out how to define mappings between graphs

Well, now we're back at the schema or instance level. I would say that no single ontology will save you from defining pairwise mappings, because the set of terms we might want to align is unbounded. Less so for data models, which deal with syntax and the most basic semantics.

> based upon the statistics of a set of training examples, e.g. as used by
> Google translate to map text in one human language to text in another
> language. Rather than manually developing mapping rules, we would instead
> focus on curation of examples and counter examples, and scoring mappings on
> a scale of good to bad. Is this blend of graph+statistics in scope for the
> Semantic Web?

I don't see statistical approaches to schema or dataset alignment as being at odds with abstractions for data models and mappings. We need both, although in my experience, we need the abstractions *more urgently*, at least in the context of enterprise data integration. For some definition of "we".

> > • What are the properties that mappings should have? Information, Query
> > and Semantics preserving, composability, etc.
>
> I would emphasis machine learnability!

Again at the schema or instance level, I agree that automated mappings could be extremely useful in some scenarios, saving massive amounts of developer time. Some practical reasons you may see machine-learned mappings less often than human-defined ones are the added expense of data analysis and the complexity of combining data from multiple datasets in a single processing workflow.

Josh

> Dave Raggett <dsr@w3.org>
> http://www.w3.org/People/Raggett
> W3C Data Activity Lead & W3C champion for the Web of things

Received on Saturday, 14 December 2019 19:57:26 UTC