Re: Proposal to extend rangeIncludes of DataTypes predicates in schema.org

Thanks for the details regarding OWL 2, appreciated!

But schema.org is only very, very loosely related to OWL and OWL 2; its underlying meta-model is not based on OWL and the main consumers of schema.org data are not processing the data in an OWL environment.


Martin

On 12 Nov 2014, at 16:21, Simon Spero <sesuncedu@gmail.com> wrote:

> [I have suggested going the other way, and having the current RDFa master be generated from some more expressive format. Some restricted form of Manchester OWL Syntax or Cyc KE, maybe?]
> 
> 1. OWL 2 provides for data type unions, so a data range can consist of multiple xsd primitive types.
> 
> OWL adds two numeric types: owl:real and owl:rational. owl:real contains owl:rational, which contains xsd:decimal, as well as the various integer types.
> 
> xsd:double and xsd:float are not reals, but are a separate part of the OWL 2 datatype map.
> 
> The schema:Float subtype would seem to cause problems if it is translated as xsd:float - however it is so underspecified (Full specification: "floating number") that it could just as well be xsd:decimal.
> 
> The latter seems more plausible, but in any event, the value spaces can be covered with:
> 
> DatatypeDefinition(schema:Number
>       DataUnionOf (owl:real xsd:float xsd:double))
> 
> 2. The kluge comes out to play when a schema property has a range that is the union of a datatype and an object type. The ingesting pre-processor needs to box the literals, and the box type probably needs a HasKey axiom or a stronger canonicalization.
> 
> Retrieval pre/post processors need to unbox. This can make searches more expensive (depending on how data is stored and how effective caching is). Equality is cheap, but inequalities may not be.
> 
> An alternative is for a preprocessor to create separate ersatz data properties; this gets ugly if you want to have cardinality constraints, but may improve some searches.
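
The boxing step and the ersatz-property alternative described in point 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: triples are plain Python tuples, the names ex:Box and ex:boxedValue and the "_literal" suffix are hypothetical, and sharing one box per distinct literal merely approximates what a HasKey axiom would give.

```python
from itertools import count

# Hypothetical vocabulary, used for illustration only.
BOX_TYPE = "ex:Box"
BOX_VALUE = "ex:boxedValue"
RDF_TYPE = "rdf:type"


class Literal(str):
    """Marks an object as a literal rather than an IRI or blank node."""


def box_literals(triples, mixed_range_props):
    """Replace literal objects of mixed-range properties with box nodes.

    Equal literals share a single box, which approximates a HasKey
    axiom on the boxed value.
    """
    boxes = {}      # literal value -> box node id
    ids = count(1)
    out = []
    for s, p, o in triples:
        if p in mixed_range_props and isinstance(o, Literal):
            if o not in boxes:
                boxes[o] = f"_:box{next(ids)}"
                out.append((boxes[o], RDF_TYPE, BOX_TYPE))
                out.append((boxes[o], BOX_VALUE, o))
            o = boxes[o]
        out.append((s, p, o))
    return out


def unbox(triples):
    """Retrieval-side step: put the boxed literals back in place."""
    values = {s: o for s, p, o in triples if p == BOX_VALUE}
    return [
        (s, p, values.get(o, o))
        for s, p, o in triples
        # Drop the bookkeeping triples that describe the boxes themselves.
        if p not in (BOX_VALUE, RDF_TYPE) or s not in values
    ]


def split_properties(triples, mixed_range_props, suffix="_literal"):
    """Alternative: route literals to a separate ersatz data property."""
    return [
        (s, p + suffix if p in mixed_range_props and isinstance(o, Literal) else p, o)
        for s, p, o in triples
    ]
```

With boxing, an equality search only has to compare box node ids, whereas an inequality has to look through every box to its value, which is where the extra cost mentioned above would show up.
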
> 
> Simon
> 
> On Nov 12, 2014 6:37 AM, "martin.hepp@ebusiness-unibw.org" <martin.hepp@ebusiness-unibw.org> wrote:
> Hi Marc:
> 
> Thanks. Note that the "subtyping" approach is not without caveats either, since schema:Number is a supertype of multiple numerical XSD datatypes. Also, we should carefully check the implications for clients consuming schema.org instance data in OWL. While I have not yet checked whether there is a clever work-around for that in OWL 2, we may end up being in OWL Full.
> 
> I think the better approach for tackling this problem is the following:
> 
> - A short statement in the schema.org documentation at http://schema.org/docs/documents.html, e.g. a "datatypes.html", which explains the basic mapping to XSD and indicates that XSD datatypes are fine to use in RDF-based syntaxes.
> 
> - Implementing this in the Google Structured Data Testing Tool and the Google/Bing/Yahoo/Yandex production systems.
> 
> This would allow anybody to publish the respective RDF data while still serving major search engines, without spoiling the documentation or model.
> 
> Instead of a "datatypes.html" document, one could even go a bit further and add a "schema.org in RDF syntaxes" document that explains the use of schema.org in RDF-based syntaxes, including the use of typed literals.
> 
> So a pull request would ideally be such an HTML file, not the rangeIncludes axioms.
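
To make the idea of a documented mapping plus tolerant consumers concrete, here is a minimal sketch. The mapping table is an assumption for illustration only (the thread does not fix one, and schema:Float in particular is underspecified), and accepts() is a hypothetical helper, not part of any search engine or validator.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumed, illustrative mapping from schema.org datatypes to the XSD
# datatypes a tolerant consumer might accept for them.
SCHEMA_TO_XSD = {
    "Text": ["string"],
    "URL": ["anyURI"],
    "Boolean": ["boolean"],
    "Date": ["date"],
    "DateTime": ["dateTime"],
    "Time": ["time"],
    "Integer": ["integer"],
    "Float": ["float", "double", "decimal"],
    "Number": ["integer", "decimal", "float", "double"],
}


def accepts(expected_type, datatype_iri=None):
    """Return True if a literal should be tolerated for the expected type.

    Plain literals (no datatype) are always accepted, which is the
    current behaviour for string-valued markup. Typed literals are
    accepted only if their XSD datatype is listed for the expected type.
    """
    if datatype_iri is None:
        return True
    if not datatype_iri.startswith(XSD):
        return False
    local_name = datatype_iri[len(XSD):]
    return local_name in SCHEMA_TO_XSD.get(expected_type, [])
```

For example, accepts("Number", XSD + "decimal") holds, while accepts("Date", XSD + "integer") does not. The schema.org "expected type" stays a single name in the documentation and the XSD alternatives live only in the mapping.
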
> 
> 
> Best wishes / Mit freundlichen Grüßen
> 
> Martin Hepp
> 
> -------------------------------------------------------
> martin hepp
> e-business & web science research group
> universitaet der bundeswehr muenchen
> 
> e-mail:  martin.hepp@unibw.de
> phone:   +49-(0)89-6004-4217
> fax:     +49-(0)89-6004-4620
> www:     http://www.unibw.de/ebusiness/ (group)
>          http://www.heppnetz.de/ (personal)
> skype:   mfhepp
> twitter: mfhepp
> 
> Check out GoodRelations for E-Commerce on the Web of Linked Data!
> =================================================================
> * Project Main Page: http://purl.org/goodrelations/
> 
> 
> 
> 
> On 12 Nov 2014, at 12:25, Marc Twagirumukiza <marc.twagirumukiza@agfa.com> wrote:
> 
> > Hi Martin,
> > Thanks, I acknowledge your position on this.
> > BTW I want to apologise: I misinterpreted your previous statement when I said "making the xsd datatypes subtypes of  datatypes schema.org". I see now that you meant the inverse.
> > Thanks to Jos who pointed out this error.
> >
> > Kind Regards,
> >
> > Marc Twagirumukiza | Agfa HealthCare
> > Senior Clinical Researcher | HE/Advanced Clinical Applications Research
> > T  +32 3444 8188 | M  +32 499 713 300
> >
> > http://www.agfahealthcare.com
> > http://blog.agfahealthcare.com
> > Click on link to read important disclaimer: http://www.agfahealthcare.com/maildisclaimer
> >
> >
> >
> > From:        "martin.hepp@ebusiness-unibw.org" <martin.hepp@ebusiness-unibw.org>
> > To:        Marc Twagirumukiza/AXPZC/AGFA@AGFA
> > Cc:        sesuncedu@gmail.com, W3C Web Schemas Task Force <public-vocabs@w3.org>
> > Date:        12/11/2014 11:59
> > Subject:        Re: Proposal to extend rangeIncludes of DataTypes predicates in schema.org
> >
> >
> >
> > Dear Marc:
> >
> > On 12 Nov 2014, at 10:19, Marc Twagirumukiza <marc.twagirumukiza@agfa.com> wrote:
> >
> > > Hi Folks,
> > > This is a nice discussion and may certainly raise several other points, but let's first see if extending the rangeIncludes of some data types may be a way forward.
> > > The sole goal here is "that major search engines tolerate XSD dataType information instead of plain strings for schema.org properties in RDFa." as Martin discussed.
> > > If we can reach a consensus on this, I can submit a pull request to the Git repo. Your position?
> >
> > As I have tried to express, I am against this proposal, because:
> >
> > 1. it will not have the intended effect and
> > 2. it will cause confusion for average Web developers.
> >
> > Simply adding XSD datatypes to rangeIncludes axioms will not guarantee that the Google validator and, more importantly, the Google/Bing/Yahoo/Yandex production systems will properly process typed RDFa literals with XSD datatypes.
> >
> > For developers, changing "expected type" information from "Number" to "Number OR xsd:integer OR xsd:decimal OR xsd:float OR xsd:double" will make things worse, not better.
> >
> > Also, schema:Number as a range definition does not match a single XSD datatype - you would have to combine many.
> >
> > I agree that it would be good if Google/Bing/Yahoo/Yandex tolerated XSD-typed literals in RDFa markup, but I think it is bad to broadly encourage developers to do so. Currently, the only ones who face the problem you describe are sites that want to publish data for Google/Bing/Yahoo/Yandex AND implement parts of the W3C Semantic Web vision. That is a minority. Of the 750 k sites that Guha mentioned, most of them are just publishing for major search engines.
> >
> > Personally, I think that typing literal values at the instance level, as in RDF, is a bad idea, and more a bug than a feature. We should not propagate that bug into schema.org.
> >
> > Martin
> >
> > PS: The problem with http://schema.org/Boolean is a different one.
> >
> >
> >
> >
> > > From:        Simon Spero <sesuncedu@gmail.com>
> > > To:        martin.hepp@ebusiness-unibw.org
> > > Cc:        W3C Web Schemas Task Force <public-vocabs@w3.org>, Marc Twagirumukiza/AXPZC/AGFA@AGFA
> > > Date:        10/11/2014 17:44
> > > Subject:        Re: Proposal to extend rangeIncludes of DataTypes predicates in schema.org
> > >
> > >
> > >
> > > BLUF: Booleans are hard. Let's go shopping. [1]
> > > On Nov 10, 2014 5:14 AM, "martin.hepp@ebusiness-unibw.org" <martin.hepp@ebusiness-unibw.org> wrote:
> > >
> > > > The only use-case where I see a need for datatype information attached to literal values is when the vocabulary allows multiple datatypes that the client could not distinguish automatically (e.g. think of a property that allows an xsd:string and xsd:boolean - then "True" may be a string or a boolean value). But that is a rare exception.
> > >
> > > I am pretty sure that there is a property where this almost happens, though the values are "Yes" or "No", not schema:True or schema:False.
> > >
> > > And after cheating, I see it's legacy for schema:acceptsReservations!
> > >
> > > schema:Boolean does not seem to be a true datatype in the way that Text is; it has two instances, schema:True and schema:False. These values are described on the schema.org/Boolean page as being "more specific types", but I am pretty sure they are instances (and clicking on the links goes to pages that render as instances).
> > >
> > > The place where things might get confused is with Text and URL, since these are both literal valued and URL is sub-datatyped from Text. This can occur on e.g. applicationCategory.
> > >
> > > I believe that in this case a URL will not be recognized as a URL unless it is explicitly typed (unless microdata magic applies).
> > >
> > > I expect that the documentation for Boolean could stand to be rewritten SMTP style - write the spec to match the implementation. That would give a lexical space different from xsd:boolean, and would handle the URI forms as constant strings.
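
A rough sketch of what such an implementation-matching lexical space could look like, treating the URI forms as constant strings. The accepted forms below are an assumption, since the thread does not say which spellings real consumers accept; for contrast, the lexical space of xsd:boolean is "true", "false", "1" and "0".

```python
# Assumed lexical space for schema:Boolean; illustrative only.
_TRUE_FORMS = {"http://schema.org/True", "True", "true"}
_FALSE_FORMS = {"http://schema.org/False", "False", "false"}

# Lexical space of xsd:boolean, shown for comparison.
XSD_BOOLEAN_FORMS = {"true", "false", "1", "0"}


def parse_schema_boolean(text):
    """Map a lexical form to a Python bool, or raise ValueError.

    The URI forms are compared as constant strings; no dereferencing
    or IRI normalization is attempted.
    """
    value = text.strip()
    if value in _TRUE_FORMS:
        return True
    if value in _FALSE_FORMS:
        return False
    raise ValueError(f"not a recognized schema:Boolean form: {text!r}")
```

Under these assumptions "http://schema.org/True" parses but is not a valid xsd:boolean, while "1" is a valid xsd:boolean that this parser rejects, so the two lexical spaces overlap without either containing the other.
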
> > >
> > > BTW, requiresSubscription has a Boolean range, but does not appear on the Boolean page.
> > >
> > > Also, Boolean is not a schema:Enumeration, despite having an enumerated set of values.
> > >
> > > Simon
> > >
> > > [1]  http://youtu.be/DzTWF1jVwH4
> > >
> > >
> >
> >
> >
> 

Received on Wednesday, 12 November 2014 15:35:50 UTC