Re: Proposal to extend rangeIncludes of DataTypes predicates in schema.org

[I have suggested going the other way, and having the current RDFa master
be generated from some more expressive format. Some restricted form of
Manchester OWL Syntax or Cyc KE, maybe?]

1. OWL 2 provides for data type unions, so a data range can consist of
multiple xsd primitive types.

OWL 2 adds two numeric types: owl:real and owl:rational. owl:real contains
owl:rational, which in turn contains xsd:decimal and the various integer
types.

xsd:double and xsd:float are not reals, but are a separate part of the OWL
2 datatype map.

The schema:Float subtype would seem to cause problems if it is translated
as xsd:float - however it is so underspecified (Full specification:
"floating number") that it could just as well be xsd:decimal.

The latter seems more plausible, but in any event, the value spaces can be
covered with:

DatatypeDefinition(schema:Number
      DataUnionOf (owl:real xsd:float xsd:double))
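
As a rough illustration of what that union covers, here is a minimal Python
sketch. The PARENT table and the function name are my own assumptions for
illustration; it only encodes the containments mentioned above (plus
xsd:integer inside xsd:decimal) and is not a full OWL 2 datatype map.

```python
# Hypothetical containment table: child datatype -> immediately enclosing one.
# Only the containments discussed in this message are listed.
PARENT = {
    "owl:rational": "owl:real",
    "xsd:decimal": "owl:rational",
    "xsd:integer": "xsd:decimal",
}

# Members of the DataUnionOf in the DatatypeDefinition for schema:Number.
NUMBER_UNION = {"owl:real", "xsd:float", "xsd:double"}


def covered_by_number(datatype):
    """Return True if the datatype is a union member or contained in one."""
    current = datatype
    while current is not None:
        if current in NUMBER_UNION:
            return True
        # Walk up the containment chain; None ends the loop.
        current = PARENT.get(current)
    return False
```

With this table, xsd:integer is covered through owl:real, while xsd:float and
xsd:double are covered only because they are listed in the union directly.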

2. The kluge comes out to play when a schema property has a range that is
the union of a datatype and an object type. The ingesting pre-processor
needs to box the literals, and the box type probably needs a HasKey axiom
or a stronger canonicalization.

Retrieval pre/post processors need to unbox. This can make searches more
expensive (depending on how data is stored and how effective caching is).
Equality is cheap, but inequalities may not be.
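
A minimal Python sketch of the box/unbox idea, under my own assumptions: the
BoxedLiteral class, the registry, and the decimal-only canonicalization are
all hypothetical, and the (datatype, canonical form) pair merely stands in
for what a HasKey axiom would express.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class BoxedLiteral:
    """Hypothetical box individual; (datatype, canonical) acts as its key."""
    datatype: str
    canonical: str


# Registry so that literals with the same key share one box.
_boxes = {}


def canonicalize(lexical, datatype):
    # Only xsd:decimal is canonicalized in this sketch; others pass through.
    if datatype == "xsd:decimal":
        return format(Decimal(lexical).normalize(), "f")
    return lexical


def box(lexical, datatype):
    """Ingest side: wrap a literal, reusing the box for an equal key."""
    key = (datatype, canonicalize(lexical, datatype))
    return _boxes.setdefault(key, BoxedLiteral(*key))


def unbox(boxed):
    """Retrieval side: recover a comparable value from the box."""
    if boxed.datatype == "xsd:decimal":
        return Decimal(boxed.canonical)
    return boxed.canonical
```

Here equality is an identity check on the shared box, whereas an inequality
such as unbox(a) < unbox(b) has to unbox both sides first, which is where the
extra cost would show up.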

An alternative is for a preprocessor to create separate ersatz data
properties; this gets ugly if you want to have cardinality constraints, but
may improve some searches.
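
A sketch of that alternative in Python, again under assumptions of my own:
the four-element tuple layout, the "_data" suffix, and both function names
are hypothetical and only serve to show the shape of the rewrite.

```python
def split_mixed_property(triples, prop, suffix="_data"):
    """Route literal values of `prop` to a separate ersatz data property.

    Each triple is (subject, property, value, is_literal).
    """
    out = []
    for subject, p, value, is_literal in triples:
        if p == prop and is_literal:
            # Literal values move to the ersatz data property.
            p = prop + suffix
        out.append((subject, p, value, is_literal))
    return out


def value_count(triples, subject, prop, suffix="_data"):
    """Count values of `prop` for a subject across both properties."""
    names = {prop, prop + suffix}
    return sum(1 for s, p, _, _ in triples if s == subject and p in names)
```

The second function hints at why cardinality gets ugly: a constraint on the
original property now has to be checked against the sum over two properties.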

Simon
On Nov 12, 2014 6:37 AM, "martin.hepp@ebusiness-unibw.org" <
martin.hepp@ebusiness-unibw.org> wrote:

> Hi Marc:
>
> Thanks. Note that the "subtyping" approach is not without caveats either,
> since schema:Number is a supertype of multiple numerical XSD datatypes.
> Also, we should carefully check the implications for clients consuming
> schema.org instance data in OWL. I have not yet checked whether there
> is a clever workaround for that in OWL 2, but we may end up in OWL Full.
>
> I think the better approach for tackling this problem is the following:
>
> - A short statement in the schema.org documentation at
> http://schema.org/docs/documents.html, e.g. a "datatypes.html", which
> explains the basic mapping to XSD and indicates that XSD datatypes are fine
> to use in RDF-based syntaxes.
>
> - Implementing this in the Google Structured Data Testing Tool and the
> Google/Bing/Yahoo/Yandex production systems.
>
> This would allow anybody to publish the respective RDF data while still
> serving major search engines, without spoiling the documentation or model.
>
> Instead of a "datatypes.html" document, one could even go a bit further
> and add a "schema.org in RDF syntaxes" document that explains the use of
> schema.org in RDF-based syntaxes, including the use of typed literals.
>
> So a pull request would ideally be such an HTML file, not the
> rangeIncludes axioms.
>
>
> Best wishes / Mit freundlichen Grüßen
>
> Martin Hepp
>
> -------------------------------------------------------
> martin hepp
> e-business & web science research group
> universitaet der bundeswehr muenchen
>
> e-mail:  martin.hepp@unibw.de
> phone:   +49-(0)89-6004-4217
> fax:     +49-(0)89-6004-4620
> www:     http://www.unibw.de/ebusiness/ (group)
>          http://www.heppnetz.de/ (personal)
> skype:   mfhepp
> twitter: mfhepp
>
>
>
>
>
> On 12 Nov 2014, at 12:25, Marc Twagirumukiza <marc.twagirumukiza@agfa.com>
> wrote:
>
> > Hi Martin,
> > Thanks, I acknowledge your position on this.
> > BTW I want to apologise: I misinterpreted your previous statement when I
> > said "making the xsd datatypes subtypes of datatypes schema.org". I see
> > now that you meant the inverse.
> > Thanks to Jos who pointed out this error.
> >
> > Kind Regards,
> >
> > Marc Twagirumukiza | Agfa HealthCare
> > Senior Clinical Researcher | HE/Advanced Clinical Applications Research
> > T  +32 3444 8188 | M  +32 499 713 300
> >
> > http://www.agfahealthcare.com
> > http://blog.agfahealthcare.com
> >
> >
> >
> > From:        "martin.hepp@ebusiness-unibw.org" <
> martin.hepp@ebusiness-unibw.org>
> > To:        Marc Twagirumukiza/AXPZC/AGFA@AGFA
> > Cc:        sesuncedu@gmail.com, W3C Web Schemas Task Force <
> public-vocabs@w3.org>
> > Date:        12/11/2014 11:59
> > Subject:        Re: Proposal to extend rangeIncludes of DataTypes
> predicates in schema.org
> >
> >
> >
> > Dear Marc:
> >
> > On 12 Nov 2014, at 10:19, Marc Twagirumukiza <
> marc.twagirumukiza@agfa.com> wrote:
> >
> > > Hi Folks,
> > > This is a nice discussion and may certainly raise several other points,
> > > but let's first see if extending the rangeIncludes of some data types
> > > may be a way forward.
> > > The sole goal here is "that major search engines tolerate XSD dataType
> > > information instead of plain strings for schema.org properties in
> > > RDFa." as Martin discussed.
> > > If we can reach consensus on this, I can submit a pull request to the
> > > Git repo. What is your position?
> >
> > As I have tried to express, I am against this proposal, because:
> >
> > 1. it will not have the intended effect and
> > 2. it will cause confusion for average Web developers.
> >
> > Simply adding XSD datatypes to rangeIncludes axioms will not guarantee
> that the Google validator, and more so, the Google/Bing/Yahoo/Yandex
> production systems will properly process typed RDFa literals with XSD
> datatypes.
> >
> > For developers, changing "expected type" information from "Number" to
> "Number OR xsd:integer OR xsd:decimal OR xsd:float OR xsd:double" will make
> things worse, not better.
> >
> > Also, schema:Number as a range definition does not match a single XSD
> > datatype - you would have to combine many.
> >
> > I agree that it would be good if Google/Bing/Yahoo/Yandex tolerated
> XSD-typed literals in RDFa markup, but I think it is bad to broadly
> encourage developers to do so. Currently, the only ones who face the
> problem you describe are sites that want to publish data for
> Google/Bing/Yahoo/Yandex AND implement parts of the W3C Semantic Web
> vision. That is a minority. Of the 750 k sites that Guha mentioned, most of
> them are just publishing for major search engines.
> >
> > Personally, I think that typing literal values at the instance level, as
> in RDF, is a bad idea, and rather a bug than a feature. We should not
> propagate that bug into schema.org.
> >
> > Martin
> >
> > PS: The problem with http://schema.org/Boolean is a different one.
> >
> >
> >
> >
> > > From:        Simon Spero <sesuncedu@gmail.com>
> > > To:        martin.hepp@ebusiness-unibw.org
> > > Cc:        W3C Web Schemas Task Force <public-vocabs@w3.org>, Marc
> Twagirumukiza/AXPZC/AGFA@AGFA
> > > Date:        10/11/2014 17:44
> > > Subject:        Re: Proposal to extend rangeIncludes of DataTypes
> predicates in schema.org
> > >
> > >
> > >
> > > BLUF: Booleans are hard. Let's go shopping. [1]
> > > On Nov 10, 2014 5:14 AM, "martin.hepp@ebusiness-unibw.org" <
> martin.hepp@ebusiness-unibw.org> wrote:
> > >
> > > > The only use-case where I see a need for datatype information attached
> > > > to literal values is when the vocabulary allows multiple datatypes that
> > > > the client could not distinguish automatically (e.g. think of a property
> > > > that allows an xsd:string and an xsd:boolean - then "True" may be a
> > > > string or a boolean value). But that is a rare exception.
> > >
> > > I am pretty sure that there is a property where this almost happens,
> > > though the values are "Yes" or "No", not schema:True or schema:False.
> > >
> > > And after cheating, I see it's legacy for  schema:acceptsReservations!
> > >
> > > schema:Boolean does not seem to be a true datatype in the way that
> Text is ; it has two instances, schema:True and schema:False. These values
> are described on the schema.org/Boolean page as being "more specific
> types" , but I am pretty sure they are instances (and clicking on the links
> goes to pages that render as instances).
> > >
> > > The place where things might get confused is with Text and URL, since
> > > these are both literal valued and URL is sub-datatyped from Text. This can
> > > occur on e.g. applicationCategory.
> > >
> > > I believe that in this case a URL will not be recognized as a URL
> unless it is explicitly typed (unless microdata magic applies).
> > >
> > > I expect that the documentation for Boolean could stand to be
> rewritten SMTP style - write the spec to match the implementation. That
> would give a lexical space different from xsd:boolean, and would handle the
> URI forms as constant strings.
> > >
> > > BTW, requiresSubscription has a Boolean range, but does not appear on
> the Boolean page.
> > >
> > > Also, Boolean is not a schema:Enumeration, despite having an
> enumerated set of values.
> > >
> > > Simon
> > >
> > > [1]  http://youtu.be/DzTWF1jVwH4
> > >
> > >
> >
> >
> >
>
>

Received on Wednesday, 12 November 2014 15:21:52 UTC